Statistics Notes

Page 1: Statistics Notes

Chapter 1 INTRODUCTION TO STATISTICS

WHAT IS STATISTICS?

Introduction
The word “statistics” appears to have been derived from the Latin word “status”. Originally, statistics was simply the collection of numerical data by kings on different aspects useful to the state. Today statistics is the scientific study of handling quantitative information. It embodies a methodology for the collection, classification, description and interpretation of data obtained through surveys and experiments.

· Population
The total group under discussion, or the group to which results will be generalized, is called the population. For example, the collection of height measurements of all college students is a population.

· Sample
A part of the population selected in the belief that it will represent all the characteristics of the population is called a sample. For example, a sample of 10 students is selected from a population of 100 students in order to analyse the average height of the students.

· Meaning of Statistics
Nowadays the word “statistics” is used in two senses.

Singular Sense
In its singular sense, the word statistics means the science of statistics, which deals with statistical methods.

Plural Sense
In its plural sense, the word statistics means numerical facts collected in any field of study by using a statistical method.

· Definition of Statistics
Statistics are numerical statements of facts capable of analysis and interpretation, and the science of statistics is the study of the principles and methods applied in collecting, presenting, analysing and interpreting numerical data in any field of inquiry.
OR
The science of facts and figures is called statistics.
OR (Croxton and Cowden)
“Statistics is the collection, presentation, analysis and interpretation of numerical data.”
OR (Connor)
“Statistics are measurements, enumerations or estimations of social or natural phenomena, systematically arranged so as to exhibit their interrelationships.”
OR (Boddington)
“Statistics is the science of estimates and probabilities.”
OR (Achenwall)
“Statistics are a collection of noteworthy facts concerning the state, both historical and descriptive.”
OR
“Statistics is defined as the science of collecting, organizing, presenting, analysing and interpreting numerical data for making better decisions.”

· Scope of Statistics
Statistics is the branch of mathematics that deals with data. It uses data collected through systematic methods of data collection, and statistical theory is employed to arrive at conclusions.

· Main Branches of Statistics / Division of Statistics
The science of statistics may be classified into the following two main branches.

1. Statistical Methods
In a statistical inquiry the first step is the collection of data. Most raw data are complex and confusing, so to get a clear picture we reduce this complexity by means of statistical methods. Statistical methods include all the rules of procedure and techniques used in the collection, classification, tabulation, comparison and interpretation of data. Simply put, statistical methods simplify complex numerical data.

2. Applied Statistics
It deals with the application of statistical methods to specific problems. Applied statistics is of two types.

a. Descriptive Applied Statistics
Descriptive applied statistics is applied to data relating to present or past information, e.g. the census in Pakistan, in order to reach certain conclusions.

b. Scientific Applied Statistics
Scientific applied statistics applies general rules to quantitative data for the purpose of forecasting.

· Limitations of Statistics
1. Statistics has a handicap in dealing with qualitative observations or values.

2. Statistical results are true only on the average.
3. Statistics does not study qualitative phenomena.
4. Statistics deals only with facts which can be numerically expressed; e.g. love, hate, beauty, poverty and health cannot be measured.
5. Sufficient care needs to be exercised in the collection, analysis and interpretation of data, otherwise statistical results may be false.

· Uses or Functions of Statistics
1. Statistics simplifies complicated data.
2. Statistics tests the laws of other sciences.
3. Statistics helps a lot in policy making.
4. Statistics helps in forecasting.
5. Statistics helps in administration.
6. Statistics helps in the proper and efficient planning of a statistical inquiry in any field of study.

Relationship of Statistics with Other Sciences
Nowadays statistics, statistical data and statistical methods are increasingly being applied in Agriculture, Economics, Biology, Business, Physics, Chemistry, Astronomy, Medicine, Administration, Education, Mathematics, Meteorology and the physical sciences.

1. Statistics and Administration
Statistics plays an important role in administration and management by providing measures of the performance of employees. Statistical data are widely used in taking administrative decisions. For example, if the authorities want to raise the pay scales of employees in view of an increase in the cost of living, statistical methods will be used to calculate the rise in the cost of living.

2. Statistics and Agriculture
Agricultural statistics cover a wide field. They include statistics of land utilization, production of crops, prices and wages in agriculture, etc. Agriculture benefits greatly from statistical methods.

3. Statistics and Medicine
Statistics plays an important role in medicine, for example in testing the effectiveness of different types of medicines. Vital statistics may be defined as the science which deals with the application of numerical methods to vital facts. Demography is the statistical study of all phases of human life relating to vital facts such as births, deaths, age, marriages, religions, social affairs, education and sanitation. Vital statistics is a part of this broader field of demography and comprises vital data.

4. Statistics and Mathematics
All statistical methods have their foundations in mathematics, and no calculation can be done without the help of mathematics. Mathematics is therefore applied widely in statistics, and this branch of statistics is called mathematical statistics. The two subjects are closely interrelated.

5. Statistics and Physical Sciences
The physical sciences depend greatly upon statistics for analysing observations and testing their significance before drawing results. Statistical methods are used in physical sciences such as Physics, Chemistry and Geology.

6. Statistics and Economics
Important phenomena in all branches of economics can be described and compared with the help of statistics. Statistics of production describe the wealth of a nation and compare it year after year, thereby showing the effect of changing economic policies and other factors on the level of production.

7. Statistics Helps in Forecasting
By estimating the variables as they existed in the past, forecasts about times to come can be made. Statistical techniques such as extrapolation and time series analysis help in saying something about the future course of events. Statistics also plays an important role in the fields of astronomy, transportation, communication, public health, teaching methods, engineering, psychology, meteorology and weather forecasting.

8. Statistics and Business
Statistics plays an important role in business. It helps businessmen to plan production according to the tastes of customers, and the quality of products can also be checked by using statistical methods.

· Characteristics of Statistics
Statistics have the following characteristics.

1. Statistics are aggregates of facts
Statistics are a number of facts. A single fact, even if numerically expressed, cannot be called statistics. A single death or a single accident does not constitute statistics, but a number of deaths or accidents are statistics.

2. Statistics are affected by many causes
Statistics are aggregates of such facts only as grow out of a variety of circumstances. Their size and shape at any particular moment is the result of the action and interaction of many forces.

3. Statistics are numerically expressed
In statistics we study quantitative expressions and not qualitative ones like old, young, good, bad, etc.

4. Statistics are estimated to a reasonable standard of accuracy
What standard of accuracy is to be regarded as reasonable depends upon the aims and objects of the inquiry. Whatever standard of accuracy is adopted, it must be maintained uniformly throughout the inquiry.

5. Statistics are collected in a systematic manner
Statistics collected in a haphazard manner cannot be accurate.

· Statistical Inquiry
An inquiry into any problem which is carried out with the help of statistical principles and methods is called a statistical inquiry.

Steps in Statistical Inquiry
The following steps are involved in a statistical inquiry requiring the collection of data.
1. Planning the inquiry.
2. Collection of data.
3. Editing the collected data.
4. Tabulating the data.
5. Analysing the data by calculating statistical measures.

· Planning of Statistical Inquiry
The following are the factors in planning a statistical inquiry.

1. Object and Scope of Inquiry

2. Nature and Type of Inquiry
i. Primary or Secondary
ii. Census or Sample
iii. Open or Secret
iv. Direct or Indirect
v. Regular or Ad hoc
vi. Initial or Repetitive
vii. Official, Semi-Official or Non-Official

3. Statistical Unit
The unit of measurement applied in the collection of data is called the statistical unit. For example, if we record the rice crop as yield per acre, then yield per acre is the statistical unit. There are two types of statistical unit.
i. Physical Units
ii. Arbitrary Units

Advantages of Statistical Units
A good statistical unit:
i. Fulfils the object of the inquiry.
ii. Is stable.
iii. Is homogeneous.
iv. Is defined in clear words.

4. Degree of Accuracy
The level of accuracy required, decided according to the nature of the inquiry and the purpose of the investigation, is called the degree of accuracy.

· Variable
A measurable quantity which can vary (differ) from one individual or object to another is called a variable, e.g. the height of students or the weight of children. It is denoted by letters of the alphabet, e.g. x, y, z.

Types of Variable
There are many types of variable.

1. Continuous Variable
A variable which can take any value, including fractional values, between two limits is called a continuous variable.
Or
A variable which can assume any value within a given range is called a continuous variable, e.g. the age of a person, the speed of a car, the temperature at a place, the income of a person, the height of a plant, the lifetime of a T.V. tube, etc.

2. Discrete Variable
A variable which can assume only some specific values within a given range is called a discrete variable, e.g. the number of students in a class, the number of houses in a street, the number of children in a family, etc. It cannot take fractional (decimal) values.

3. Quantitative Variable
A characteristic which varies only in magnitude from one individual to another is called a quantitative variable. It can be measured.
Or
A characteristic expressed by means of quantitative terms is known as a quantitative variable, e.g. the number of deaths in a country per year, prices, temperature readings, heights, weights, etc.

4. Qualitative Variable
A characteristic expressed by means of qualitative terms is known as a qualitative variable or an attribute, e.g. smoking, beauty, educational status, colours such as green and blue, etc. Note that these characteristics cannot be measured numerically.

· Domain
The set of values from which a variable takes its values is called the domain.

· Constant
A characteristic is called a constant if it assumes a fixed value, e.g. π is a constant with a numerical value of approximately 3.14159 (often approximated by 22/7 = 3.14286), and e is also a constant with a numerical value of 2.71828.

· Errors
The difference between the actual value and the expected value is called an error. There are two types of errors.
1. Compensating errors
2. Biased errors

· Data
A set of values or observations is called data.

· Quantitative Data
Data described by a quantitative variable, such as the number of deaths in a country per year, prices, temperature readings, heights, weights, wheat production from different acres, the number of persons living in different houses, etc., are called quantitative data.

· Qualitative Data
Data described by a qualitative variable, e.g. smoking, beauty, educational status, colours such as green and blue, the marital status of persons (single, married, divorced, widowed, separated), the sex of persons (male and female), etc., are called qualitative data.

· Discrete Data
Data which can be described by a discrete variable are called discrete data, e.g. the number of students in a class, the number of houses in a street, the number of children in a family, etc.

· Continuous Data
Data which can be described by a continuous variable are called continuous data, e.g. the age of a person, the speed of a car, the temperature at a place, the income of a person, the height of a plant, the lifetime of a T.V. tube, etc.

· Chronological Data
A sequence of observations made on the same phenomenon and recorded in relation to their time of occurrence is called chronological data. Chronological data are also called a time series.

· Geographical Data
A sequence of observations made on the same phenomenon and recorded in relation to their geographical region is called geographical data.

· Statistical Data
When data are classified on the basis of a numerical characteristic, e.g. according to class intervals, they are known as statistical data. Statistical data may be classified into two types.

1. Primary Data
Primary data are the most original data: first-hand data which have not been compiled by anyone else and have not undergone any sort of statistical treatment.

2. Secondary Data
Secondary data are data which have already been compiled and analysed by someone; they may have been sorted and tabulated and have undergone statistical treatment.

· Collection of Data
The following methods are used for the collection of data.

1. Methods for Collection of Primary Data
The following are the main methods by which primary data are obtained.
i. Direct Personal Investigation
ii. Indirect Investigation
iii. Local Source
iv. Questionnaire Method
v. Registration
vi. Questionnaire by Post
vii. By Enumerators
viii. By Telephone
ix. Through Internet

2. Methods for Collection of Secondary Data
Secondary data may be obtained from the following sources.

i. Official Source
E.g. publications of the Statistical Division, Ministries of Food, Agriculture and Railways, Bureaus of Education and Finance, Provincial Bureaus of Statistics, etc.

ii. Semi-Official Source
E.g. State Bank of Pakistan, National Bank of Pakistan, WAPDA, District Councils, Economics Research Institute, P.I.D.C., Central Cotton Committee, etc.

iii. Private Source
E.g. publications of Trade Associations, Chambers of Commerce, Market Committees and industry.

iv. Research Organization
E.g. universities, other institutes of education and research, Irrigation Research Institute, etc.

v. Technical and Trade Journals and Newspapers

Chapter 3 MEASURES OF DISPERSION, MOMENTS AND SKEWNESS

A quantity that measures how the data are dispersed about the average is called a measure of dispersion.

Range (R)
The range is the simplest measure of dispersion. “It is defined as the difference between the largest and the smallest observation in a set of data.” It is denoted by “R”. This is an absolute measure of dispersion.

For Ungrouped Data

Range = R = $X_m - X_0$

Where $X_m$ = the largest value.

$X_0$ = the smallest value.

For Grouped Data

Range = R = Upper class boundary of the highest class – Lower class boundary of the lowest class

Or

Range = R = Class mark (X) of the highest class – Class mark of the lowest class

Semi Inter Quartile Range or Quartile Deviation
The semi inter-quartile range or quartile deviation is defined as half of the difference between the third and the first quartiles. Symbolically it is given by

S.I.Q.R = Q.D = $\dfrac{Q_3 - Q_1}{2}$

Where $Q_1$ = First (lower) quartile

$Q_3$ = Third (upper) quartile

This is an absolute measure of dispersion.
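As a small illustration of the two absolute measures above, the following Python sketch computes the range (largest minus smallest value) and the quartile deviation (half of $Q_3 - Q_1$) for ungrouped data. The data values and function names are made up for this example. Quartile conventions differ between textbooks and software; here Python's default "exclusive" method is assumed, which places $Q_1$ at the $(n+1)/4$-th ordered item.

```python
import statistics

def data_range(values):
    """Range R = largest value - smallest value."""
    return max(values) - min(values)

def quartile_deviation(values):
    """Q.D = (Q3 - Q1) / 2, using statistics.quantiles with its default
    'exclusive' method (an assumption; other quartile rules give other values)."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    return (q3 - q1) / 2

observations = [2, 4, 6, 8, 10, 12, 14, 16, 18]  # illustrative data only

print(data_range(observations))          # 16
print(quartile_deviation(observations))  # 5.0  (Q1 = 5, Q3 = 15)
```

The range uses only the two extreme values, while the quartile deviation ignores the lowest and highest quarters of the data, which is why it is less affected by extreme observations.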

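Mean Deviation or Average Deviation
The mean deviation is defined as the average of the deviations of the values from an average (mean or median), the deviations being taken without regard to algebraic sign.

1. Mean Deviation From Mean

For Ungrouped Data

M.D = $\dfrac{\sum |X - \bar{X}|}{n}$

Or

M.D = $\dfrac{\sum |D|}{n}$, where $D = X - \bar{X}$

For Grouped Data

M.D = $\dfrac{\sum f\,|X - \bar{X}|}{\sum f}$

Or

M.D = $\dfrac{\sum f\,|D|}{\sum f}$, where $D = X - \bar{X}$

2. Mean Deviation From Median

For Ungrouped Data

M.D = $\dfrac{\sum |X - \text{Median}|}{n}$

Or

M.D = $\dfrac{\sum |D|}{n}$, where $D = X - \text{Median}$

For Grouped Data

M.D = $\dfrac{\sum f\,|X - \text{Median}|}{\sum f}$

Or

M.D = $\dfrac{\sum f\,|D|}{\sum f}$, where $D = X - \text{Median}$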
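The mean deviation (the average absolute deviation from the mean or the median) can be sketched in Python as follows. The function names and data are illustrative assumptions; for grouped data the class marks stand in for the individual values, weighted by the class frequencies.

```python
from statistics import mean, median

def mean_deviation(values, about="mean"):
    """Average of absolute deviations from the mean (default) or the median."""
    centre = mean(values) if about == "mean" else median(values)
    return sum(abs(x - centre) for x in values) / len(values)

def grouped_mean_deviation(class_marks, freqs):
    """Mean deviation from the mean for grouped data: sum(f*|X - mean|) / sum(f)."""
    total = sum(freqs)
    centre = sum(f * x for x, f in zip(class_marks, freqs)) / total
    return sum(f * abs(x - centre) for x, f in zip(class_marks, freqs)) / total

values = [3, 6, 7, 10, 14]                 # illustrative data only
print(mean_deviation(values))              # 3.2  (mean = 8)
print(mean_deviation(values, "median"))    # 3.0  (median = 7)
print(grouped_mean_deviation([5, 15, 25], [2, 5, 3]))
```

In this example the mean deviation from the median is smaller than that from the mean, which illustrates the general property that the sum of absolute deviations is least when taken about the median.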

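Standard Deviation (S)
The standard deviation is defined as the positive square root of the mean of the squared deviations of the values from their mean. The standard deviation of a set of n values $X_1, X_2, \dots, X_n$ is denoted by ‘S’. This is an absolute measure of dispersion.

Methods of Computing Standard Deviation
I. Direct Method
II. Short Cut Method
III. Coding Method or Step-Deviation Method

1. Direct Method

For Ungrouped Data

S.D = S = $\sqrt{\dfrac{\sum (X - \bar{X})^2}{n}}$

S.D = S = $\sqrt{\dfrac{\sum X^2}{n} - \left(\dfrac{\sum X}{n}\right)^2}$

For Grouped Data

S.D = S = $\sqrt{\dfrac{\sum f (X - \bar{X})^2}{\sum f}}$

S.D = S = $\sqrt{\dfrac{\sum f X^2}{\sum f} - \left(\dfrac{\sum f X}{\sum f}\right)^2}$

2. Short Cut Method

For Ungrouped Data

S.D = S = $\sqrt{\dfrac{\sum D^2}{n} - \left(\dfrac{\sum D}{n}\right)^2}$, Where $D = X - A$

For Grouped Data

S.D = S = $\sqrt{\dfrac{\sum f D^2}{\sum f} - \left(\dfrac{\sum f D}{\sum f}\right)^2}$

3. Coding Method or Step-Deviation Method

For Ungrouped Data

S.D = S = $h \times \sqrt{\dfrac{\sum U^2}{n} - \left(\dfrac{\sum U}{n}\right)^2}$, Where $U = \dfrac{X - A}{h}$

For Grouped Data

S.D = S = $h \times \sqrt{\dfrac{\sum f U^2}{\sum f} - \left(\dfrac{\sum f U}{\sum f}\right)^2}$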
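The three methods (direct, short cut with $D = X - A$, and coding with $U = (X - A)/h$) are different routes to the same number. The following Python sketch, with made-up data and an arbitrarily chosen origin A and step h, checks that they agree. It assumes the divisor n used in these notes (the population form), which corresponds to `statistics.pstdev` rather than `statistics.stdev`.

```python
from math import sqrt, isclose
from statistics import pstdev

def sd_direct(xs):
    """S = sqrt( sum((X - mean)^2) / n )."""
    n = len(xs)
    m = sum(xs) / n
    return sqrt(sum((x - m) ** 2 for x in xs) / n)

def sd_shortcut(xs, a):
    """S = sqrt( sum(D^2)/n - (sum(D)/n)^2 ), with D = X - A."""
    n = len(xs)
    d = [x - a for x in xs]
    return sqrt(sum(v * v for v in d) / n - (sum(d) / n) ** 2)

def sd_coding(xs, a, h):
    """S = h * sqrt( sum(U^2)/n - (sum(U)/n)^2 ), with U = (X - A)/h."""
    n = len(xs)
    u = [(x - a) / h for x in xs]
    return h * sqrt(sum(v * v for v in u) / n - (sum(u) / n) ** 2)

xs = [10, 20, 30, 40, 50]   # illustrative data only
A, H = 20, 10               # arbitrary origin and common step (assumed values)

print(sd_direct(xs), sd_shortcut(xs, A), sd_coding(xs, A, H))
print(isclose(sd_direct(xs), pstdev(xs)))  # True
```

The choice of A and h does not change the result; they only make the hand arithmetic smaller.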

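Combined Standard Deviation ($S_c$)

For two sets of values

$S_c = \sqrt{\dfrac{n_1\left[S_1^2 + (\bar{X}_1 - \bar{X}_c)^2\right] + n_2\left[S_2^2 + (\bar{X}_2 - \bar{X}_c)^2\right]}{n_1 + n_2}}$

For three or more sets of data

$S_c = \sqrt{\dfrac{\sum n_i\left[S_i^2 + (\bar{X}_i - \bar{X}_c)^2\right]}{\sum n_i}}$

Where $\bar{X}_c = \dfrac{\sum n_i \bar{X}_i}{\sum n_i}$ is the combined mean.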
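The combined standard deviation pools each group's variance together with the squared distance of the group mean from the combined mean, weighted by group size. The Python sketch below (illustrative data, hypothetical function name, divisor n throughout) checks that this gives the same answer as computing the standard deviation of all the values merged into one set.

```python
from math import sqrt, isclose
from statistics import fmean, pstdev

def combined_sd(groups):
    """Combined S.D of several groups from their sizes, means and variances."""
    sizes = [len(g) for g in groups]
    means = [fmean(g) for g in groups]
    variances = [pstdev(g) ** 2 for g in groups]
    n_total = sum(sizes)
    combined_mean = sum(n * m for n, m in zip(sizes, means)) / n_total
    pooled = sum(
        n * (v + (m - combined_mean) ** 2)
        for n, m, v in zip(sizes, means, variances)
    ) / n_total
    return sqrt(pooled)

group_1 = [2, 4, 6]                # illustrative data only
group_2 = [10, 12, 14, 16, 18]

print(combined_sd([group_1, group_2]))
print(isclose(combined_sd([group_1, group_2]), pstdev(group_1 + group_2)))  # True
```

In practice the formula is useful when only the summary figures (n, mean and S.D) of each group are available and the raw values are not.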

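Variance ($S^2$)
The variance is defined as the mean of the squared deviations from the mean. It is denoted by ‘$S^2$’.

Or
The square of the standard deviation is called the variance. It is denoted by ‘$S^2$’.

Methods of Computing Variance
1. Direct Method
2. Short Cut Method
3. Coding Method or Step-Deviation Method

1. Direct Method

For Ungrouped Data

Var(X) = $S^2$ = $\dfrac{\sum (X - \bar{X})^2}{n}$

Var(X) = $S^2$ = $\dfrac{\sum X^2}{n} - \left(\dfrac{\sum X}{n}\right)^2$

For Grouped Data

Var(X) = $S^2$ = $\dfrac{\sum f (X - \bar{X})^2}{\sum f}$

Var(X) = $S^2$ = $\dfrac{\sum f X^2}{\sum f} - \left(\dfrac{\sum f X}{\sum f}\right)^2$

2. Short Cut Method

For Ungrouped Data

Var(X) = $S^2$ = $\dfrac{\sum D^2}{n} - \left(\dfrac{\sum D}{n}\right)^2$, Where $D = X - A$

For Grouped Data

Var(X) = $S^2$ = $\dfrac{\sum f D^2}{\sum f} - \left(\dfrac{\sum f D}{\sum f}\right)^2$

3. Coding Method or Step-Deviation Method

For Ungrouped Data

Var(X) = $S^2$ = $h^2 \left[\dfrac{\sum U^2}{n} - \left(\dfrac{\sum U}{n}\right)^2\right]$, Where $U = \dfrac{X - A}{h}$

For Grouped Data

Var(X) = $S^2$ = $h^2 \left[\dfrac{\sum f U^2}{\sum f} - \left(\dfrac{\sum f U}{\sum f}\right)^2\right]$

Combined Variance ($S_c^2$)

For two sets of values

$S_c^2 = \dfrac{n_1\left[S_1^2 + (\bar{X}_1 - \bar{X}_c)^2\right] + n_2\left[S_2^2 + (\bar{X}_2 - \bar{X}_c)^2\right]}{n_1 + n_2}$

For three or more sets of data

$S_c^2 = \dfrac{\sum n_i\left[S_i^2 + (\bar{X}_i - \bar{X}_c)^2\right]}{\sum n_i}$

Where $\bar{X}_c$ is the combined mean of all the sets.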

Relative Measures of Dispersion

1. Coefficient of Range

Coefficient of Range = $\dfrac{X_m - X_0}{X_m + X_0}$

2. Coefficient of Quartile Deviation

Coefficient of Q.D = $\dfrac{Q_3 - Q_1}{Q_3 + Q_1}$

Where $Q_1$ = First (lower) quartile

$Q_3$ = Third (upper) quartile

3. Coefficient of Mean Deviation From Mean

Coefficient of M.D from Mean = $\dfrac{\text{M.D from Mean}}{\text{Mean}}$

Or

Coefficient of M.D from Mean = $\dfrac{\text{M.D}}{\bar{X}}$

4. Coefficient of Mean Deviation From Median

Coefficient of M.D from Median = $\dfrac{\text{M.D from Median}}{\text{Median}}$

Or

Coefficient of M.D from Median = $\dfrac{\text{M.D}}{\text{Median}}$

5. Coefficient of Standard Deviation

Coefficient of S.D = $\dfrac{S}{\bar{X}}$

6. Coefficient of Variation (C.V)
“The coefficient of variation expresses the standard deviation as a percentage of the arithmetic mean.” It is used as a criterion of consistent performance: the smaller the coefficient of variation, the more consistent the performance.

Or
“The coefficient of variation is used to compare the variability of two or more series.”

Coefficient of Variation = C.V = $\dfrac{S}{\bar{X}} \times 100$
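To show how the coefficient of variation (standard deviation as a percentage of the mean) is used as a criterion of consistency, here is a short Python sketch comparing two made-up series with the same mean. The series, names and scores are purely illustrative, and the standard deviation is taken with divisor n.

```python
from statistics import fmean, pstdev

def coefficient_of_variation(values):
    """C.V = (S / mean) * 100, with S computed using divisor n."""
    return pstdev(values) / fmean(values) * 100

# Hypothetical scores of two batsmen over five innings
batsman_a = [40, 45, 50, 55, 60]
batsman_b = [10, 30, 50, 70, 90]

cv_a = coefficient_of_variation(batsman_a)
cv_b = coefficient_of_variation(batsman_b)
print(round(cv_a, 2), round(cv_b, 2))
print("A is more consistent" if cv_a < cv_b else "B is more consistent")
```

Both series average 50, but the first has the smaller C.V and is therefore judged the more consistent performer. Because C.V is a pure number, it can also compare series measured in different units.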

Relationship Between Measures of Dispersion

1. For Normal Distribution

I. Mean Deviation = M.D = 0.7979 S.D

II. Quartile Deviation = Q.D = 0.6745 S.D

2. For Moderately Skewed Distribution

I. Mean Deviation = M.D = $\tfrac{4}{5}$ S.D

II. Quartile Deviation = Q.D = $\tfrac{2}{3}$ S.D

III. Quartile Deviation = Q.D = $\tfrac{5}{6}$ M.D

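Moments
A moment designates the power to which deviations are raised before averaging them.

Types of Moments
1. Moments about Mean or Central Moments
2. Moments about Origin or Zero
3. Moments about Provisional Mean or Arbitrary Value (Non Central Moments)

1. Moments about Mean or Central Moments

For Ungrouped Data

$m_r = \dfrac{\sum (X - \bar{X})^r}{n}$, r = 1, 2, 3, 4 (so that $m_1 = 0$ and $m_2 = S^2$)

For Grouped Data

$m_r = \dfrac{\sum f (X - \bar{X})^r}{\sum f}$, r = 1, 2, 3, 4

2. Moments about Origin or Zero

For Ungrouped Data

$m'_r = \dfrac{\sum X^r}{n}$, r = 1, 2, 3, 4

For Grouped Data

$m'_r = \dfrac{\sum f X^r}{\sum f}$, r = 1, 2, 3, 4

3. Moments about Provisional Mean or Arbitrary Value (Non Central Moments)

Methods of Computing Moments about an Arbitrary Value
i. Direct Method
ii. Short Cut Method
iii. Coding Method or Step-Deviation Method

i. Direct Method

For Ungrouped Data

$m'_r = \dfrac{\sum (X - A)^r}{n}$, r = 1, 2, 3, 4

Where A is constant

For Grouped Data

$m'_r = \dfrac{\sum f (X - A)^r}{\sum f}$, r = 1, 2, 3, 4

Where A is constant

ii. Short Cut Method

For Ungrouped Data

$m'_r = \dfrac{\sum D^r}{n}$, r = 1, 2, 3, 4

Where D = X - A

For Grouped Data

$m'_r = \dfrac{\sum f D^r}{\sum f}$, r = 1, 2, 3, 4

Where D = X - A

iii. Coding Method or Step Deviation Method

For Ungrouped Data

$m'_r = h^r \times \dfrac{\sum U^r}{n}$, r = 1, 2, 3, 4

Where $U = \dfrac{X - A}{h}$

For Grouped Data

$m'_r = h^r \times \dfrac{\sum f U^r}{\sum f}$, r = 1, 2, 3, 4

Where $U = \dfrac{X - A}{h}$

Relation Between Central Moments in Terms of Non Central Moments

$m_1 = 0$
$m_2 = m'_2 - (m'_1)^2$
$m_3 = m'_3 - 3 m'_2 m'_1 + 2 (m'_1)^3$
$m_4 = m'_4 - 4 m'_3 m'_1 + 6 m'_2 (m'_1)^2 - 3 (m'_1)^4$

Moment Ratios

$b_1 = \dfrac{m_3^2}{m_2^3}$, $b_2 = \dfrac{m_4}{m_2^2}$

Sheppard’s Corrections for Moments of Grouped Data

$m_2 \text{ (corrected)} = m_2 - \dfrac{h^2}{12}$
$m_3 \text{ (corrected)} = m_3$
$m_4 \text{ (corrected)} = m_4 - \dfrac{h^2}{2} m_2 + \dfrac{7}{240} h^4$

Where h is the class interval.

Charlier’s Check

i. $\sum f(U + 1) = \sum fU + \sum f$

ii. $\sum f(U + 1)^2 = \sum fU^2 + 2\sum fU + \sum f$

iii. $\sum f(U + 1)^3 = \sum fU^3 + 3\sum fU^2 + 3\sum fU + \sum f$

iv. $\sum f(U + 1)^4 = \sum fU^4 + 4\sum fU^3 + 6\sum fU^2 + 4\sum fU + \sum f$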
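The standard relations that express central moments in terms of moments about an arbitrary value A, namely $m_2 = m'_2 - (m'_1)^2$, $m_3 = m'_3 - 3m'_2 m'_1 + 2(m'_1)^3$ and $m_4 = m'_4 - 4m'_3 m'_1 + 6m'_2 (m'_1)^2 - 3(m'_1)^4$, can be checked numerically. The following Python sketch uses made-up data and an arbitrary A = 5; the function names are hypothetical.

```python
from math import isclose

def moment_about(values, a, r):
    """r-th moment about the value a: sum((X - a)^r) / n."""
    return sum((x - a) ** r for x in values) / len(values)

def central_from_raw(m1, m2, m3, m4):
    """Convert moments about an arbitrary value into central moments."""
    c2 = m2 - m1 ** 2
    c3 = m3 - 3 * m2 * m1 + 2 * m1 ** 3
    c4 = m4 - 4 * m3 * m1 + 6 * m2 * m1 ** 2 - 3 * m1 ** 4
    return c2, c3, c4

data = [2, 3, 7, 8, 10]   # illustrative data only
A = 5                     # arbitrary provisional mean (assumed)

raw = [moment_about(data, A, r) for r in (1, 2, 3, 4)]
converted = central_from_raw(*raw)

mean = sum(data) / len(data)
direct = tuple(moment_about(data, mean, r) for r in (2, 3, 4))

print(converted)
print(all(isclose(c, d) for c, d in zip(converted, direct)))  # True
```

Working about a convenient A and converting afterwards is what makes the short cut and coding methods practical for hand calculation.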

Symmetry
In a symmetrical distribution, a deviation below the mean exactly equals the corresponding deviation above the mean. This property is called symmetry. For a symmetrical distribution the following relations hold.

Mean = Median = Mode

$Q_3$ - Median = Median - $Q_1$

Skewness
Skewness is the lack of symmetry in a distribution around some central value, i.e. the mean, median or mode. It is the degree of asymmetry. For a skewed distribution:

Mean ≠ Median ≠ Mode

$Q_3$ - Median ≠ Median - $Q_1$

There are two types of skewness.

1. Positive Skewness
If the frequency curve has a longer tail to the right, the distribution is said to be positively skewed.

2. Negative Skewness
If the frequency curve has a longer tail to the left, the distribution is said to be negatively skewed.

Coefficient of Skewness (SK)

Karl Pearson’s Coefficient of Skewness

SK = $\dfrac{\text{Mean} - \text{Mode}}{\text{S.D}}$

SK = $\dfrac{3(\text{Mean} - \text{Median})}{\text{S.D}}$

Bowley’s Quartile Coefficient of Skewness

SK = $\dfrac{Q_3 + Q_1 - 2\,\text{Median}}{Q_3 - Q_1}$

Moment Coefficient of Skewness

SK = $\dfrac{m_3}{\sqrt{m_2^3}}$

Kurtosis
The moment coefficient is an important measure of kurtosis. It is defined as:

$b_2 = \dfrac{m_4}{m_2^2}$

The moment coefficient is a pure number and is independent of the origin and unit of measurement.

If $b_2 > 3$, the distribution is Leptokurtic

If $b_2 = 3$, the distribution is Normal or Mesokurtic

If $b_2 < 3$, the distribution is Platykurtic

Or, using the percentile coefficient of kurtosis $K = \dfrac{\text{Q.D}}{P_{90} - P_{10}}$:

For Normal distribution, K = 0.263
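The moment-based measures of shape can be sketched in Python as follows: the moment coefficient of skewness is taken as $m_3 / \sqrt{m_2^3}$, Pearson's second coefficient as 3(Mean - Median)/S.D, and the moment coefficient of kurtosis as $b_2 = m_4 / m_2^2$. The data sets and function names are illustrative assumptions only.

```python
from math import sqrt
from statistics import fmean, median

def central_moment(values, r):
    """r-th central moment: sum((X - mean)^r) / n."""
    m = fmean(values)
    return sum((x - m) ** r for x in values) / len(values)

def moment_skewness(values):
    """Moment coefficient of skewness: m3 / sqrt(m2^3)."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5

def pearson_skewness(values):
    """Pearson's coefficient using the median: 3 * (mean - median) / S.D."""
    sd = sqrt(central_moment(values, 2))
    return 3 * (fmean(values) - median(values)) / sd

def moment_kurtosis(values):
    """b2 = m4 / m2^2; compare with 3 (the value for a normal distribution)."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2

symmetric = [1, 2, 3, 4, 5]       # illustrative data only
right_tailed = [1, 2, 3, 4, 10]   # one large value stretches the right tail

print(moment_skewness(symmetric))      # 0.0
print(moment_skewness(right_tailed) > 0, pearson_skewness(right_tailed) > 0)
print(moment_kurtosis(symmetric))      # less than 3, i.e. platykurtic
```

With only five observations these figures are just arithmetic illustrations; conclusions about the shape of a distribution need a much larger data set.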

Chapter 4 INDEX NUMBERS

Index Numbers
“A relative number which indicates the relative change in a group of variables collected at different times.” An index number is a device for estimating trends in prices, wages, production and other economic variables. It is also known as an economic barometer.

Or

An index number is a number that measures a relative change in a variable, or an average relative change in a group of related variables, with respect to a base. The base may be a particular time, place or professional class with reference to which changes are to be measured.

Types of Index Numbers
There are three types of index numbers which are commonly used.

1. Price Index Numbers
A price index number is a number that measures the relative change in the prices of a group of commodities with respect to a base.

2. Quantity Index Numbers
These index numbers measure changes in the volume or quantity of goods produced or consumed.

3. Aggregative Index Numbers
These index numbers are used to measure changes in a phenomenon like the cost of living, total industrial production, etc.

Classification of Index Numbers
Index numbers are generally classified as:
1. Simple index numbers.
2. Composite index numbers.

1. Simple Index Number
A simple index number measures a relative change in a single variable with respect to a base. Such variables include prices, quantities, the cost of living, etc.

Or
If an index is based on a single variable only, then it is known as a simple index number, for example an index number of the price of Banaspati Ghee, an index number of carpets exported to the Middle East, etc.

1.1. Fixed Base Method

Price Relative
Price relatives are obtained by dividing the price in the current year by the price in the base year and expressing the result as a percentage.

Price Relative = $\dfrac{p_n}{p_o} \times 100$

Where $p_n$ = current year price, $p_o$ = base year price

Quantity Relative
Quantity relatives are obtained by dividing the quantity in the current year by the quantity in the base year and expressing the result as a percentage.

Quantity Relative = $\dfrac{q_n}{q_o} \times 100$

Where $q_n$ = current year quantity, $q_o$ = base year quantity

[Chart: classification of index numbers. An index number is either Simple or Composite; a composite index is either Un-weighted or Weighted; each of these is computed by the Aggregative method or the Average of Relatives method.]

1.2. Chain Base Method
In this method the index number is computed in two steps. As a first step, we calculate the link relative by dividing the price/quantity/value of the current period by that of the immediately preceding period and expressing this ratio as a percentage.

Link Relative (Price) = $\dfrac{p_n}{p_{n-1}} \times 100$

Link Relative (Quantity) = $\dfrac{q_n}{q_{n-1}} \times 100$

In the second step, the link relatives are chained back to the base period. To get the chain index of the current period we multiply the link relative of the current period by the chain index of the immediately preceding period and divide this product by 100. The chain index of the first period is taken as 100.

Chain Index = $\dfrac{\text{Link relative of current period} \times \text{Chain index of previous period}}{100}$
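A small Python sketch of the chain base method for a single commodity: link relatives compare each period with the one immediately before it, and each chain index is the current link relative multiplied by the previous period's chain index, divided by 100. The prices are made up and the function names are hypothetical; the first period is assumed to be the base with index 100.

```python
from math import isclose

def link_relatives(prices):
    """Link relative for each period after the first: p_n / p_(n-1) * 100."""
    return [curr / prev * 100 for prev, curr in zip(prices, prices[1:])]

def chain_indices(prices):
    """Chain indices, taking the first period as the base (index 100)."""
    indices = [100.0]
    for link in link_relatives(prices):
        indices.append(link * indices[-1] / 100)
    return indices

def fixed_base_relatives(prices):
    """Price relatives with the first period as the fixed base."""
    base = prices[0]
    return [p / base * 100 for p in prices]

prices = [20, 25, 30, 33]   # illustrative yearly prices of one commodity

print(link_relatives(prices))
print(chain_indices(prices))
print(all(isclose(c, f) for c, f in
          zip(chain_indices(prices), fixed_base_relatives(prices))))  # True
```

For a single commodity the chain indices coincide with the fixed base price relatives, as the last line checks. The two methods generally differ only for composite indices, where the link relatives of several commodities are averaged before chaining.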

2. Composite Index Numbers
A composite index number is a number that measures an average relative change in a group of related variables with respect to a base. Composite index numbers are further classified as:
2.1. Un-weighted Composite Index Number
2.2. Weighted Composite Index Number

2.1. Un-weighted Composite Index Number
In un-weighted index numbers, weights are not assigned to the various items. The following methods are generally used for the construction of un-weighted index numbers.

a. Simple Aggregative Price Index
In the calculation of a composite index number we are always given the prices or quantities of two or more commodities. In the simple aggregative method we take the year-wise total of the commodities involved and then apply the fixed base or chain base method, as the case may be.

Fixed Base Method
Under this method, to construct a price/quantity index, the total of the current year prices/quantities of the commodities in question is divided by the total of the base year prices/quantities and the result is expressed as a percentage. Symbolically:

$P_{on} = \dfrac{\sum p_n}{\sum p_o} \times 100$ and $Q_{on} = \dfrac{\sum q_n}{\sum q_o} \times 100$

Chain Base Method
Under this method, as a first step we compute the link relative for each year by dividing the current year total of prices/quantities by the immediately preceding year's total and expressing the result as a percentage. To get the chain indices we then multiply each year's link relative by the previous year's chain index and divide this product by 100.

b. Simple Average of Price Relatives Simple Average of Relatives method is further sub-divided into two methods:

Fixed Base Method

Under the simple average of relatives by the fixed base method, we first find the price/quantity/value relatives for each commodity given in the problem and then average these relatives using the arithmetic mean, the median or the geometric mean. The resulting averages are known as index numbers by the simple average of relatives method.

Chain Base Method
Under the simple average of relatives by the chain base method, we first find the link relatives for the given commodities. As a second step, we take an average (arithmetic mean, median or geometric mean) of the link relatives for each period. In the third step we find the chain indices by the same procedure as is used for the chain indices of a single commodity.

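2.2. Weighted Composite Index Number
In weighted index numbers, weights are assigned in proportion to the relative importance of the different commodities included in the index. Weighted index numbers are of two types.

a. Weighted Aggregative Indices
These indices are just like the simple aggregative indices, with the basic difference that weights are assigned to the various commodities included in the index. There are various methods of assigning weights, and various formulas for constructing index numbers have been devised. Some of the most important ones are given below.

Price Index Numbers

(1). Laspeyre’s Price Index = $\dfrac{\sum p_n q_o}{\sum p_o q_o} \times 100$ (Base Year Weighted)

(2). Paasche’s Price Index = $\dfrac{\sum p_n q_n}{\sum p_o q_n} \times 100$ (Current Year Weighted)

(3). Marshall-Edgeworth Price Index = $\dfrac{\sum p_n (q_o + q_n)}{\sum p_o (q_o + q_n)} \times 100$

(4). Fisher’s Ideal Price Index = $\sqrt{\dfrac{\sum p_n q_o}{\sum p_o q_o} \times \dfrac{\sum p_n q_n}{\sum p_o q_n}} \times 100$

(5). Walsh Price Index = $\dfrac{\sum p_n \sqrt{q_o q_n}}{\sum p_o \sqrt{q_o q_n}} \times 100$

Quantity Index Numbers

(1). Laspeyre’s Quantity Index = $\dfrac{\sum q_n p_o}{\sum q_o p_o} \times 100$ (Base Year Weighted)

(2). Paasche’s Quantity Index = $\dfrac{\sum q_n p_n}{\sum q_o p_n} \times 100$ (Current Year Weighted)

(3). Marshall-Edgeworth Quantity Index = $\dfrac{\sum q_n (p_o + p_n)}{\sum q_o (p_o + p_n)} \times 100$

(4). Fisher’s Ideal Quantity Index = $\sqrt{\dfrac{\sum q_n p_o}{\sum q_o p_o} \times \dfrac{\sum q_n p_n}{\sum q_o p_n}} \times 100$

(5). Walsh Quantity Index = $\dfrac{\sum q_n \sqrt{p_o p_n}}{\sum q_o \sqrt{p_o p_n}} \times 100$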
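The weighted aggregative price indices can be sketched in Python as below, using the standard definitions: Laspeyres weights prices by base year quantities, Paasche by current year quantities, Marshall-Edgeworth by the sum of both, and Fisher's ideal index is the geometric mean of Laspeyres and Paasche. The two-commodity data and the function names are made up for illustration.

```python
from math import sqrt

def laspeyres_price(p0, q0, pn, qn):
    """sum(pn*q0) / sum(p0*q0) * 100 (base year quantities as weights)."""
    return sum(p * q for p, q in zip(pn, q0)) / sum(p * q for p, q in zip(p0, q0)) * 100

def paasche_price(p0, q0, pn, qn):
    """sum(pn*qn) / sum(p0*qn) * 100 (current year quantities as weights)."""
    return sum(p * q for p, q in zip(pn, qn)) / sum(p * q for p, q in zip(p0, qn)) * 100

def marshall_edgeworth_price(p0, q0, pn, qn):
    """sum(pn*(q0+qn)) / sum(p0*(q0+qn)) * 100."""
    w = [a + b for a, b in zip(q0, qn)]
    return sum(p * q for p, q in zip(pn, w)) / sum(p * q for p, q in zip(p0, w)) * 100

def fisher_price(p0, q0, pn, qn):
    """Geometric mean of the Laspeyres and Paasche indices."""
    return sqrt(laspeyres_price(p0, q0, pn, qn) * paasche_price(p0, q0, pn, qn))

# Illustrative data for two commodities
p0, q0 = [10, 20], [5, 2]   # base year prices and quantities
pn, qn = [12, 25], [6, 3]   # current year prices and quantities

for index in (laspeyres_price, paasche_price, marshall_edgeworth_price, fisher_price):
    print(index.__name__, round(index(p0, q0, pn, qn), 2))
```

Swapping the roles of prices and quantities in the same functions gives the corresponding quantity indices.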

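b. Weighted Average of Relative Indices
Under this method we attach weights to the price relatives or quantity relatives. We first find the price or quantity relatives in the same way as for the simple average of relatives, but we then take a weighted average of the calculated relatives. In each formula below, I denotes the relative and W its weight. The important types of weighted average of relatives are given below:

Price Index Numbers

(1). Laspeyre’s Price Index = $\dfrac{\sum I W}{\sum W}$ (Base Year Weighted)

Where Price Relative = $I = \dfrac{p_n}{p_o} \times 100$, $W = p_o q_o$

(2). Paasche’s Price Index = $\dfrac{\sum I W}{\sum W}$ (Current Year Weighted)

Where Price Relative = $I = \dfrac{p_n}{p_o} \times 100$, $W = p_o q_n$

(3). Palgrave’s Price Index = $\dfrac{\sum I W}{\sum W}$

Where Price Relative = $I = \dfrac{p_n}{p_o} \times 100$, $W = p_n q_n$

Quantity Index Numbers

(1). Laspeyre’s Quantity Index = $\dfrac{\sum I W}{\sum W}$ (Base Year Weighted)

Where Quantity Relative = $I = \dfrac{q_n}{q_o} \times 100$, $W = p_o q_o$

(2). Paasche’s Quantity Index = $\dfrac{\sum I W}{\sum W}$ (Current Year Weighted)

Where Quantity Relative = $I = \dfrac{q_n}{q_o} \times 100$, $W = p_n q_o$

(3). Palgrave’s Quantity Index = $\dfrac{\sum I W}{\sum W}$

Where Quantity Relative = $I = \dfrac{q_n}{q_o} \times 100$, $W = p_n q_n$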
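As a hedged check on the weighted average of relatives method, the Python sketch below computes price relatives $I = p_n / p_o \times 100$ and averages them with base year value weights $W = p_o q_o$. This choice of weights is an assumption taken from the usual textbook definition; with it, the result should coincide with the Laspeyres aggregative index $\sum p_n q_o / \sum p_o q_o \times 100$. The data and function names are illustrative.

```python
from math import isclose

def price_relatives(p0, pn):
    """Price relative of each commodity: p_n / p_o * 100."""
    return [curr / base * 100 for base, curr in zip(p0, pn)]

def weighted_average_of_relatives(relatives, weights):
    """sum(I * W) / sum(W)."""
    return sum(i * w for i, w in zip(relatives, weights)) / sum(weights)

# Illustrative data for two commodities
p0, q0 = [10, 20], [5, 2]   # base year prices and quantities
pn = [12, 25]               # current year prices

relatives = price_relatives(p0, pn)
base_value_weights = [p * q for p, q in zip(p0, q0)]   # W = p_o * q_o (assumed)

index = weighted_average_of_relatives(relatives, base_value_weights)
laspeyres = sum(p * q for p, q in zip(pn, q0)) / sum(p * q for p, q in zip(p0, q0)) * 100

print(round(index, 2))
print(isclose(index, laspeyres))  # True
```

The agreement follows algebraically, since $I \times W = (p_n / p_o)(p_o q_o) \times 100 = p_n q_o \times 100$ for each commodity.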

Uses of Index Numbers

1. The price index numbers are used to measure change in the price of a commodities. It helps in comparing the changes in the prices of one commodity with another.

2. The quantity Index number is used to measure the change in quantities produced, Purchased, Sold etc.

3. The Index numbers of industrial production are used to measure the changes in the level of industrial production in the country.

4. The index number is used to measure the change in enrolment of performance etc.

5. The index numbers of import prices and export prices are used to measure the change in the terms of trade of a country.

6. The index numbers are used to measure seasonal variation and cyclical variation in a time series.

7. The index numbers measure the purchasing power of money and determine the real wages.

Limitations of Index Numbers

1. All index numbers are not suitable for all purposes. They are suitable only for the purpose for which they are constructed.

2. Comparisons of changes in variables over long periods are not reliable.

3. Index numbers are subject to sampling error.

4. It is not possible to take into account all changes in the quality of products.

5. The index numbers obtained by different methods of construction may give different results.

Consumer Price Index Numbers (CPI)

“Consumer Price Index numbers are intended to measure the changes in the prices paid by the consumer for purchasing a specified “basket” of goods and services during the current year as compared to the base year”. The basket of goods and services contains items like food, house rent, clothing, fuel and light, education, and miscellaneous items like washing, transport, newspapers, etc. Consumer price index numbers are also called cost of living index numbers or retail price index numbers.

Wholesale Price Index Numbers (WPI)

The Wholesale Price Index number is constructed to measure the change in prices of products produced by different sectors of an economy and traded in wholesale markets. The sectors covered under this index are agriculture, industry, etc. The Federal Bureau of Statistics is also engaged in constructing this index by using the weighted average of price relatives. Almost all the steps discussed in the topic “steps involved in the construction of an index number” are taken into consideration for constructing this index.

Construction of Consumer Price Index Numbers

The following steps are involved in the construction of consumer price index numbers.

1. Scope


The first step is to clearly specify the class of people and locality where they reside. As far as possible a homogeneous group of persons regarding their income and consumption pattern are considered. These groups may be school teachers, industrial workers, Officers, etc residing in a particular well defined area.

2. Household Budget Inquiry and Allocation of Weights

The next step is to conduct a household budget inquiry of the category of people concerned. The object of conducting a family budget inquiry is to determine the goods and services to be included in the construction of index numbers. This step has many practical problems, as no two households have the same income and consumption pattern. Therefore, the inquiry should include questions on family size, number of earners, the quantity and quality of goods and services consumed and the money spent on them under various headings, such as clothing and footwear, fuel and lighting, housing, miscellaneous, etc. The weights are then assigned to the various groups in proportion to the money spent on them.

3. Collection of Consumer Prices

The collection of retail prices is a very important and at the same time very difficult task, because prices may vary from place to place and from shop to shop. The prices of the selected items, both for the given and the base period, are obtained from the locality where the people reside or from where they make their purchases.

4. Method of Compilation of Consumer Price Index Numbers

The consumer price index number is then compiled by any one of the following methods.

Aggregative Expenditure Method

In this method the quantities consumed in the base year are taken as weights. If $p_0$ and $q_0$ are the price and quantity of the base period and $p_n$ and $q_n$ are the price and quantity of the given year, then

$P_{0n} = \frac{\sum p_n q_0}{\sum p_0 q_0} \times 100$

Where $\sum p_n q_0$ = Aggregative expenditure in the given year.

$\sum p_0 q_0$ = Aggregative expenditure in the base year

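Household Budget Method or Family Budget Method

In this method the amounts of expenditure incurred by the household on the various items in the base period are taken as weights. If $p_0$ and $p_n$ are the prices of the base and given year and the weight is $W = p_0 q_0$, where $q_0$ is the quantity of the base period, then

$P_{0n} = \frac{\sum W I}{\sum W}$

Where Price Relative $I = \frac{p_n}{p_0} \times 100$, $W = p_0 q_0$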
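The two compilation methods can be compared numerically. The sketch below assumes the usual forms, the aggregative expenditure index as sum(pn*q0)/sum(p0*q0)*100 and the family budget index as sum(W*I)/sum(W) with W = p0*q0 and I = pn/p0*100. The four-item basket and all figures are hypothetical, chosen only for illustration.

```python
def cpi_aggregative(p0, pn, q0):
    """Aggregative expenditure method: sum(pn*q0) / sum(p0*q0) * 100."""
    given = sum(p * q for p, q in zip(pn, q0))   # expenditure in the given year
    base = sum(p * q for p, q in zip(p0, q0))    # expenditure in the base year
    return given / base * 100


def cpi_family_budget(p0, pn, q0):
    """Family budget method: sum(W*I) / sum(W), W = p0*q0, I = pn/p0*100."""
    weights = [p * q for p, q in zip(p0, q0)]
    relatives = [b / a * 100 for a, b in zip(p0, pn)]
    return sum(w * i for w, i in zip(weights, relatives)) / sum(weights)


# Hypothetical basket: food, rent, clothing, fuel
p0 = [10.0, 50.0, 20.0, 5.0]   # base-year prices
pn = [12.0, 60.0, 22.0, 8.0]   # given-year prices
q0 = [30.0, 1.0, 4.0, 10.0]    # base-year quantities

cpi_a = cpi_aggregative(p0, pn, q0)
cpi_f = cpi_family_budget(p0, pn, q0)
print(round(cpi_a, 2), round(cpi_f, 2))
```

With base-year weights the two methods are algebraically the same index, so both calls return the same figure for any basket.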

Theoretical Tests for Index Numbers

From a theoretical viewpoint, a “good” index number formula is required to satisfy the following tests proposed by Irving Fisher (1867-1947).

1. Time Reversal Test

This test may be stated as follows: “If the time subscripts of a price (or quantity) index number formula be interchanged, the resulting price (or quantity) index number formula should be the reciprocal of the original formula”, i.e. (with the index numbers expressed as ratios rather than percentages)

$P_{0n} = \frac{1}{P_{n0}}$

or

$P_{0n} \times P_{n0} = 1$

Fisher’s and Marshall-Edgeworth index numbers satisfy the Time Reversal Test. Laspeyre’s and Paasche’s index numbers do not satisfy the Time Reversal Test.


2. Factor Reversal Test

This test may be stated as follows: “If the factors, prices and quantities, occurring in a price (or quantity) index number formula be interchanged so that a quantity (or price) index formula is obtained, then the product of the two index numbers should give the true value index number”, i.e.

$P_{0n} \times Q_{0n} = \frac{\sum p_n q_n}{\sum p_0 q_0}$

Only Fisher’s index number satisfies the Factor Reversal Test. Laspeyre’s, Paasche’s and Marshall-Edgeworth index numbers do not satisfy the Factor Reversal Test.

3. Circular Test

This test may be stated as follows: “If an index for the year ‘b’ based upon the year ‘a’ is $P_{ab}$ and for the year ‘c’ based upon the year ‘b’ is $P_{bc}$, then the circular test requires that the index for the year ‘c’ based upon the year ‘a’, i.e. $P_{ac}$, should be the same as if it were compounded of these two stages”, i.e.

$P_{ab} \times P_{bc} = P_{ac}$

Laspeyre’s, Paasche’s, Fisher’s and Marshall-Edgeworth index numbers do not, in general, satisfy the Circular Test. It is met by the simple aggregative index and by the fixed-weight aggregative index.
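The time reversal and factor reversal tests can be checked numerically. The following is a minimal sketch, assuming the usual Laspeyres, Paasche and Fisher formulas written as ratios (not multiplied by 100). The helper names and the three-commodity data are hypothetical.

```python
from math import sqrt, isclose


def agg(prices, quantities):
    """Aggregate sum(p * q) over all commodities."""
    return sum(p * q for p, q in zip(prices, quantities))


def laspeyres(p0, pn, q0, qn):
    return agg(pn, q0) / agg(p0, q0)      # base-year quantities as weights


def paasche(p0, pn, q0, qn):
    return agg(pn, qn) / agg(p0, qn)      # given-year quantities as weights


def fisher(p0, pn, q0, qn):
    return sqrt(laspeyres(p0, pn, q0, qn) * paasche(p0, pn, q0, qn))


# Hypothetical prices and quantities for three commodities
p0, pn = [2.0, 5.0, 4.0], [3.0, 6.0, 5.0]
q0, qn = [10.0, 4.0, 6.0], [8.0, 5.0, 7.0]
value_ratio = agg(pn, qn) / agg(p0, q0)   # true value index

for name, index in [("Laspeyres", laspeyres), ("Paasche", paasche), ("Fisher", fisher)]:
    forward = index(p0, pn, q0, qn)
    backward = index(pn, p0, qn, q0)      # time subscripts interchanged
    quantity = index(q0, qn, p0, pn)      # prices and quantities interchanged
    print(name,
          "time reversal:", isclose(forward * backward, 1.0),
          "factor reversal:", isclose(forward * quantity, value_ratio))
```

Swapping the argument order is all that is needed to interchange time subscripts or factors, because each formula is symmetric in how it treats the two periods and the two factors.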

Chapter 5 REGRESSION & MULTIPLE REGRESSION

Regression

The dependence of one variable upon another variable is called regression. For example, weights depend upon heights.

OR

Regression is a mathematical relationship between one dependent and one independent variable. For example, demand depends upon price, where price is the independent variable and demand is the dependent variable.

Linear Regression

When the dependence of the variable is represented by a straight line, it is called linear regression; otherwise it is said to be non-linear or curvilinear regression. For example, if X is the independent variable and Y is the dependent variable, then the relation Y = a + bX is called linear regression.

Properties of Least Square Regression line or Regression line

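1. The least square regression line always passes through the mean values, i.e. ($\bar{X}$, $\bar{Y}$).

2. The regression coefficients, i.e. b and d, always have the same sign.

3. The sum of deviations of the observed values from the estimated values is always equal to zero, i.e. $\sum (Y - \hat{Y}) = 0$, $\sum (X - \hat{X}) = 0$

4. The sum of squared deviations between the observed and estimated values is always a minimum, i.e. $\sum (Y - \hat{Y})^2$ = minimum, $\sum (X - \hat{X})^2$ = minimum

5. The sum of the trend values is always equal to the sum of the observed values, i.e. $\sum \hat{Y} = \sum Y$, $\sum \hat{X} = \sum X$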

Types of Linear Regression / Regression Equations

Regression equations are the algebraic expressions of the regression lines. There are two regression equations, because there are two regression lines. These are:

1. Regression Equation of Y on X
2. Regression Equation of X on Y

1. Regression Equation of Y on X

b is the regression coefficient of the regression line of Y on X.

Linear Regression / Regression line / Least square regression line

$\hat{Y} = a + bX$

Or

$\hat{Y} = a + b_{yx}X$

Or

$\hat{Y} - \bar{Y} = b_{yx}(X - \bar{X})$

General Method

Normal Equations

$\sum Y = na + b\sum X$

$\sum XY = a\sum X + b\sum X^2$

We get the values of “a” and “b” by solving the above equations simultaneously.
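As an illustration, the two normal equations of the line of Y on X, sum(Y) = n*a + b*sum(X) and sum(XY) = a*sum(X) + b*sum(X^2), can be solved in closed form. The sketch below does this for a small hypothetical data set and checks two of the properties of the least squares line. The function name `fit_line` is an assumption made for the example.

```python
def fit_line(xs, ys):
    """Solve the normal equations of Y = a + bX for a and b."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    b = (n * sxy - sx * sy) / (n * sxx - sx ** 2)   # regression coefficient
    a = (sy - b * sx) / n                           # same as mean(Y) - b * mean(X)
    return a, b


# Hypothetical paired observations
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = fit_line(xs, ys)
estimated = [a + b * x for x in xs]
residual_sum = sum(y - e for y, e in zip(ys, estimated))

print("a =", round(a, 4), "b =", round(b, 4))
print("sum of residuals =", round(residual_sum, 10))
```

The residuals sum to zero and the fitted line passes through the point of means, which matches the properties listed for the least squares regression line.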

Alternative Methods

Direct formulas for obtaining the values of “a” and “b”

Direct formula of “a”

(1).

(2).

(3).

Direct formula of “b”

(1).

(2).

(3).

(4).

(5). When

(6).


(7).

Where A= Constant

B= Constant

(8).

(9).

Where

,

,

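2. Regression Equation of X on Y

d is the regression coefficient of the regression line of X on Y.

Linear Regression / Regression line / Least square regression line

$\hat{X} = c + dY$

Or

$\hat{X} = c + b_{xy}Y$

Or

$\hat{X} - \bar{X} = b_{xy}(Y - \bar{Y})$

General Method

Normal Equations

$\sum X = nc + d\sum Y$

$\sum XY = c\sum Y + d\sum Y^2$

We get the values of “c” and “d” by solving the above equations simultaneously.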

Alternative Methods


Direct formula of obtaining the value of “c” and “d”

Direct formula of “c”

(1).

(2).

(3).

Direct formula of “d”

(1).

(2).

(3).

(4).

(5). When

(6).

(7).

Where A= Constant


B= Constant

(8).

(9).

Where

,

,

Scatter Diagram

If we plot the paired observations on a graph, the resulting set of points is called a scatter diagram.

Standard Deviation of Regression or Standard Error of Estimate

The observed values of (X, Y) do not all fall on the regression line but scatter away from it. The degree of scatter (or dispersion) of the observed values about the regression line is measured by what is called the standard deviation of regression or the standard error of estimate of Y on X and of X on Y.

1. Y on X ( )

For Ungrouped Data

Or

[Figure: scatter diagram of the paired observations, with X on the horizontal axis and Y on the vertical axis]


Where $\hat{Y}$ = Trend (estimated) values

For Grouped Data

Where k is constant

2. X on Y ( )

For Ungrouped Data

Or

Where $\hat{X}$ = Trend (estimated) values

For Grouped Data

Where h is constant

Multiple Regression

A regression which involves two or more independent variables is called a multiple regression. For example, the yield of a crop depends upon the fertility of the land, the fertilizer applied, rainfall, the quality of seeds, etc. Likewise, the systolic blood pressure of a person depends upon one’s weight, age, etc.

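Multiple Linear Regression With Two Independent Variables

Multiple Regression Line

$Y = \alpha + \beta_1 X_1 + \beta_2 X_2 + \varepsilon$

The estimated multiple linear regression based on sample data is

$\hat{Y} = a + b_1 X_1 + b_2 X_2$

Normal Equations are

$\sum Y = na + b_1\sum X_1 + b_2\sum X_2$

$\sum X_1 Y = a\sum X_1 + b_1\sum X_1^2 + b_2\sum X_1 X_2$

$\sum X_2 Y = a\sum X_2 + b_1\sum X_1 X_2 + b_2\sum X_2^2$

We get the values of “a”, “$b_1$” and “$b_2$” by solving the above equations simultaneously.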


Types of Multiple Linear Regression / Multiple Regression Equations With Two Independent Variables

Multiple regression equations are the algebraic expressions of the regression lines. With three variables $X_1$, $X_2$ and $X_3$ there are three regression equations, because each variable in turn can be taken as the dependent variable. These are:

1. Multiple Regression Equation of $X_1$ on $X_2$ and $X_3$

2. Multiple Regression Equation of $X_2$ on $X_1$ and $X_3$

3. Multiple Regression Equation of $X_3$ on $X_1$ and $X_2$

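1. Multiple Regression Equation of $X_1$ on $X_2$ and $X_3$

$b_{12.3}$ and $b_{13.2}$ are the regression coefficients of the multiple regression line of $X_1$ on $X_2$ and $X_3$.

Multiple Linear Regression / Multiple Regression line / Least Square Multiple Regression line

$\hat{X}_1 = a + b_{12.3}X_2 + b_{13.2}X_3$

Or

$\hat{X}_1 - \bar{X}_1 = b_{12.3}(X_2 - \bar{X}_2) + b_{13.2}(X_3 - \bar{X}_3)$

General Method

Normal Equations

$\sum X_1 = na + b_{12.3}\sum X_2 + b_{13.2}\sum X_3$

$\sum X_1 X_2 = a\sum X_2 + b_{12.3}\sum X_2^2 + b_{13.2}\sum X_2 X_3$

$\sum X_1 X_3 = a\sum X_3 + b_{12.3}\sum X_2 X_3 + b_{13.2}\sum X_3^2$

We get the values of “a”, “$b_{12.3}$” and “$b_{13.2}$” by solving the above equations simultaneously.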

Alternative MethodsDirect formula of obtaining the value of “a”, “ ” and “ ”

Direct formula of “a”

Direct formula of “ ”

(1).

(2). Where r=correlation

(3).

Direct formula of “ ”


(1).

(2). Where r=correlation

(3).

Where

Direct Method to Solve the Multiple Regression Equation of $X_1$ on $X_2$ and $X_3$

Or

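2. Multiple Regression Equation of $X_2$ on $X_1$ and $X_3$

$b_{21.3}$ and $b_{23.1}$ are the regression coefficients of the multiple regression line of $X_2$ on $X_1$ and $X_3$.

Multiple Linear Regression / Multiple Regression line / Least Square Multiple Regression line

$\hat{X}_2 = a + b_{21.3}X_1 + b_{23.1}X_3$

Or

$\hat{X}_2 - \bar{X}_2 = b_{21.3}(X_1 - \bar{X}_1) + b_{23.1}(X_3 - \bar{X}_3)$

General Method

Normal Equations

$\sum X_2 = na + b_{21.3}\sum X_1 + b_{23.1}\sum X_3$

$\sum X_1 X_2 = a\sum X_1 + b_{21.3}\sum X_1^2 + b_{23.1}\sum X_1 X_3$

$\sum X_2 X_3 = a\sum X_3 + b_{21.3}\sum X_1 X_3 + b_{23.1}\sum X_3^2$

We get the values of “a”, “$b_{21.3}$” and “$b_{23.1}$” by solving the above equations simultaneously.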

Alternative Methods


Direct formula of obtaining the value of “a”, “ ” and “ ”

Direct formula of “a”

Direct formula of “ ”

(1).

(2). Where r=correlation

(3).

Direct formula of “ ”

(1).

(2). Where r=correlation

(3).

Where

Direct Method to Solve the Multiple Regression Equation of $X_2$ on $X_1$ and $X_3$

Or

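3. Multiple Regression Equation of $X_3$ on $X_1$ and $X_2$

$b_{31.2}$ and $b_{32.1}$ are the regression coefficients of the multiple regression line of $X_3$ on $X_1$ and $X_2$.

Multiple Linear Regression / Multiple Regression line / Least Square Multiple Regression line

$\hat{X}_3 = a + b_{31.2}X_1 + b_{32.1}X_2$

Or

$\hat{X}_3 - \bar{X}_3 = b_{31.2}(X_1 - \bar{X}_1) + b_{32.1}(X_2 - \bar{X}_2)$

General Method

Normal Equations

$\sum X_3 = na + b_{31.2}\sum X_1 + b_{32.1}\sum X_2$

$\sum X_1 X_3 = a\sum X_1 + b_{31.2}\sum X_1^2 + b_{32.1}\sum X_1 X_2$

$\sum X_2 X_3 = a\sum X_2 + b_{31.2}\sum X_1 X_2 + b_{32.1}\sum X_2^2$

We get the values of “a”, “$b_{31.2}$” and “$b_{32.1}$” by solving the above equations simultaneously.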

Alternative MethodsDirect formula of obtaining the value of “a”, “ ” and “ ”

Direct formula of “a”

Direct formula of “ ”

(1).

(2). Where r=correlation

(3).

Direct formula of “ ”

(1).

(2). Where r=correlation

(3).

Where


Direct Method to Solve the Multiple Regression Equation of $X_3$ on $X_1$ and $X_2$

Or

Chapter 6 CORRELATION , MULTIPLE AND PARTIAL CORRELATION

Correlation

The interdependence of two or more variables is called correlation.

Or

The linear relationship between two or more variables is called correlation. For example, an increase in the amount of rainfall will increase the sales of raincoats. Ages and weights of children are correlated with each other.

Positive Correlation

Correlation in the same direction is called positive correlation: if one variable increases the other also increases, and if one variable decreases the other also decreases. For example, an increase in the heights of children is usually accompanied by an increase in their weights. The length of an iron bar will increase as the temperature increases.

Negative Correlation

Correlation in the opposite direction is called negative correlation: if one variable increases the other decreases, and if one variable decreases the other increases. For example, the volume of a gas will decrease as the pressure increases.

No Correlation or Zero Correlation

If there is no relationship between two variables, it is called no correlation or zero correlation.

Coefficient of Correlation

It is a measure of the degree of interdependence between the variables. It is a pure number and lies between -1 and +1; the intermediate value of zero indicates the absence of linear correlation. It is denoted by r.

Properties of Correlation Coefficient

1. The correlation coefficient is symmetrical with respect to X and Y, i.e. $r_{xy} = r_{yx}$

2. The correlation coefficient is the geometric mean of the two regression coefficients, taking their common sign.

$r = \sqrt{b_{yx} \times b_{xy}}$

Or

$r = \sqrt{b \times d}$

3. The correlation coefficient is independent of origin and unit of measurement, i.e. $r_{xy} = r_{uv}$

4. The correlation coefficient lies between -1 and +1, i.e. $-1 \le r \le +1$

5. It is a pure number.
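These properties can be verified on a small data set. The sketch below assumes the product-moment formula r = [n*sum(XY) - sum(X)*sum(Y)] / sqrt{[n*sum(X^2) - (sum X)^2][n*sum(Y^2) - (sum Y)^2]}. The data and the coded variables u and v are hypothetical.

```python
from math import sqrt


def sums(xs, ys):
    """Return n*sum(XY) - sum(X)sum(Y), n*sum(X^2) - (sum X)^2, n*sum(Y^2) - (sum Y)^2."""
    n = len(xs)
    sxy = n * sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys)
    sxx = n * sum(x * x for x in xs) - sum(xs) ** 2
    syy = n * sum(y * y for y in ys) - sum(ys) ** 2
    return sxy, sxx, syy


def correlation(xs, ys):
    sxy, sxx, syy = sums(xs, ys)
    return sxy / sqrt(sxx * syy)


xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

sxy, sxx, syy = sums(xs, ys)
b = sxy / sxx                        # regression coefficient of Y on X
d = sxy / syy                        # regression coefficient of X on Y
r = correlation(xs, ys)

# Change of origin and (positive) scale: u = (X - 3) / 2, v = (Y - 10) / 5
us = [(x - 3) / 2 for x in xs]
vs = [(y - 10) / 5 for y in ys]

print("r =", round(r, 4))
print("symmetric:", abs(r - correlation(ys, xs)) < 1e-12)
print("r^2 = b*d:", abs(r * r - b * d) < 1e-12)
print("unchanged after coding:", abs(r - correlation(us, vs)) < 1e-9)
```

The coding here uses positive divisors; dividing one variable by a negative constant would reverse the sign of r while leaving its magnitude unchanged.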

Formulas of Correlation Coefficient

For ungrouped Data


(1).

(2).

(3).

(4).

(5).

(6).

Where ,

(7).

Where ,

(8). Or

Where ,

For Grouped Data

(1).


(2).

(3).

(4). Or

Where ,

,

Rank Correlation

Sometimes the actual measurements or counts of individuals or objects are either not available or accurate assessment is not possible. They are then arranged in order according to some characteristic of interest. Such an ordered arrangement is called a ranking, and the order given to an individual or object is called its rank. The correlation between two such sets of rankings is known as rank correlation.

Rank Correlation = $r_s = 1 - \frac{6\sum d^2}{n(n^2 - 1)}$ (Spearman’s Formula)

Where d = difference between the ranks of corresponding values of X and Y

n = number of pairs of values (X, Y) in the data.

Rank Correlation for Tied Ranks

Spearman’s coefficient of rank correlation applies only when no ties are present. In case there are ties in ranks, the ranks are adjusted by assigning the mean of the ranks which the tied objects or observations would have if they were ordered.

Rank Correlation for Tied Ranks = $r_s = 1 - \frac{6\left[\sum d^2 + \frac{1}{12}\sum (t^3 - t)\right]}{n(n^2 - 1)}$

Where t = number of tied values in each group of ties
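A minimal sketch of the ranking step and of Spearman's formula r_s = 1 - 6*sum(d^2) / (n(n^2 - 1)) follows. It assumes rank 1 goes to the largest value and gives tied values the mean of their ranks; it does not apply any further tie correction. The function names and the marks are hypothetical.

```python
def mean_ranks(values):
    """Rank 1 = largest value; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j over the run of values equal to the one at position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1          # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman's formula without tie correction."""
    rx, ry = mean_ranks(xs), mean_ranks(ys)
    n = len(xs)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n * n - 1))


# Hypothetical marks of five students in two subjects (no ties)
marks_x = [86, 75, 92, 60, 70]
marks_y = [78, 80, 95, 55, 72]

r_s = spearman(marks_x, marks_y)
print("ranks X:", mean_ranks(marks_x))
print("ranks Y:", mean_ranks(marks_y))
print("r_s =", round(r_s, 4))
```

Only two students swap places between the two rankings, so the sum of squared rank differences is small and the coefficient is close to +1.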

Multiple Correlation


The multiple correlation coefficient measures the degree of relationship between a variable and a group of variables, where the variable itself is not included in that group, e.g. $R_{1.23}$, $R_{2.13}$, $R_{3.12}$

(1).

Or

(2).

Or

(3).

Or

Where

, ,

Hence $R^2_{1.23}$, $R^2_{2.13}$ and $R^2_{3.12}$ are known as coefficients of multiple determination

Partial Correlation

The correlation between two variables, keeping the effects of all other variables constant, is called partial correlation, for example $r_{12.3}$, $r_{13.2}$, $r_{23.1}$

(1).

Or


(2).

Or

(3).

Or

, ,
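For three variables, both coefficients can be computed from the simple correlation coefficients. The sketch below assumes the standard textbook forms R_{1.23} = sqrt[(r12^2 + r13^2 - 2*r12*r13*r23) / (1 - r23^2)] and r_{12.3} = (r12 - r13*r23) / sqrt[(1 - r13^2)(1 - r23^2)]. The function names and the three input correlations are hypothetical.

```python
from math import sqrt


def multiple_r(r12, r13, r23):
    """Multiple correlation R_{1.23} of X1 with X2 and X3 taken together."""
    return sqrt((r12 ** 2 + r13 ** 2 - 2 * r12 * r13 * r23) / (1 - r23 ** 2))


def partial_r(r12, r13, r23):
    """Partial correlation r_{12.3} of X1 and X2 with X3 held constant."""
    return (r12 - r13 * r23) / sqrt((1 - r13 ** 2) * (1 - r23 ** 2))


# Hypothetical simple correlation coefficients
r12, r13, r23 = 0.8, 0.6, 0.5

R_1_23 = multiple_r(r12, r13, r23)
r_12_3 = partial_r(r12, r13, r23)

print("R_1.23   =", round(R_1_23, 4))
print("R^2_1.23 =", round(R_1_23 ** 2, 4))   # coefficient of multiple determination
print("r_12.3   =", round(r_12_3, 4))
```

The other coefficients follow by permuting the subscripts, for example `multiple_r(r12, r23, r13)` for R_{2.13} and `partial_r(r13, r12, r23)` for r_{13.2}.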

Chapter 7 ANALYSIS OF TIME SERIES

Time Series

An arrangement of data by successive time periods is called a time series. For example, the total monthly sales receipts in a departmental store, the annual yield of a crop in a country for a number of years, the hourly temperature recorded at a locality for a period of years, the weekly prices of wheat in Lahore, the monthly consumption of electricity in a certain town, the monthly total of passengers carried by rail, the quarterly sales of a certain fertilizer, the annual rainfall at Karachi for a number of years, the enrolment of students in a college or university over a number of years, and so forth.

Signal and Noise

Signal: The systematic component of variation in a time series is called the signal.

Noise: The irregular or random component of variation in a time series is called noise.

Analysis of Time Series

The analysis of a time series consists of the description, measurement and isolation of the various components present in the series. This analysis helps economists, businessmen, planners, etc. The value of the time series (Y) is the product of the effects of four components: Trend (T), Cyclical (C), Seasonal (S) and Irregular (I) movements.

$Y = T \times C \times S \times I$

But some statisticians consider that the components of a time series follow an additive law.

$Y = T + C + S + I$

Components (Movements) of a Time Series

A typical time series has four types of movements, usually called the components of a time series.

a. Secular Trend (T)
b. Seasonal Movements or Seasonal Variation (S)
c. Cyclical Movements or Cyclical Variation or Cyclical Fluctuation (C)
d. Irregular, Accidental or Random Movements (I)

a. Secular Trend (T)

These movements refer to the long-term variation which shows a tendency of growth or decline over a long period, approximately 30 to 40 years. They are smooth, steady and regular in nature. For example, a continually increasing demand for food due to population increase, or a decline in the death rate due to advances in science.

b. Seasonal Movements (S)

These movements refer to short-term variations which generally occur due to seasonal effects within a period of one year. Climatic conditions including rainfall, heat and wind directly affect the time series. For example, the demand for woollen clothes increases in winter, the sale of shoes increases before Eid, the price of wheat falls after the harvesting season and rises before the sowing time, the sales of soft drinks are high in summer and low in winter, investments in Savings Certificates are high in the months of May and June and low in other months, and so forth. The concept of seasonal variation is customarily broadened to include the more or less regular fluctuations of shorter duration occurring within a day, a week, a month, a quarter and so forth. Examples of such variations are the daily variations in temperature or the monthly variations in bank deposits.

c. Cyclical Movements or Cyclical Variation or Cyclical Fluctuation (C)

Statistical data in a number of cases show up and down movements periodically. There are swings from prosperity through recession, depression and recovery and back to prosperity, which are known as the four phases of a business cycle (Depression, Revival or Recovery, Prosperity or Boom, Contraction or Recession), a very important example of cyclical movements. These changes are repeated at intervals ranging from 7 to 10 years.

d. Irregular, Accidental or Random Movements (I)

These movements are irregular and unsystematic in nature and happen as a result of abnormal events such as floods, earthquakes, wars, strikes, etc. For example, prices rise during wartime, and the production of industries goes down due to labour strikes.

Analysis of the Secular Trend (T)

There are four methods to measure the secular trend.

1) The Freehand Curve Method
2) The Method of Semi Averages
3) The Method of Moving Averages
4) The Method of Least Squares

1) The Freehand Curve Method

In this method the data are plotted on a graph, measuring the time units (years, months, etc.) along the X-axis and the values of the time series variable along the Y-axis. A trend line or smooth curve is drawn through the graph in such a way that it shows the general tendency of the values. The trend values for different years (or months) are read from the trend line or curve.

2) The Method of Semi Averages

The freehand curve method, as we have seen, depends too much on personal judgment and gives subjective results. Another simple method for measuring the secular trend is the method of semi averages. In this method the data are divided into two equal parts. (If the number of values is odd, either the middle value is left out or the series is divided unequally.) The average of each part is computed and placed against the centre of that part. The two averages are plotted and joined by a line, and the line is extended to cover the whole data. Trend values corresponding to different time periods can be read from this trend line.

3) The Method of Moving Averages

We have seen that the freehand curve method is subjective because it is based too much on individual judgment, and the method of semi averages is appropriate only when the trend is linear. Another simple method, which can also be used to eliminate seasonal, cyclical and irregular movements, is the method of moving averages. In this method we find the simple average successively, taking a specific number of values at a time. For example, to find a 3-year moving average we find the average of the first three values, then drop the first value and include the fourth value. The process is continued till all the values in the series are exhausted. The averages so obtained are placed in the middle of the group for which the average is calculated. When we find a moving average taking an even number of values, the middle of the group lies between two years. In order to make the average coincide with a particular year, we centre the averages by calculating a further 2-year moving average of the even-order moving averages. The averages so obtained are called centred moving averages.
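The procedure just described can be sketched in a few lines. The function names and the six-year series below are hypothetical; the centred version simply takes a 2-term moving average of the even-order moving averages, as in the text.

```python
def moving_average(values, k):
    """k-term simple moving averages, one per window of k consecutive values."""
    return [sum(values[i:i + k]) / k for i in range(len(values) - k + 1)]


def centred_moving_average(values, k):
    """For even k: 2-term moving average of the k-term moving averages."""
    return moving_average(moving_average(values, k), 2)


# Hypothetical yearly values
series = [2, 4, 6, 8, 10, 12]

ma3 = moving_average(series, 3)            # placed against years 2, 3, 4, 5
ma4 = moving_average(series, 4)            # fall between years 2-3, 3-4, 4-5
ma4_centred = centred_moving_average(series, 4)   # placed against years 3 and 4

print("3-year moving averages:        ", ma3)
print("4-year moving averages:        ", ma4)
print("4-year centred moving averages:", ma4_centred)
```

A series of n values yields n - k + 1 moving averages of order k, so some values at each end of the series have no trend value under this method.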

4) The Method of Least Squares

The principle of least squares states that “the sum of squares of the deviations of the observed values from the corresponding estimated values should be least”. In this method a straight line, a second degree parabola or a third degree parabola is fitted to the observed time series by the method of least squares.

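a. Linear trend line

$\hat{Y} = a + bX$

Normal Equations

$\sum Y = na + b\sum X$

$\sum XY = a\sum X + b\sum X^2$

We get the values of “a” and “b” by solving the above equations simultaneously.

b. Second Degree Parabola / Second Degree Trend Line

$\hat{Y} = a + bX + cX^2$

Normal Equations

$\sum Y = na + b\sum X + c\sum X^2$

$\sum XY = a\sum X + b\sum X^2 + c\sum X^3$

$\sum X^2 Y = a\sum X^2 + b\sum X^3 + c\sum X^4$

We get the values of “a”, “b” and “c” by solving the above equations simultaneously.

c. Third Degree Parabola / Third Degree Trend Line

$\hat{Y} = a + bX + cX^2 + dX^3$

Normal Equations

$\sum Y = na + b\sum X + c\sum X^2 + d\sum X^3$

$\sum XY = a\sum X + b\sum X^2 + c\sum X^3 + d\sum X^4$

$\sum X^2 Y = a\sum X^2 + b\sum X^3 + c\sum X^4 + d\sum X^5$

$\sum X^3 Y = a\sum X^3 + b\sum X^4 + c\sum X^5 + d\sum X^6$

We get the values of “a”, “b”, “c” and “d” by solving the above equations simultaneously.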
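For the linear trend, a common shortcut is to code the time variable so that sum(X) = 0. The normal equations sum(Y) = n*a + b*sum(X) and sum(XY) = a*sum(X) + b*sum(X^2) then reduce to a = sum(Y)/n and b = sum(XY)/sum(X^2). The sketch below assumes equally spaced time periods; the function name and the five yearly values are hypothetical.

```python
def linear_trend(ys):
    """Fit Y = a + bX with coded time X measured from the middle of the series."""
    n = len(ys)
    xs = [i - (n - 1) / 2 for i in range(n)]     # coded time, sum(xs) == 0
    a = sum(ys) / n                              # from sum(Y) = n*a
    b = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)
    trend = [a + b * x for x in xs]
    return a, b, trend


# Hypothetical yearly values of a time series
ys = [12, 15, 17, 20, 21]

a, b, trend = linear_trend(ys)
print("a =", a, "b =", round(b, 4))
print("trend values:", [round(t, 2) for t in trend])
print("sum of trend values:", round(sum(trend), 6), "sum of observed values:", sum(ys))
```

For an even number of periods the coded values are half-units (-1.5, -0.5, 0.5, 1.5, ...), which still sum to zero, so the same shortcut applies.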

Chapter 8 SAMPLING AND SAMPLING DISTRIBUTION

Population

The group of all possible elements or objects is called a population, for example the human population or the total number of students in a college. The number of elements in a population is called the size of the population. It is denoted by N.


Finite Population

A population is said to be finite if it consists of a finite or fixed number of elements. For example, all university students in Pakistan, or the weights of all students enrolled at Punjab University.

Infinite Population

A population is said to be infinite if there is no limit to the number of elements. For example, all heights between 2 and 3 meters.

Existent Population

A population which consists of concrete objects is called an existent population.

Hypothetical Population

A population which does not contain concrete objects or items is called a hypothetical population.

Sample

A representative small part of a population is called a sample. The number of elements in a sample is called the sample size. It is denoted by n.

Sampling

The technique of selecting a true sample is called sampling. Sampling is broadly divided into two classes.

a) Probability or Random Sampling
b) Non-probability or Non-random Sampling

a) Probability or Random Sampling

A technique of sampling in which every sampling unit is selected entirely at random, so that every sampling unit has a known chance of selection in the sample, is called probability sampling. Some important probability sampling methods are

1. Simple random sampling
2. Stratified sampling
3. Systematic sampling
4. Cluster sampling
5. Multistage and multiphase sampling

b) Non-probability or Non-random Sampling

In non-probability sampling, the selection of the elements is not based on probability theory; personal judgment plays a significant role in the selection of the sample. Examples of non-probability sampling are

1. Judgment or purposive sampling
2. Quota sampling

Sampling With Replacement (W.R)

Sampling is said to be with replacement if the selected unit is returned to the population before the next unit is selected; thus a sampling unit can be selected more than once. For example, sampling with replacement is just like a prize bond scheme. The number of possible samples of size “n” from a population of size “N” using this technique is $N^n$. If we have a population containing 6 elements and draw all possible samples of size 2 with replacement, the number of possible samples is 36, i.e. $6^2 = 36$.

Sampling Without Replacement (W.O.R)

Sampling is said to be without replacement if the selected unit is not returned to the population before the next unit is selected; thus a sampling unit can never be selected more than once. For example, sampling without replacement is just like a committee system. The number of possible samples of size “n” from a population of size “N” is obtained by using the following formula

No. of Possible samples = $\binom{N}{n}$ = ${}^{N}C_{n}$ = $\frac{N!}{n!(N-n)!}$

If, for example, we have N=5 and n=2, the number of possible samples will be 10, i.e.

$\binom{5}{2}$ = $\frac{5!}{2!\,3!}$ = 10.

Parameter

Numerical values computed from the population are called parameters. These are fixed numbers. They are usually denoted by Greek or capital letters. For example, the population mean $\mu$ and the population standard deviation $\sigma$.

Statistic

Numerical values computed from a sample are called statistics. They vary from sample to sample drawn from the same population. They are denoted by Roman or small letters. For example, the sample mean $\bar{X}$ and the sample standard deviation S.

Sampling Units

The basic elements or objects which we select for a sample are called sampling units. For example, if we want to measure the average height of college students, the college students are the sampling units.

Sampling Frame

The complete list of all possible sampling units is called a “frame”.

Census

A complete enumeration of all the units of a population is termed a census.

Sample Survey

In a sample survey, enumeration is limited to only a part, or a sample, selected from the population.

Preference to Sample Survey Over Complete Survey

We prefer a sample survey to a complete survey due to

1) Reduced cost
2) Greater speed in presenting the results
3) Greater scope of inquiry
4) Greater accuracy

Sampling Error

The difference between a statistic and the corresponding parameter, which arises because only a sample is observed, is called sampling error. It can be reduced by increasing the sample size to a sufficient level.

Sampling Error = $\bar{X} - \mu$

Where $\bar{X}$ = Sample Mean, $\mu$ = Population Mean

Non-Sampling Error

Non-sampling errors are those errors that arise due to a defective sampling frame or information not being provided correctly. For example, income, sales, production, age, etc. are not quoted correctly in most cases.

Sampling Bias

Bias is a cumulative component of error which arises due to defective selection of the sample or negligence of the investigator. Errors due to bias increase with an increase in the size of the sample.

Standard Error

The standard deviation of the sampling distribution of a statistic is called its standard error (abbreviated to S.E).

Sampling Distribution

The frequency distribution of a statistic computed from all possible samples is called its sampling distribution. For example, the sampling distribution of the sample mean or the sampling distribution of the sample variance.

Simple Random Sampling

A technique of sampling in which every sampling unit is selected at random from a homogeneous population, so that every sampling unit has an equal chance of selection in the sample and every part of the population has similar characteristics. For example, selection by a random number table or by the lottery method.

Stratified Sampling

When a population has highly variable material, simple random sampling fails to give accurate results. In this case the heterogeneous population is divided into homogeneous subgroups called strata. A sample is then selected at random separately from each stratum, and these are combined into a single sample. This method is called stratified random sampling.

Systematic Sampling

Systematic sampling is a method of selecting a sample that calls for taking every Kth element in the population. The first unit in the sample is selected at random from the first K units of the population, and then every Kth unit thereafter is included in the sample.
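The selection rule for systematic sampling can be sketched as follows. The function name, the population of 100 numbered units and the fixed seed are assumptions made only so that the example is reproducible.

```python
import random


def systematic_sample(population, k, rng=random):
    """Pick a random start among the first k units, then take every k-th unit."""
    start = rng.randrange(k)          # 0-based position within the first k units
    return population[start::k]


# Hypothetical frame of 100 numbered units, sampling interval K = 10
units = list(range(1, 101))
sample = systematic_sample(units, 10, random.Random(42))

print("sample size:", len(sample))
print("sample:", sample)
```

Only the starting unit is random; once it is chosen, the rest of the sample is fixed by the interval K.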

Cluster Sampling

Cluster sampling is a method of selecting a sample in which the population is divided into natural groups, such as households, agricultural farms, etc., which are called clusters. Taking these clusters as sampling units, a sample is drawn at random.

Quota Sampling

Quota sampling is a method of selecting a sample of convenience with certain controls to avoid some of the more serious biases involved in taking those most conveniently available. In this method quotas are set up, for example, by specifying the number of interviews from urban and rural areas, males and females, etc.

Sampling Distribution

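Population Size = N

Population = X

Sample Size = n

Population Mean = $\mu = \frac{\sum X}{N}$

Population Variance = $\sigma^2 = \frac{\sum (X - \mu)^2}{N}$

Population Standard Deviation = $\sigma = \sqrt{\frac{\sum (X - \mu)^2}{N}}$

Population Proportion = $P = \frac{X}{N}$, where X represents the number of even, odd or otherwise specified elements in the population.

Sample Proportion = $\hat{p} = \frac{x}{n}$

Sample Mean = $\bar{x} = \frac{\sum x}{n}$

Biased Sample Variance = $S^2$ = $\frac{\sum (x - \bar{x})^2}{n}$

Biased Sample Standard Deviation = $S$ = $\sqrt{\frac{\sum (x - \bar{x})^2}{n}}$

Unbiased Sample Variance = $s^2$ = $\frac{\sum (x - \bar{x})^2}{n - 1}$

Unbiased Sample Standard Deviation = $s$ = $\sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$

Samples Drawn With Replacement = $N^n$

Samples Drawn Without Replacement = $\binom{N}{n}$ = ${}^{N}C_{n}$ = $\frac{N!}{n!(N-n)!}$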

Sampling Distribution of Mean

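1) $\mu_{\bar{x}} = \sum \bar{x} f(\bar{x})$ = Mean of the sampling distribution of $\bar{X}$

2) $\sigma^2_{\bar{x}} = \sum \bar{x}^2 f(\bar{x}) - \left[\sum \bar{x} f(\bar{x})\right]^2$ = Variance of the sampling distribution of $\bar{X}$

3) $\sigma_{\bar{x}}$ = Standard Deviation of the sampling distribution of $\bar{X}$

4) Population Mean = $\mu = \frac{\sum X}{N}$

5) Population Variance = $\sigma^2 = \frac{\sum (X - \mu)^2}{N}$

6) Population Standard deviation = $\sigma = \sqrt{\frac{\sum (X - \mu)^2}{N}}$

Verification for With Replacement (W.R)

a. $\mu_{\bar{x}} = \mu$

b. $\sigma^2_{\bar{x}} = \frac{\sigma^2}{n}$

c. $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$

Verification for Without Replacement (W.O.R)

a. $\mu_{\bar{x}} = \mu$

b. $\sigma^2_{\bar{x}} = \frac{\sigma^2}{n} \cdot \frac{N - n}{N - 1}$

c. $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}}$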
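These relations can be verified by listing every possible sample from a small population. The sketch below assumes the usual results: the mean of the sample means equals the population mean, and their variance equals sigma^2/n with replacement and (sigma^2/n)(N - n)/(N - 1) without replacement. The population of four values is hypothetical.

```python
from itertools import combinations, product
from statistics import mean, pvariance

# Hypothetical population and sample size
population = [2, 4, 6, 8]
N, n = len(population), 2

mu = mean(population)              # population mean
sigma2 = pvariance(population)     # population variance (divisor N)

# All possible samples with replacement: N**n of them
means_wr = [mean(s) for s in product(population, repeat=n)]
# All possible samples without replacement: C(N, n) of them
means_wor = [mean(s) for s in combinations(population, n)]

print("number of samples:", len(means_wr), "(W.R)", len(means_wor), "(W.O.R)")
print("W.R  :", mean(means_wr), pvariance(means_wr), "expected", mu, sigma2 / n)
print("W.O.R:", mean(means_wor), pvariance(means_wor),
      "expected", mu, sigma2 / n * (N - n) / (N - 1))
```

The factor (N - n)/(N - 1) is the finite population correction; it approaches 1 when N is large relative to n, so the two cases then nearly coincide.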

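Sampling Distribution of Difference between Two Means ($\bar{X}_1 - \bar{X}_2$)

1) $\mu_{\bar{x}_1 - \bar{x}_2}$ = Mean of the sampling distribution of $\bar{X}_1 - \bar{X}_2$

2) $\sigma^2_{\bar{x}_1 - \bar{x}_2}$ = Variance of the sampling distribution of $\bar{X}_1 - \bar{X}_2$

3) $\sigma_{\bar{x}_1 - \bar{x}_2}$ = Standard Deviation of the sampling distribution of $\bar{X}_1 - \bar{X}_2$

4) Population Mean = $\mu_1$

5) Population Mean = $\mu_2$

6) Population Variance = $\sigma_1^2$

7) Population Variance = $\sigma_2^2$

8) Population Standard deviation = $\sigma_1$

9) Population Standard deviation = $\sigma_2$

Verification for With Replacement (W.R)

a. $\mu_{\bar{x}_1 - \bar{x}_2} = \mu_1 - \mu_2$

b. $\sigma^2_{\bar{x}_1 - \bar{x}_2} = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}$

c. $\sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$

Verification for Without Replacement (W.O.R)

a. $\mu_{\bar{x}_1 - \bar{x}_2} = \mu_1 - \mu_2$

b. $\sigma^2_{\bar{x}_1 - \bar{x}_2} = \frac{\sigma_1^2}{n_1} \cdot \frac{N_1 - n_1}{N_1 - 1} + \frac{\sigma_2^2}{n_2} \cdot \frac{N_2 - n_2}{N_2 - 1}$

c. $\sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{\sigma_1^2}{n_1} \cdot \frac{N_1 - n_1}{N_1 - 1} + \frac{\sigma_2^2}{n_2} \cdot \frac{N_2 - n_2}{N_2 - 1}}$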

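Sampling Distribution of Sample Proportion ($\hat{p}$)

1) $\mu_{\hat{p}}$ = Mean of the sampling distribution of $\hat{p}$

2) $\sigma^2_{\hat{p}}$ = Variance of the sampling distribution of $\hat{p}$

3) $\sigma_{\hat{p}}$ = Standard Deviation of the sampling distribution of $\hat{p}$

4) Population Proportion = $P = \frac{X}{N}$, where X represents the number of even, odd or otherwise specified elements in the population.

Verification for With Replacement (W.R)

a. $\mu_{\hat{p}} = P$

b. $\sigma^2_{\hat{p}} = \frac{PQ}{n}$, where $Q = 1 - P$

c. $\sigma_{\hat{p}} = \sqrt{\frac{PQ}{n}}$

Verification for Without Replacement (W.O.R)

a. $\mu_{\hat{p}} = P$

b. $\sigma^2_{\hat{p}} = \frac{PQ}{n} \cdot \frac{N - n}{N - 1}$

c. $\sigma_{\hat{p}} = \sqrt{\frac{PQ}{n} \cdot \frac{N - n}{N - 1}}$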

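Sampling Distribution of Difference between Two Proportions ($\hat{p}_1 - \hat{p}_2$)

1) $\mu_{\hat{p}_1 - \hat{p}_2}$ = Mean of the sampling distribution of $\hat{p}_1 - \hat{p}_2$

2) $\sigma^2_{\hat{p}_1 - \hat{p}_2}$ = Variance of the sampling distribution of $\hat{p}_1 - \hat{p}_2$

3) $\sigma_{\hat{p}_1 - \hat{p}_2}$ = Standard Deviation of the sampling distribution of $\hat{p}_1 - \hat{p}_2$

4) Population Proportion = $P_1 = \frac{X_1}{N_1}$

Where $X_1$ represents the number of even, odd or otherwise specified elements in the first population

5) Population Proportion = $P_2 = \frac{X_2}{N_2}$

Where $X_2$ represents the number of even, odd or otherwise specified elements in the second population

Verification for With Replacement (W.R)

a. $\mu_{\hat{p}_1 - \hat{p}_2} = P_1 - P_2$

b. $\sigma^2_{\hat{p}_1 - \hat{p}_2} = \frac{P_1 Q_1}{n_1} + \frac{P_2 Q_2}{n_2}$, where $Q_1 = 1 - P_1$, $Q_2 = 1 - P_2$

c. $\sigma_{\hat{p}_1 - \hat{p}_2} = \sqrt{\frac{P_1 Q_1}{n_1} + \frac{P_2 Q_2}{n_2}}$

Verification for Without Replacement (W.O.R)

a. $\mu_{\hat{p}_1 - \hat{p}_2} = P_1 - P_2$

b. $\sigma^2_{\hat{p}_1 - \hat{p}_2} = \frac{P_1 Q_1}{n_1} \cdot \frac{N_1 - n_1}{N_1 - 1} + \frac{P_2 Q_2}{n_2} \cdot \frac{N_2 - n_2}{N_2 - 1}$

c. $\sigma_{\hat{p}_1 - \hat{p}_2} = \sqrt{\frac{P_1 Q_1}{n_1} \cdot \frac{N_1 - n_1}{N_1 - 1} + \frac{P_2 Q_2}{n_2} \cdot \frac{N_2 - n_2}{N_2 - 1}}$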

Sampling Distribution of Biased Variance ( )

1) Mean of the sampling distribution of

2) Variance of the sampling distribution of

3) =Standard Deviation of the sampling distribution of

4) Population Mean =

5) Population Variance =

6) Population Standard deviation =

Verification

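Sampling Distribution of Un-Biased Variance ($s^2$)

Here $s^2 = \frac{\sum (x - \bar{x})^2}{n-1}$ for each sample and $f$ is the frequency of each value of $s^2$.

1) Mean of the sampling distribution of $s^2$: $\mu_{s^2} = \frac{\sum f s^2}{\sum f}$

2) Variance of the sampling distribution of $s^2$: $\sigma^2_{s^2} = \frac{\sum f (s^2)^2}{\sum f} - \left(\frac{\sum f s^2}{\sum f}\right)^2$

3) $\sigma_{s^2}$ = Standard Deviation of the sampling distribution of $s^2$ = $\sqrt{\sigma^2_{s^2}}$

4) Population Mean = $\mu = \frac{\sum X}{N}$

5) Population Variance = $\sigma^2 = \frac{\sum X^2}{N} - \left(\frac{\sum X}{N}\right)^2$

6) Population Standard deviation = $\sigma = \sqrt{\sigma^2}$

Verification (sampling with replacement): $\mu_{s^2} = \sigma^2$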
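For sampling with replacement, the mean of the biased sample variance (divisor $n$) equals $\frac{n-1}{n}\sigma^2$, while the mean of the unbiased sample variance (divisor $n-1$) equals $\sigma^2$. The sketch below checks both facts by listing every sample of size 2 from a small made-up population; the names `biased_var` and `unbiased_var` are illustrative only.

```python
from fractions import Fraction
from itertools import product

population = [Fraction(v) for v in (1, 3, 5)]  # hypothetical population, N = 3
N, n = len(population), 2
mu = sum(population) / N
sigma2 = sum((x - mu) ** 2 for x in population) / N


def biased_var(sample):
    """Sample variance with divisor n."""
    m = sum(sample) / len(sample)
    return sum((x - m) ** 2 for x in sample) / len(sample)


def unbiased_var(sample):
    """Sample variance with divisor n - 1."""
    m = sum(sample) / len(sample)
    return sum((x - m) ** 2 for x in sample) / (len(sample) - 1)


samples = list(product(population, repeat=n))  # all samples with replacement
mean_biased = sum(biased_var(s) for s in samples) / len(samples)
mean_unbiased = sum(unbiased_var(s) for s in samples) / len(samples)

print(mean_biased == Fraction(n - 1, n) * sigma2)
print(mean_unbiased == sigma2)
```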

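Chapter 9 Estimation

Confidence Interval for Population Mean With Replacement (Z-Test)

When Population Standard Deviation ($\sigma$) is known

$\bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$

Or

$\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$

When Population Standard Deviation ($\sigma$) is unknown & n>30

$\bar{x} \pm z_{\alpha/2}\frac{s}{\sqrt{n}}$

Or

$\bar{x} - z_{\alpha/2}\frac{s}{\sqrt{n}} < \mu < \bar{x} + z_{\alpha/2}\frac{s}{\sqrt{n}}$

Confidence Interval for Population Mean Without Replacement (Z-Test)

When Population Standard Deviation ($\sigma$) is known

$\bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}}$

Or

$\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} < \mu < \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}}$

When Population Standard Deviation ($\sigma$) is unknown & n>30

$\bar{x} \pm z_{\alpha/2}\frac{s}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}}$

Or

$\bar{x} - z_{\alpha/2}\frac{s}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} < \mu < \bar{x} + z_{\alpha/2}\frac{s}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}}$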
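A z-based confidence interval for the population mean has the form $\bar{x} \pm z_{\alpha/2}\,\sigma/\sqrt{n}$, with the standard error multiplied by $\sqrt{(N-n)/(N-1)}$ when sampling is without replacement. The following is a minimal sketch of that calculation; the function name `z_interval_mean` and the numbers in the usage lines are assumptions for illustration, not values from these notes.

```python
from math import sqrt
from statistics import NormalDist


def z_interval_mean(xbar, sd, n, confidence=0.95, N=None):
    """Return (lower, upper) limits for the population mean.

    sd is sigma when it is known, or the sample s when n > 30.
    Pass the population size N to apply the without-replacement correction.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # z_{alpha/2}
    se = sd / sqrt(n)
    if N is not None:
        se *= sqrt((N - n) / (N - 1))  # finite population correction
    return xbar - z * se, xbar + z * se


low, high = z_interval_mean(xbar=50.0, sd=10.0, n=100)
low_wor, high_wor = z_interval_mean(xbar=50.0, sd=10.0, n=100, N=1000)
print(round(low, 2), round(high, 2))
print(round(low_wor, 2), round(high_wor, 2))
```

The without-replacement interval is narrower because the correction factor is less than 1 whenever n > 1.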

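Confidence Interval for Difference Between Population Means ($\mu_1 - \mu_2$) With Replacement (Z-Test)

When population S.D ($\sigma_1, \sigma_2$) is known

$(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$

Or

$(\bar{x}_1 - \bar{x}_2) - z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} < \mu_1 - \mu_2 < (\bar{x}_1 - \bar{x}_2) + z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$

When population S.D ($\sigma_1, \sigma_2$) is unknown & $n_1, n_2$ >30

$(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$

Or

$(\bar{x}_1 - \bar{x}_2) - z_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} < \mu_1 - \mu_2 < (\bar{x}_1 - \bar{x}_2) + z_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$

Confidence Interval for Difference Between Population Means ($\mu_2 - \mu_1$) With Replacement (Z-Test)

When population S.D ($\sigma_1, \sigma_2$) is known

$(\bar{x}_2 - \bar{x}_1) \pm z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$

Or

$(\bar{x}_2 - \bar{x}_1) - z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} < \mu_2 - \mu_1 < (\bar{x}_2 - \bar{x}_1) + z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$

When population S.D ($\sigma_1, \sigma_2$) is unknown & $n_1, n_2$ >30

$(\bar{x}_2 - \bar{x}_1) \pm z_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$

Or

$(\bar{x}_2 - \bar{x}_1) - z_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} < \mu_2 - \mu_1 < (\bar{x}_2 - \bar{x}_1) + z_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$

Confidence Interval for Difference Between Population Means ($\mu_1 - \mu_2$) Without Replacement (Z-Test)

When population S.D ($\sigma_1, \sigma_2$) is known

$(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1}\cdot\frac{N_1-n_1}{N_1-1} + \frac{\sigma_2^2}{n_2}\cdot\frac{N_2-n_2}{N_2-1}}$

Or

$(\bar{x}_1 - \bar{x}_2) - z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1}\cdot\frac{N_1-n_1}{N_1-1} + \frac{\sigma_2^2}{n_2}\cdot\frac{N_2-n_2}{N_2-1}} < \mu_1 - \mu_2 < (\bar{x}_1 - \bar{x}_2) + z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1}\cdot\frac{N_1-n_1}{N_1-1} + \frac{\sigma_2^2}{n_2}\cdot\frac{N_2-n_2}{N_2-1}}$

When population S.D ($\sigma_1, \sigma_2$) is unknown & $n_1, n_2$ >30

$(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\sqrt{\frac{s_1^2}{n_1}\cdot\frac{N_1-n_1}{N_1-1} + \frac{s_2^2}{n_2}\cdot\frac{N_2-n_2}{N_2-1}}$

Or

$(\bar{x}_1 - \bar{x}_2) - z_{\alpha/2}\sqrt{\frac{s_1^2}{n_1}\cdot\frac{N_1-n_1}{N_1-1} + \frac{s_2^2}{n_2}\cdot\frac{N_2-n_2}{N_2-1}} < \mu_1 - \mu_2 < (\bar{x}_1 - \bar{x}_2) + z_{\alpha/2}\sqrt{\frac{s_1^2}{n_1}\cdot\frac{N_1-n_1}{N_1-1} + \frac{s_2^2}{n_2}\cdot\frac{N_2-n_2}{N_2-1}}$

Confidence Interval for Difference Between Population Means ($\mu_2 - \mu_1$) Without Replacement (Z-Test)

When population S.D ($\sigma_1, \sigma_2$) is known

$(\bar{x}_2 - \bar{x}_1) \pm z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1}\cdot\frac{N_1-n_1}{N_1-1} + \frac{\sigma_2^2}{n_2}\cdot\frac{N_2-n_2}{N_2-1}}$

Or

$(\bar{x}_2 - \bar{x}_1) - z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1}\cdot\frac{N_1-n_1}{N_1-1} + \frac{\sigma_2^2}{n_2}\cdot\frac{N_2-n_2}{N_2-1}} < \mu_2 - \mu_1 < (\bar{x}_2 - \bar{x}_1) + z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1}\cdot\frac{N_1-n_1}{N_1-1} + \frac{\sigma_2^2}{n_2}\cdot\frac{N_2-n_2}{N_2-1}}$

When population S.D ($\sigma_1, \sigma_2$) is unknown & $n_1, n_2$ >30

$(\bar{x}_2 - \bar{x}_1) \pm z_{\alpha/2}\sqrt{\frac{s_1^2}{n_1}\cdot\frac{N_1-n_1}{N_1-1} + \frac{s_2^2}{n_2}\cdot\frac{N_2-n_2}{N_2-1}}$

Or

$(\bar{x}_2 - \bar{x}_1) - z_{\alpha/2}\sqrt{\frac{s_1^2}{n_1}\cdot\frac{N_1-n_1}{N_1-1} + \frac{s_2^2}{n_2}\cdot\frac{N_2-n_2}{N_2-1}} < \mu_2 - \mu_1 < (\bar{x}_2 - \bar{x}_1) + z_{\alpha/2}\sqrt{\frac{s_1^2}{n_1}\cdot\frac{N_1-n_1}{N_1-1} + \frac{s_2^2}{n_2}\cdot\frac{N_2-n_2}{N_2-1}}$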

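Confidence Interval for Population Proportion (Z-Test)

$\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$ where $\hat{p} = \frac{x}{n}$ and $\hat{q} = 1 - \hat{p}$

Or

$\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} < P < \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$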
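The large-sample interval for a population proportion is $\hat{p} \pm z_{\alpha/2}\sqrt{\hat{p}\hat{q}/n}$ with $\hat{q} = 1 - \hat{p}$. A minimal sketch of the calculation follows; the function name and the counts used (40 successes in 100 trials) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def z_interval_proportion(x, n, confidence=0.95):
    """Return (lower, upper) limits for the population proportion P."""
    p_hat = x / n
    q_hat = 1 - p_hat
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # z_{alpha/2}
    margin = z * sqrt(p_hat * q_hat / n)
    return p_hat - margin, p_hat + margin


p_low, p_high = z_interval_proportion(x=40, n=100)
print(round(p_low, 3), round(p_high, 3))
```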

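Confidence Interval for Difference Between Two Population Proportions ($P_1 - P_2$) (Z-Test)

$(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}}$

Or

$(\hat{p}_1 - \hat{p}_2) - z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}} < P_1 - P_2 < (\hat{p}_1 - \hat{p}_2) + z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}}$

Confidence Interval for Difference Between Two Population Proportions ($P_2 - P_1$) (Z-Test)

$(\hat{p}_2 - \hat{p}_1) \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}}$

Or

$(\hat{p}_2 - \hat{p}_1) - z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}} < P_2 - P_1 < (\hat{p}_2 - \hat{p}_1) + z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}}$

Confidence Interval Estimate for Population Correlation Coefficient (Z-Test)

$z_f \pm z_{\alpha/2}\frac{1}{\sqrt{n-3}}$ where $z_f = 1.1513\log_{10}\frac{1+r}{1-r}$

Or

$z_f - z_{\alpha/2}\frac{1}{\sqrt{n-3}} < \mu_z < z_f + z_{\alpha/2}\frac{1}{\sqrt{n-3}}$ where $\mu_z = 1.1513\log_{10}\frac{1+\rho}{1-\rho}$; the limits for $\rho$ are found by converting the limits for $\mu_z$ back.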
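One common way to build an interval for the population correlation coefficient is Fisher's z transformation: $z_f = \frac{1}{2}\ln\frac{1+r}{1-r}$ (which equals $1.1513\log_{10}\frac{1+r}{1-r}$) is approximately normal with standard error $1/\sqrt{n-3}$. The sketch below assumes that approach; the function name and the sample values r = 0.5, n = 28 are made up for illustration.

```python
from math import atanh, sqrt, tanh
from statistics import NormalDist


def fisher_interval(r, n, confidence=0.95):
    """Return (lower, upper) limits for rho using Fisher's z transformation."""
    z_f = atanh(r)  # same as 0.5 * ln((1 + r) / (1 - r))
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # z_{alpha/2}
    half_width = z / sqrt(n - 3)
    # Limits are computed on the transformed scale, then converted back.
    return tanh(z_f - half_width), tanh(z_f + half_width)


rho_low, rho_high = fisher_interval(r=0.5, n=28)
print(round(rho_low, 3), round(rho_high, 3))
```

The interval is not symmetric about r because the back-transformation is non-linear.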

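Chapter 10 Hypothesis Testing

Testing of Hypotheses concerning the Population Mean (Z-Test)

1. Null & Alternative Hypotheses

Null $H_0: \mu = \mu_0$

Alternative $H_1: \mu \neq \mu_0$ (or $\mu > \mu_0$, or $\mu < \mu_0$)

2. Significance Level

$\alpha$ = 5 % / 1 % or 0.05 / 0.01. If the significance level is not given then we take 5% by default.

3. Critical Region (C.R)

For Two Tail Test ($\neq$)

If Alternative $H_1: \mu \neq \mu_0$

C.R = $Z \le -z_{\alpha/2}$ or $Z \ge z_{\alpha/2}$

For One Tail Test ($>$)

If Alternative $H_1: \mu > \mu_0$

C.R = $Z \ge z_{\alpha}$

For One Tail Test ($<$)

If Alternative $H_1: \mu < \mu_0$

C.R = $Z \le -z_{\alpha}$

4. Test Statistics

When population S.D ($\sigma$) is known

Z = $\frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$

When population S.D ($\sigma$) is unknown & n>30

Z = $\frac{\bar{x} - \mu_0}{s/\sqrt{n}}$

5. Conclusion

If z-cal falls in the critical region (z-cal is greater than or equal to z-tab in the direction of the alternative), $H_0$ is rejected.

If z-cal does not fall in the critical region (z-cal is less than z-tab), $H_0$ is accepted.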
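The five steps of the z-test for a population mean can be collected into one small function: compute $Z = (\bar{x} - \mu_0)/(\sigma/\sqrt{n})$, find the tabulated value for the chosen $\alpha$, and check whether the calculated value falls in the critical region. This is a sketch only; the function name, the `alternative` labels and the numbers in the usage lines are assumptions for illustration.

```python
from math import sqrt
from statistics import NormalDist


def z_test_mean(xbar, mu0, sd, n, alpha=0.05, alternative="two-sided"):
    """Return (z_cal, z_tab, reject) for H0: mu = mu0.

    sd is sigma when it is known, or the sample s when n > 30.
    alternative is "two-sided", "greater" or "less".
    """
    z_cal = (xbar - mu0) / (sd / sqrt(n))
    normal = NormalDist()
    if alternative == "two-sided":
        z_tab = normal.inv_cdf(1 - alpha / 2)
        reject = abs(z_cal) >= z_tab
    elif alternative == "greater":
        z_tab = normal.inv_cdf(1 - alpha)
        reject = z_cal >= z_tab
    elif alternative == "less":
        z_tab = normal.inv_cdf(1 - alpha)
        reject = z_cal <= -z_tab
    else:
        raise ValueError("alternative must be 'two-sided', 'greater' or 'less'")
    return z_cal, z_tab, reject


z_cal, z_tab, reject = z_test_mean(xbar=52.0, mu0=50.0, sd=10.0, n=100)
print(round(z_cal, 2), round(z_tab, 2), reject)
_, _, reject_less = z_test_mean(52.0, 50.0, 10.0, 100, alternative="less")
print(reject_less)
```

With these numbers the two-tail test rejects $H_0$ at the 5% level, while the lower-tail test does not, because the calculated value lies in the wrong tail.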

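Testing of Hypotheses concerning the difference between two Population Means ($\mu_1 - \mu_2$) (Z-Test)

1. Null & Alternative Hypotheses

Null $H_0: \mu_1 - \mu_2 = \Delta_0$, where $\Delta_0$ is the hypothesized difference

Alternative $H_1: \mu_1 - \mu_2 \neq \Delta_0$ (or $> \Delta_0$, or $< \Delta_0$)

2. Significance Level

$\alpha$ = 5 % / 1 % or 0.05 / 0.01. If the significance level is not given then we take 5% by default.

3. Critical Region (C.R)

For Two Tail Test ($\neq$)

If Alternative $H_1: \mu_1 - \mu_2 \neq \Delta_0$

C.R = $Z \le -z_{\alpha/2}$ or $Z \ge z_{\alpha/2}$

For One Tail Test ($>$)

If Alternative $H_1: \mu_1 - \mu_2 > \Delta_0$

C.R = $Z \ge z_{\alpha}$

For One Tail Test ($<$)

If Alternative $H_1: \mu_1 - \mu_2 < \Delta_0$

C.R = $Z \le -z_{\alpha}$

4. Test Statistics

When population S.D ($\sigma_1, \sigma_2$) is known

Z = $\frac{(\bar{x}_1 - \bar{x}_2) - \Delta_0}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$

When population S.D ($\sigma_1, \sigma_2$) is unknown & $n_1, n_2$ >30

Z = $\frac{(\bar{x}_1 - \bar{x}_2) - \Delta_0}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$

If $\Delta_0$ is not given in the question then we take $\Delta_0$ = 0.

5. Conclusion

If z-cal falls in the critical region (z-cal is greater than or equal to z-tab in the direction of the alternative), $H_0$ is rejected.

If z-cal does not fall in the critical region (z-cal is less than z-tab), $H_0$ is accepted.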

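Testing of Hypotheses concerning the difference between two Population Means ($\mu_2 - \mu_1$) (Z-Test)

1. Null & Alternative Hypotheses

Null $H_0: \mu_2 - \mu_1 = \Delta_0$, where $\Delta_0$ is the hypothesized difference

Alternative $H_1: \mu_2 - \mu_1 \neq \Delta_0$ (or $> \Delta_0$, or $< \Delta_0$)

2. Significance Level

$\alpha$ = 5 % / 1 % or 0.05 / 0.01. If the significance level is not given then we take 5% by default.

3. Critical Region (C.R)

For Two Tail Test ($\neq$)

If Alternative $H_1: \mu_2 - \mu_1 \neq \Delta_0$

C.R = $Z \le -z_{\alpha/2}$ or $Z \ge z_{\alpha/2}$

For One Tail Test ($>$)

If Alternative $H_1: \mu_2 - \mu_1 > \Delta_0$

C.R = $Z \ge z_{\alpha}$

For One Tail Test ($<$)

If Alternative $H_1: \mu_2 - \mu_1 < \Delta_0$

C.R = $Z \le -z_{\alpha}$

4. Test Statistics

When population S.D ($\sigma_1, \sigma_2$) is known

Z = $\frac{(\bar{x}_2 - \bar{x}_1) - \Delta_0}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$

When population S.D ($\sigma_1, \sigma_2$) is unknown & $n_1, n_2$ >30

Z = $\frac{(\bar{x}_2 - \bar{x}_1) - \Delta_0}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$

If $\Delta_0$ is not given in the question then we take $\Delta_0$ = 0.

5. Conclusion

If z-cal falls in the critical region (z-cal is greater than or equal to z-tab in the direction of the alternative), $H_0$ is rejected.

If z-cal does not fall in the critical region (z-cal is less than z-tab), $H_0$ is accepted.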

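Testing of Hypotheses concerning the Population Proportion (Z-Test)

1. Null & Alternative Hypotheses

Null $H_0: P = P_0$

Alternative $H_1: P \neq P_0$ (or $P > P_0$, or $P < P_0$)

2. Significance Level

$\alpha$ = 5 % / 1 % or 0.05 / 0.01. If the significance level is not given then we take 5% by default.

3. Critical Region (C.R)

For Two Tail Test ($\neq$)

If Alternative $H_1: P \neq P_0$

C.R = $Z \le -z_{\alpha/2}$ or $Z \ge z_{\alpha/2}$

For One Tail Test ($>$)

If Alternative $H_1: P > P_0$

C.R = $Z \ge z_{\alpha}$

For One Tail Test ($<$)

If Alternative $H_1: P < P_0$

C.R = $Z \le -z_{\alpha}$

4. Test Statistics

Z = $\frac{\hat{p} - P_0}{\sqrt{\frac{P_0 Q_0}{n}}}$ where $\hat{p} = \frac{X}{n}$ and $Q_0 = 1 - P_0$

Or

Z = $\frac{X - nP_0}{\sqrt{nP_0 Q_0}}$

5. Conclusion

If z-cal falls in the critical region (z-cal is greater than or equal to z-tab in the direction of the alternative), $H_0$ is rejected.

If z-cal does not fall in the critical region (z-cal is less than z-tab), $H_0$ is accepted.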
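For a single proportion the test statistic can be written on the proportion scale, $Z = (\hat{p} - P_0)/\sqrt{P_0 Q_0/n}$, or on the count scale, $Z = (X - nP_0)/\sqrt{nP_0 Q_0}$; the two are algebraically identical. The sketch below computes both forms, without a continuity correction, for hypothetical data (60 successes in 100 trials, $P_0 = 0.5$).

```python
from math import sqrt
from statistics import NormalDist


def z_proportion(x, n, p0):
    """Return the test statistic in proportion form and in count form."""
    q0 = 1 - p0
    z_from_proportion = (x / n - p0) / sqrt(p0 * q0 / n)
    z_from_count = (x - n * p0) / sqrt(n * p0 * q0)
    return z_from_proportion, z_from_count


z_prop, z_count = z_proportion(x=60, n=100, p0=0.5)
z_tab_two_tail = NormalDist().inv_cdf(1 - 0.05 / 2)  # two-tail value at alpha = 0.05
reject_h0 = abs(z_prop) >= z_tab_two_tail
print(round(z_prop, 2), round(z_count, 2), reject_h0)
```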

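Testing of Hypotheses concerning the difference between two Population Proportions ($P_1 - P_2$) (Z-Test)

1. Null & Alternative Hypotheses

Null $H_0: P_1 - P_2 = \Delta_0$, where $\Delta_0$ is the hypothesized difference

Alternative $H_1: P_1 - P_2 \neq \Delta_0$ (or $> \Delta_0$, or $< \Delta_0$)

2. Significance Level

$\alpha$ = 5 % / 1 % or 0.05 / 0.01. If the significance level is not given then we take 5% by default.

3. Critical Region (C.R)

For Two Tail Test ($\neq$)

If Alternative $H_1: P_1 - P_2 \neq \Delta_0$

C.R = $Z \le -z_{\alpha/2}$ or $Z \ge z_{\alpha/2}$

For One Tail Test ($>$)

If Alternative $H_1: P_1 - P_2 > \Delta_0$

C.R = $Z \ge z_{\alpha}$

For One Tail Test ($<$)

If Alternative $H_1: P_1 - P_2 < \Delta_0$

C.R = $Z \le -z_{\alpha}$

4. Test Statistics

Z = $\frac{(\hat{p}_1 - \hat{p}_2) - \Delta_0}{\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}}}$

Or, when $\Delta_0 = 0$, using the pooled proportion $\hat{p}_c = \frac{x_1 + x_2}{n_1 + n_2}$ and $\hat{q}_c = 1 - \hat{p}_c$

Z = $\frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}_c\hat{q}_c\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$

If $\Delta_0$ is not given in the question then we take $\Delta_0$ = 0.

5. Conclusion

If z-cal falls in the critical region (z-cal is greater than or equal to z-tab in the direction of the alternative), $H_0$ is rejected.

If z-cal does not fall in the critical region (z-cal is less than z-tab), $H_0$ is accepted.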

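Testing of Hypotheses concerning the difference between two Population Proportions ($P_2 - P_1$) (Z-Test)

1. Null & Alternative Hypotheses

Null $H_0: P_2 - P_1 = \Delta_0$, where $\Delta_0$ is the hypothesized difference

Alternative $H_1: P_2 - P_1 \neq \Delta_0$ (or $> \Delta_0$, or $< \Delta_0$)

2. Significance Level

$\alpha$ = 5 % / 1 % or 0.05 / 0.01. If the significance level is not given then we take 5% by default.

3. Critical Region (C.R)

For Two Tail Test ($\neq$)

If Alternative $H_1: P_2 - P_1 \neq \Delta_0$

C.R = $Z \le -z_{\alpha/2}$ or $Z \ge z_{\alpha/2}$

For One Tail Test ($>$)

If Alternative $H_1: P_2 - P_1 > \Delta_0$

C.R = $Z \ge z_{\alpha}$

For One Tail Test ($<$)

If Alternative $H_1: P_2 - P_1 < \Delta_0$

C.R = $Z \le -z_{\alpha}$

4. Test Statistics

Z = $\frac{(\hat{p}_2 - \hat{p}_1) - \Delta_0}{\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}}}$

Or, when $\Delta_0 = 0$, using the pooled proportion $\hat{p}_c = \frac{x_1 + x_2}{n_1 + n_2}$ and $\hat{q}_c = 1 - \hat{p}_c$

Z = $\frac{\hat{p}_2 - \hat{p}_1}{\sqrt{\hat{p}_c\hat{q}_c\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$

If $\Delta_0$ is not given in the question then we take $\Delta_0$ = 0.

5. Conclusion

If z-cal falls in the critical region (z-cal is greater than or equal to z-tab in the direction of the alternative), $H_0$ is rejected.

If z-cal does not fall in the critical region (z-cal is less than z-tab), $H_0$ is accepted.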

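Testing of Hypotheses concerning the Population Correlation Coefficient when ($\rho = \rho_0$ or $\rho \neq 0$) (Z-Test)

1. Null & Alternative Hypotheses

Null $H_0: \rho = \rho_0$

Alternative $H_1: \rho \neq \rho_0$ (or $\rho > \rho_0$, or $\rho < \rho_0$)

2. Significance Level

$\alpha$ = 5 % / 1 % or 0.05 / 0.01. If the significance level is not given then we take 5% by default.

3. Critical Region (C.R)

For Two Tail Test ($\neq$)

If Alternative $H_1: \rho \neq \rho_0$

C.R = $Z \le -z_{\alpha/2}$ or $Z \ge z_{\alpha/2}$

For One Tail Test ($>$)

If Alternative $H_1: \rho > \rho_0$

C.R = $Z \ge z_{\alpha}$

For One Tail Test ($<$)

If Alternative $H_1: \rho < \rho_0$

C.R = $Z \le -z_{\alpha}$

4. Test Statistics

Z = $\frac{z_f - \mu_z}{1/\sqrt{n-3}}$ where $z_f = 1.1513\log_{10}\frac{1+r}{1-r}$ and $\mu_z = 1.1513\log_{10}\frac{1+\rho_0}{1-\rho_0}$

5. Conclusion

If z-cal falls in the critical region (z-cal is greater than or equal to z-tab in the direction of the alternative), $H_0$ is rejected.

If z-cal does not fall in the critical region (z-cal is less than z-tab), $H_0$ is accepted.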

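Testing of Hypotheses concerning the difference between two Population Correlation Coefficients ($\rho_1 - \rho_2$) (Z-Test)

1. Null & Alternative Hypotheses

Null $H_0: \rho_1 = \rho_2$

Alternative $H_1: \rho_1 \neq \rho_2$ (or $\rho_1 > \rho_2$, or $\rho_1 < \rho_2$)

2. Significance Level

$\alpha$ = 5 % / 1 % or 0.05 / 0.01. If the significance level is not given then we take 5% by default.

3. Critical Region (C.R)

For Two Tail Test ($\neq$)

If Alternative $H_1: \rho_1 \neq \rho_2$

C.R = $Z \le -z_{\alpha/2}$ or $Z \ge z_{\alpha/2}$

For One Tail Test ($>$)

If Alternative $H_1: \rho_1 > \rho_2$

C.R = $Z \ge z_{\alpha}$

For One Tail Test ($<$)

If Alternative $H_1: \rho_1 < \rho_2$

C.R = $Z \le -z_{\alpha}$

4. Test Statistics

Z = $\frac{(z_1 - z_2) - (\mu_{z_1} - \mu_{z_2})}{\sqrt{\frac{1}{n_1-3} + \frac{1}{n_2-3}}}$ where $z_i = 1.1513\log_{10}\frac{1+r_i}{1-r_i}$ and $\mu_{z_i} = 1.1513\log_{10}\frac{1+\rho_i}{1-\rho_i}$

If $\mu_{z_1} - \mu_{z_2}$ is not given in the question then we take $\mu_{z_1} - \mu_{z_2}$ = 0.

5. Conclusion

If z-cal falls in the critical region (z-cal is greater than or equal to z-tab in the direction of the alternative), $H_0$ is rejected.

If z-cal does not fall in the critical region (z-cal is less than z-tab), $H_0$ is accepted.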

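Testing of Hypotheses concerning the difference between two Population Correlation Coefficients ($\rho_2 - \rho_1$) (Z-Test)

1. Null & Alternative Hypotheses

Null $H_0: \rho_2 = \rho_1$

Alternative $H_1: \rho_2 \neq \rho_1$ (or $\rho_2 > \rho_1$, or $\rho_2 < \rho_1$)

2. Significance Level

$\alpha$ = 5 % / 1 % or 0.05 / 0.01. If the significance level is not given then we take 5% by default.

3. Critical Region (C.R)

For Two Tail Test ($\neq$)

If Alternative $H_1: \rho_2 \neq \rho_1$

C.R = $Z \le -z_{\alpha/2}$ or $Z \ge z_{\alpha/2}$

For One Tail Test ($>$)

If Alternative $H_1: \rho_2 > \rho_1$

C.R = $Z \ge z_{\alpha}$

For One Tail Test ($<$)

If Alternative $H_1: \rho_2 < \rho_1$

C.R = $Z \le -z_{\alpha}$

4. Test Statistics

Z = $\frac{(z_2 - z_1) - (\mu_{z_2} - \mu_{z_1})}{\sqrt{\frac{1}{n_1-3} + \frac{1}{n_2-3}}}$ where $z_i = 1.1513\log_{10}\frac{1+r_i}{1-r_i}$ and $\mu_{z_i} = 1.1513\log_{10}\frac{1+\rho_i}{1-\rho_i}$

If $\mu_{z_2} - \mu_{z_1}$ is not given in the question then we take $\mu_{z_2} - \mu_{z_1}$ = 0.

5. Conclusion

If z-cal falls in the critical region (z-cal is greater than or equal to z-tab in the direction of the alternative), $H_0$ is rejected.

If z-cal does not fall in the critical region (z-cal is less than z-tab), $H_0$ is accepted.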

Source of Variation (S.O.V) | Degrees of Freedom (d.f) | Sum of Squares (S.O.S) | Mean Square (M.S) | F-Distribution
Treatment (Sample) | k-1 | Treatment (Column) S.S | Treatment M.S = Treatment S.S / (k-1) | F = Treatment M.S / Error M.S
Error | n-k | E.S.S = T.S.S - Treatment (Column) S.S | Error M.S = E.S.S / (n-k) |
Total | n-1 | T.S.S | |

1. Conclusion

If F-cal is greater than or equal to F-tab, $H_0$ is rejected.

If F-cal is less than F-tab, $H_0$ is accepted.

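Chapter 11 Analysis of Variance

Analysis of Variance Two way Classification or Two-Factor Experiment

1. Null & Alternative Hypotheses

Row

Null $H_0: \mu_{1.} = \mu_{2.} = \dots = \mu_{r.}$

Alternative $H_1$: At least two means are not equal.

Column

Null $H_0': \mu_{.1} = \mu_{.2} = \dots = \mu_{.k}$

Alternative $H_1'$: At least two means are not equal.

2. Significance Level

$\alpha$ = 5 % / 1 % or 0.05 / 0.01. If the significance level is not given then we take 5% by default.

3. Critical Region (C.R)

Row C.R = $F_1 \ge F_{\alpha}[(r-1),\,(r-1)(k-1)]$

Column C.R = $F_2 \ge F_{\alpha}[(k-1),\,(r-1)(k-1)]$

4. Test Statistics

The data are arranged in r rows and k columns, so the total number of observations is n = rk.

Row / Column | A1 | A2 | A3 | ... | Ak | Row Total
B1 | X11 | X12 | X13 | ... | X1k | $T_{1.}$
B2 | X21 | X22 | X23 | ... | X2k | $T_{2.}$
B3 | X31 | X32 | X33 | ... | X3k | $T_{3.}$
... | ... | ... | ... | ... | ... | ...
Br | Xr1 | Xr2 | Xr3 | ... | Xrk | $T_{r.}$
Column Total | $T_{.1}$ | $T_{.2}$ | $T_{.3}$ | ... | $T_{.k}$ | Grand Total $T_{..}$

Row Total = $T_{i.} = \sum_{j} X_{ij}$

Column Total = $T_{.j} = \sum_{i} X_{ij}$

Grand Total = $T_{..} = \sum_{i}\sum_{j} X_{ij}$

I. Correction Factor = C.F = $\frac{T_{..}^2}{n}$ = $\frac{T_{..}^2}{rk}$

II. Total Sum of Square = T.S.S = $\sum_{i}\sum_{j} X_{ij}^2$ - C.F

III. Column Sum of Square = C.S.S = $\frac{\sum_{j} T_{.j}^2}{r}$ - C.F

IV. Row Sum of Square = R.S.S = $\frac{\sum_{i} T_{i.}^2}{k}$ - C.F

V. Error Sum of Square = E.S.S = T.S.S - Column.S.S - Row.S.S

ANOVA Table

Source of Variation (S.O.V) | Degrees of Freedom (d.f) | Sum of Squares (S.O.S) | Mean Square (M.S) | F-Distribution
Column | k-1 | Column.S.S | Column.M.S = Column.S.S / (k-1) | $F_2$ = Column.M.S / Error M.S
Row | r-1 | Row.S.S | Row.M.S = Row.S.S / (r-1) | $F_1$ = Row.M.S / Error M.S
Error | (r-1)(k-1) | E.S.S = T.S.S - Column.S.S - Row.S.S | Error M.S = E.S.S / [(r-1)(k-1)] |
Total | n-1 | T.S.S | |

5. Conclusion

Row

If $F_1$-cal is greater than or equal to F-tab, $H_0$ is rejected.

If $F_1$-cal is less than F-tab, $H_0$ is accepted.

Column

If $F_2$-cal is greater than or equal to F-tab, $H_0'$ is rejected.

If $F_2$-cal is less than F-tab, $H_0'$ is accepted.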
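The two-way calculation (correction factor, total, column, row and error sums of squares, then mean squares and F ratios) can be sketched as follows, assuming one observation per cell. The function name `two_way_anova` and the 3 x 3 data table are made up for illustration.

```python
def two_way_anova(table):
    """Two-way ANOVA with one observation per cell.

    table is a list of r rows, each holding k column values.
    """
    r, k = len(table), len(table[0])
    n = r * k
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    cf = grand_total ** 2 / n                               # correction factor
    tss = sum(x * x for row in table for x in row) - cf     # total S.S
    css = sum(t * t for t in col_totals) / r - cf           # column S.S
    rss = sum(t * t for t in row_totals) / k - cf           # row S.S
    ess = tss - css - rss                                   # error S.S

    error_ms = ess / ((r - 1) * (k - 1))
    return {
        "TSS": tss, "CSS": css, "RSS": rss, "ESS": ess,
        "F_column": (css / (k - 1)) / error_ms,
        "F_row": (rss / (r - 1)) / error_ms,
    }


data = [  # hypothetical observations: 3 rows (B) by 3 columns (A)
    [4, 6, 8],
    [5, 8, 11],
    [6, 7, 8],
]
result = two_way_anova(data)
print(result)
```

Each F ratio is then compared with the tabulated value for its degrees of freedom. Here both tests use (2, 4) degrees of freedom, for which the 5% table value is 6.94, so with this made-up data the column hypothesis would be rejected (F = 12) and the row hypothesis accepted (F = 3).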