Module on Basic Statistical Concepts


  • 8/9/2019 Module on Basic Statistical Concepts


    Basic Statistical Concepts


    Definition of Terms

    Statistics refers to the science that deals with the collection, tabulation or presentation, analysis, and interpretation of numerical or quantitative data.

    Collection of data refers to the process of obtaining numerical measurements. Tabulation or presentation of data refers to the organization of data into tables, graphs, or charts, so that logical and statistical conclusions can be derived from the collected measurements. Analysis of data pertains to the process of extracting from the given data relevant information from which a numerical description can be formulated. Interpretation of data refers to the task of drawing conclusions from the analyzed data. It also normally involves the formulation of forecasts or predictions about larger groups based on the data collected from small groups.


    Population and Sample

    Population or the universe refers to the collection of all traits under study or under consideration. A small part of this big group is called a sample.

    Example of population and sample: graduate students of NEUST are an example of a population, while students in the MPA program are a sample.

    Using the language of mathematics, the universal set is the population while a subset refers to a sample. Hence, if the universal set is the set of counting numbers, the set of even numbers is a subset, and so is the set of odd numbers.

    A population can be finite or infinite. The population of a certain school in a particular term is finite, while the population consisting of all possible outcomes (heads, tails) in successive tosses of a coin is infinite.
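The population/sample relationship can be sketched with Python sets. This is only an illustration: the ID numbers, the population size of 500, and the sample size of 30 are made-up assumptions, not figures from the module.

```python
import random

# Hypothetical finite population: ID numbers of 500 graduate students.
population = set(range(1, 501))

# A sample is a subset of the population; here 30 IDs are drawn at random.
rng = random.Random(0)  # fixed seed so the sketch is repeatable
sample = set(rng.sample(sorted(population), 30))

print(sample <= population)          # True: every sampled ID is in the population
print(len(population), len(sample))  # 500 30

# Set-language example: even and odd numbers are subsets of the counting numbers
# (only the first 20 counting numbers are used, since a Python set must be finite).
counting_numbers = set(range(1, 21))
even_numbers = {n for n in counting_numbers if n % 2 == 0}
odd_numbers = counting_numbers - even_numbers
print(even_numbers <= counting_numbers, odd_numbers <= counting_numbers)
```

The `<=` operator on sets tests "is a subset of", which mirrors the statement that a sample is a subset of the universal set.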


    Parameter and Statistics

    A parameter refers to a numerical characteristic of the population, like the population mean, population standard deviation, population variance, and many more. It is usually unknown and estimated only by a corresponding statistic computed from the sample data. Thus, the population mean is estimated by the sample mean, the population standard deviation by the sample standard deviation, the population variance by the sample variance, etc. The mean weight of a sample of sophomore students selected from the entire population of the sophomore students in a certain high school is a statistic. The mean weight of all students comprising the population is a parameter, which is estimated by the sample mean weight of the sophomore students.

    Generally, the characteristics of a population are called parameters, while the characteristics of a sample are called statistics.
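As a rough sketch of the sophomore-weight example, the code below computes a parameter from a whole (simulated) population and the corresponding statistic from a random sample. The population size of 400, the sample size of 40, and the simulated weights are assumptions chosen only for illustration.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: weights in kg of all 400 sophomores in a school.
population_weights = [round(rng.gauss(50, 6), 1) for _ in range(400)]

# Parameters describe the whole population.
mu = statistics.mean(population_weights)       # population mean
sigma = statistics.pstdev(population_weights)  # population standard deviation

# Statistics describe a sample and are used to estimate the parameters.
sample_weights = rng.sample(population_weights, 40)
x_bar = statistics.mean(sample_weights)        # sample mean
s = statistics.stdev(sample_weights)           # sample standard deviation

print(f"parameter mu = {mu:.2f}, statistic x_bar = {x_bar:.2f}")
print(f"parameter sigma = {sigma:.2f}, statistic s = {s:.2f}")
```

In practice the population values are usually not available, which is why the sample statistic is the only quantity that can actually be computed.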


    Below are different symbols for parameters and statistics in most statistical writing:

    Characteristic               Parameter     Statistic
    Mean                         μ (mu)        x̄
    Standard Deviation           σ (sigma)     s
    Variance                     σ²            s²
    Proportion                   P             p
    Pearson Correlation Coef.    R             r
    Number of Cases              N             n


    Variables

    Variable is one of the basic concepts in statistics. It refers to observable characteristics or phenomena of a person or object whereby the members of the group or set vary or differ from one another. A variable is a symbol such as X, Y, Z, a, b, c, etc. which can assume any of a prescribed set of values, called the domain of the variable. If the variable can assume only one value, it is called a constant.


    Discrete and Continuous Variables

    A variable which can theoretically assume any value between two given values is called a continuous variable; otherwise it is called a discrete variable.

    Example: the number of houses in a community is a discrete variable; it can take any of the values 0, 1, 2, 3, etc. but cannot be 1.5, 3.34, 4.624, etc.

    The weight of an individual, which can be 45.3 kg or any other value depending on the accuracy of measurement, is a continuous variable.

    In general, measurement gives rise to continuous data while enumeration or counting gives rise to discrete data.


    Dependent and Independent Variables

    Variables can be grouped into dependent and independent variables with respect to their use. The independent variable is used as the predictor if the objective is to predict the value of one variable on the basis of the other. In contrast, the dependent variable is the variable whose value is predicted. To illustrate, if we want to predict or foresee the students ...


    Uses of Statistics

    According to Ary and Jacobs (1976), statistics is a body of scientific methods for analyzing quantitative data. Statistics serves two functions: (1) it aids the scientist in organizing, summarizing, interpreting, and communicating quantitative information obtained from observations, and (2) it allows the scientist to extrapolate the data to reach tentative conclusions about the larger group from which the smaller group was derived. The statistical procedures dealing with the first function are generally called descriptive statistics (gathering, classification, presentation of data, and collection of summarizing values), while the procedures dealing with the second function are called inferential statistics (critical judgement and mathematical methods).


    Types of Data

    Statistical tools rely on the types of data that are collected. Among the different types are the following:

    Primary and Secondary Data

    Primary data refer to information gathered directly from the original source or based on direct or first-hand experience (e.g., autobiographies, diaries, etc.). Secondary data refer to information taken from published or unpublished data previously gathered by other individuals or agencies (e.g., books, magazines, newspapers, etc.).

    Qualitative and Quantitative Data

    Qualitative data are categorized data, which take the form of categories or attributes (e.g., sex, year level, religion, etc.). On the other hand, quantitative data or numerical data are obtained from measurements (e.g., height, weight, ages, scores, etc.).


    Measurement Scales

    Qualitative data can be converted to quantitative data through the process called measurement. By measurement, numbers are utilized to code objects in order that they can be treated statistically. There are four types of measurements. They are as follows:

    Nominal Measurements. Nominal measurements are used only for identification or classification purposes. Example: student numbers, names of books, number of vehicles, etc.

    Ordinal Measurements. Ordinal measurements do not only classify items. They also give the order of classes, items, or objects. Example: first runner-up, second runner-up, third runner-up, etc.

    Interval Measurements. In interval measurements, numbers are assigned to the items or objects. They measure the degree of difference between any two classes, but the zero point is arbitrary. Example: temperature in degrees Celsius, IQ, test scores, etc.

    Ratio Measurements. For ratio measurements, the ratio of the numbers assigned in the measurements shows the ratio in the amount of the property being measured. Multiplication and division have meaning in ratio measurements. Example: if Boris is 4 years old and Morgana is 2 years old, then their ages may be expressed in the ratio 2:1 (two is to one).


    Sampling Techniques

    It is not necessary for the researcher to examine every member of the population to get data or information about the population. Cost and time constraints will prohibit one from undertaking a study of the entire population. Sampling techniques are utilized to ensure the validity of conclusions or inferences drawn from the sample about the population.

    Random Sampling. What is random sampling? Random sampling is a method of selecting a sample of size n from a population or universe such that each member of the population has an equal chance of being selected in the sample and all possible combinations of size n have an equal chance of being selected as the sample.

    Stratified Random Sampling. In this method the population is first divided into groups (strata) based on homogeneity, and members are then drawn from each stratum, in order to avoid the possibility of drawing samples whose members come only from one stratum.

    Cluster Sampling. It is the advantageous procedure when the population is spread out over a wide geographical area. It is also a practical sampling technique to use if the complete list of the members of the population is not available. A cluster refers to an intact group which has common characteristics.
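The three sampling techniques can be sketched in a few lines of Python. Everything concrete here is a hypothetical assumption for illustration: the 120-student population, the use of year level as the stratum, the use of section as the cluster, and the function names.

```python
import random

rng = random.Random(7)  # fixed seed so the sketch is repeatable

# Hypothetical population of 120 students, each with a year level and a section.
population = [
    {"id": i, "year": 1 + i % 4, "section": i % 6}
    for i in range(120)
]

def simple_random_sample(units, n):
    """Draw n units so that every unit and every combination of n units is equally likely."""
    return rng.sample(units, n)

def stratified_sample(units, key, n_per_stratum):
    """Group units into strata by `key`, then sample randomly within each stratum."""
    strata = {}
    for unit in units:
        strata.setdefault(unit[key], []).append(unit)
    chosen = []
    for members in strata.values():
        chosen.extend(rng.sample(members, n_per_stratum))
    return chosen

def cluster_sample(units, key, n_clusters):
    """Pick whole clusters at random and keep every member of the chosen clusters."""
    cluster_ids = sorted({unit[key] for unit in units})
    picked = set(rng.sample(cluster_ids, n_clusters))
    return [unit for unit in units if unit[key] in picked]

srs = simple_random_sample(population, 12)
strat = stratified_sample(population, "year", 3)
clus = cluster_sample(population, "section", 2)

print(len(srs), len(strat), len(clus))  # 12 12 40
```

Note the difference in what is randomized: stratified sampling draws individuals from every group, while cluster sampling draws entire groups and therefore needs only a list of clusters, not a list of all members.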


    Methods Used in the Collection of Data

    A. Direct or Interview Method

    This is a method of person-to-person exchange between the interviewer and the interviewee.

    The following are an advantage and a disadvantage of the direct or interview method:

    1. It can give complete information needed in the study.

    2. It can yield inaccurate information since the interviewer can influence the respondent


    B. Indirect or Questionnaire Method

    The questionnaire method is one of the easiest methods of data gathering. In this method, written responses are given to prepared questions. A questionnaire is a list of questions which are intended to elicit answers to the problems of a study. It should be attractive and include illustrations, pictures, and sketches. Its contents, especially the directions, must be precise, clear, and self-explanatory.

    C. Registration Method

    This method of gathering information is enforced by certain laws. Examples are the registration of births, deaths, motor vehicles, marriages, and licenses. The advantage of this method is that information is kept systematized and made available to all because of the requirement of the law.


    D. Observation Method

    The observation method is utilized to gather data regarding attitudes, behavior, values, and cultural patterns of the sample under study. It is usually used when the subjects cannot talk or write.

    E. Experiment Method

    An experiment is applied to collect data if the investigator wants to control the factors affecting the variable being studied.


    Methods of Presenting Data

    Collected data are of little use if they are not presented effectively for analysis and interpretation. Data are presented in four general methods: (1) textual method, (2) tabular method, (3) semi-tabular method, and (4) graphical method or presentation.


    Frequency Distribution

    When the researcher has gathered all the needed data, the next task is to organize and present them with the use of appropriate tables and graphs. Frequency distribution is one system used to facilitate the description of important features of the data.
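A frequency distribution can be built by assigning each observation to a class interval and counting. The sketch below assumes a made-up list of 20 whole-number scores, a first class starting at 5, and classes that each cover five consecutive scores; none of these values come from the module.

```python
from collections import Counter

# Hypothetical raw data: 20 whole-number test scores.
scores = [12, 7, 9, 15, 21, 18, 5, 9, 14, 11,
          23, 17, 8, 13, 19, 6, 10, 16, 22, 12]

lowest_limit = 5  # lower limit of the first class (assumed)
width = 5         # each class covers 5 consecutive scores, e.g. 5-9 (assumed)

def class_interval(score):
    """Return the (lower limit, upper limit) of the class containing `score`."""
    lower = lowest_limit + ((score - lowest_limit) // width) * width
    return (lower, lower + width - 1)

# Count how many scores fall in each class interval.
frequency = Counter(class_interval(s) for s in scores)

for (lower, upper), f in sorted(frequency.items()):
    print(f"{lower:>2} - {upper:<2}  {f}")
```

The resulting class frequencies always add up to the number of observations, which is a quick check that no score was left out or counted twice.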


    Class Interval or Class Limits refers to the grouping defined by a lower limit and an upper limit.

    Class Boundaries: if heights are recorded to the nearest inch, the class interval 60–62 theoretically includes all measurements from 59.5 to 62.5 in. These numbers, 59.5 and 62.5, are the class boundaries, or the true class limits; the smaller number (59.5) is the lower class boundary, and the larger number (62.5) is the upper class boundary.

    Class Mark is the midpoint or middle of a class interval. It is obtained by finding the average of the lower class limit and the upper class limit. Example: the class mark of the class interval 5–9 is (5 + 9)/2 or 7.

    Class Size refers to the difference between the upper class boundary and the lower class boundary of a class interval.

    Class Frequency means the number of observations belonging to a class interval.
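The class boundaries, class mark, and class size of an interval follow directly from its limits. The helper below is a minimal sketch; the function name `describe_class` and the `unit` parameter (the recording precision, 1 for values recorded to the nearest whole number) are assumptions made for illustration.

```python
def describe_class(lower_limit, upper_limit, unit=1):
    """Compute boundaries, mark, and size for one class interval.

    `unit` is the precision to which values are recorded (assumed 1 by default),
    so the true limits extend half a unit beyond the stated limits.
    """
    half = unit / 2
    lower_boundary = lower_limit - half
    upper_boundary = upper_limit + half
    return {
        "boundaries": (lower_boundary, upper_boundary),
        "mark": (lower_limit + upper_limit) / 2,
        "size": upper_boundary - lower_boundary,
    }

heights = describe_class(60, 62)  # values recorded to the nearest whole unit
scores = describe_class(5, 9)

print(heights)  # {'boundaries': (59.5, 62.5), 'mark': 61.0, 'size': 3.0}
print(scores)   # {'boundaries': (4.5, 9.5), 'mark': 7.0, 'size': 5.0}
```

The class size of 5 for the interval 5 to 9 is the boundary difference 9.5 - 4.5, not the difference of the limits (which would give 4).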


    Graphical Presentation of Data

    Histogram - is made up of vertical bars that are joined together, making it an appropriate graph for continuous data. The base of each bar or rectangle spans the class boundaries, and its height corresponds to the class frequency.

    Frequency Polygon - is commonly called a linear graph. It is a very useful device to show changes in values over successive periods of time. An advantage of the frequency polygon is that it can be used to compare two or more distributions graphically on one pair of axes.

    Bar Graph - is used to represent discrete data, where the bars are separated. The width of the bars is arbitrary; however, the bars must all be of the same width. Thus, the bar graph is almost like the histogram; the only difference is that the bars of the histogram are joined.

    Pie Diagram or Pie Chart - is used to show percentage distribution. It is made up of a circle subdivided into sectors proportional in size to the quantities or percentages they represent.
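For a pie chart, each sector's central angle is the category's share of the total multiplied by 360 degrees. The sketch below uses made-up enrollment counts by year level purely as an illustration.

```python
# Hypothetical counts per category (not taken from the module).
enrollment = {
    "First year": 120,
    "Second year": 90,
    "Third year": 60,
    "Fourth year": 30,
}

total = sum(enrollment.values())

# Percentage share and central angle (in degrees) of each sector.
percentages = {name: 100 * count / total for name, count in enrollment.items()}
angles = {name: 360 * count / total for name, count in enrollment.items()}

for name in enrollment:
    print(f"{name:<12} {percentages[name]:5.1f}%  {angles[name]:6.1f} degrees")
```

The percentages sum to 100 and the angles sum to 360, so checking either total is a simple way to verify the sectors before drawing them.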


    Types of Frequency Curves

    1. The symmetrical or bell-shaped frequency curves are characterized by the fact that observations equidistant from the central maximum have the same frequency. An important example is the normal curve.

    2. In J-shaped and reversed J-shaped frequency curves, a maximum occurs at one end.

    3. In the moderately asymmetrical or skewed frequency curves, the tail of the curve to one side of the central maximum is longer than that to the other. If the longer tail occurs to the right, the curve is said to be skewed to the right or to have positive skewness, while if the reverse is true, the curve is said to be skewed to the left or to have negative skewness.

    4. A U-shaped frequency curve has maxima at both ends.

    5. A bimodal frequency curve has two maxima.

    6. A multimodal frequency curve has more than two maxima.
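The direction of skewness can be checked numerically. The sketch below uses the moment coefficient of skewness (third central moment divided by the 1.5 power of the second central moment), which is one common measure and not necessarily the one intended by the module; the data sets and the tolerance of 0.1 are assumptions for illustration.

```python
def skewness(data):
    """Moment coefficient of skewness: m3 / m2**1.5 (population moments)."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    return m3 / m2 ** 1.5

def describe_shape(data, tolerance=0.1):
    """Label a data set by the sign of its skewness (tolerance is an assumed cutoff)."""
    g = skewness(data)
    if g > tolerance:
        return "skewed to the right (positive skewness)"
    if g < -tolerance:
        return "skewed to the left (negative skewness)"
    return "approximately symmetrical"

# Hypothetical data sets.
right_tailed = [1, 1, 2, 2, 2, 3, 3, 4, 6, 10]  # long tail toward large values
left_tailed = [11 - x for x in right_tailed]     # mirror image of the above
symmetric = [1, 2, 2, 3, 3, 3, 4, 4, 5]

for name, data in [("right_tailed", right_tailed),
                   ("left_tailed", left_tailed),
                   ("symmetric", symmetric)]:
    print(name, "->", describe_shape(data))
```

A positive coefficient corresponds to a longer right tail and a negative one to a longer left tail, matching the definitions in item 3.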


    Illustration: frequency curve shapes. Panels: symmetrical or bell-shaped; skewed to the right (positive skewness); skewed to the left (negative skewness); J-shaped; reversed J-shaped; U-shaped; bimodal; multimodal.