Introduction to Statistics Basic Concepts. Intro. to Statistics What is Statistics? What is...
Date posted: 20-Dec-2015
Introduction to Statistics
Basic Concepts
Intro. to Statistics
What is Statistics?
• “…a set of procedures and rules…for reducing large masses of data to manageable proportions and for allowing us to draw conclusions from those data”
Intro. to Statistics
What can Stats do?
• Make data more manageable
  Group of numbers: 6, 1, 8, 3, 5, 4, 9
  Average is: 36/7 = 5 1/7
  Graphs:
  [Figure: bar chart of East, West, and North values for 1st Qtr through 4th Qtr; vertical axis 0 to 90]
Intro. to Statistics
What can Stats do?
• Allow us to draw conclusions from the data
  Group of numbers #1: 6, 1, 8, 3, 5, 4, 9
  Average is 36/7 = 5 1/7
  Group of numbers #2: 8, 3, 4, 2, 7, 1, 4
  Average is 29/7 = 4 1/7
• Allow us to do this objectively and quantitatively
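The averages on this slide can be checked with exact arithmetic. The following is a minimal sketch; the helper names `exact_mean` and `as_mixed` are illustrative choices, not part of the slides.

```python
from fractions import Fraction

def exact_mean(values):
    """Return the arithmetic mean as an exact fraction (sum / count)."""
    return Fraction(sum(values), len(values))

def as_mixed(frac):
    """Format a non-negative Fraction as a mixed number, e.g. '5 1/7'."""
    whole, rem = divmod(frac.numerator, frac.denominator)
    return f"{whole} {rem}/{frac.denominator}" if rem else str(whole)

group_1 = [6, 1, 8, 3, 5, 4, 9]
group_2 = [8, 3, 4, 2, 7, 1, 4]

mean_1 = exact_mean(group_1)  # sum is 36, count is 7
mean_2 = exact_mean(group_2)  # sum is 29, count is 7

print("Group #1 average:", as_mixed(mean_1))
print("Group #2 average:", as_mixed(mean_2))
print("Difference:", mean_1 - mean_2)
```

Using `Fraction` rather than floating point keeps the result in the same form the slide uses (sevenths), so the two groups can be compared without rounding.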
Intro. to Statistics
“Quantitative”
• Involves measurement
• Data in numerical form
• Answers “How much” questions
• Objective and results in unambiguous conclusions
“Qualitative”
• Describes the nature of something
• Answers “What” or “Of what kind” questions
• Often evaluative and ambiguous
Intro. to Statistics
Qualitative Distinctions:
• “Good” versus “Bad”
• “Right” versus “Wrong”
• “A Lot” versus “A Little”
Quantitative Distinctions:
• 5 1/7 versus 4 1/7
• 25% versus 50%
• 1 hour versus 24 hours
Basic Terminology
Summarizing versus Analyzing
  Descriptive Statistics
  Inferential Statistics
• Inference from sample to population
• Inference from statistic to parameter
• Factors influencing the accuracy of a sample’s ability to represent a population:
  Size
  Randomness
Basic Terminology
• Size – Sample of 5 cards from a deck of 52
  • 2 of Clubs, 10 of Diamonds, Jack of Hearts, 5 of Clubs, and 7 of Hearts
  Without any prior knowledge of a deck of cards, what could we conclude from this sample about what the full deck looks like?
  Compare this to a sample of 51 of the 52 cards – what could we conclude from that sample?
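The effect of sample size can be sketched by drawing both samples from a simulated deck and counting how many distinct ranks and suits each one reveals. This is an illustrative sketch only: the deck representation, the fixed seed, and the `summarize` helper are assumptions made for the example.

```python
import random

SUITS = ["Clubs", "Diamonds", "Hearts", "Spades"]
RANKS = ["2", "3", "4", "5", "6", "7", "8", "9", "10",
         "Jack", "Queen", "King", "Ace"]

def build_deck():
    """Return a standard 52-card deck as (rank, suit) tuples."""
    return [(rank, suit) for suit in SUITS for rank in RANKS]

def summarize(sample):
    """Return (number of distinct ranks, number of distinct suits) seen."""
    ranks = {rank for rank, _ in sample}
    suits = {suit for _, suit in sample}
    return len(ranks), len(suits)

rng = random.Random(0)          # fixed seed so the run is repeatable
deck = build_deck()

small = rng.sample(deck, 5)     # 5 of 52 cards
large = rng.sample(deck, 51)    # 51 of 52 cards

small_ranks, small_suits = summarize(small)
large_ranks, large_suits = summarize(large)

print("5-card sample: ", small_ranks, "ranks,", small_suits, "suits seen")
print("51-card sample:", large_ranks, "ranks,", large_suits, "suits seen")
```

A 5-card sample can show at most 5 of the 13 ranks, so it cannot reveal the full structure of the deck. A 51-card sample always shows all 13 ranks and all 4 suits, because removing a single card still leaves at least three cards of every rank and twelve of every suit.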
Basic Terminology
• Randomness – This time let’s draw another 5-card sample, but from an unshuffled (nonrandom) deck
  • 2 of Clubs, 10 of Clubs, Jack of Clubs, 5 of Clubs, and 7 of Clubs
  What would we conclude about the characteristics of our population (the deck) this time, versus when the sample was more random (shuffled)?
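A short sketch of why the unshuffled sample misleads. It assumes that "unshuffled" means the deck is ordered suit by suit with Clubs first, and that the sample is simply the top five cards; the slides do not specify either detail.

```python
from math import comb

SUITS = ["Clubs", "Diamonds", "Hearts", "Spades"]
RANKS = ["2", "3", "4", "5", "6", "7", "8", "9", "10",
         "Jack", "Queen", "King", "Ace"]

# Assumption: an unshuffled deck is sorted suit by suit, Clubs first.
unshuffled = [(rank, suit) for suit in SUITS for rank in RANKS]

# Nonrandom sample: just take the top five cards.
top_five = unshuffled[:5]
suits_seen = {suit for _, suit in top_five}

# For comparison: the chance that 5 cards drawn at random from a
# shuffled deck all share one suit (4 suits, 13 cards per suit).
p_one_suit = 4 * comb(13, 5) / comb(52, 5)

print("Suits seen in nonrandom sample:", suits_seen)
print(f"Chance of a one-suit sample when drawing at random: {p_one_suit:.4%}")
```

The nonrandom sample suggests the deck contains only Clubs, an outcome that a random draw would produce only about twice in a thousand tries.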
Basic Terminology
Smaller and less random samples both poorly represent the population (the entire deck of cards)
• They also result in inaccurate inferences about the population – poor external validity
Basic Terminology
Most often, the aim of our research is not to infer characteristics of a population from our sample, but to compare two samples
• E.g., to determine if a particular treatment works, we compare two groups or samples, one with the treatment and one without
Basic Terminology
• We draw conclusions based on how similar the two groups are
  If the treated and untreated groups are very similar, we cannot declare the treatment much of a success
  Another way of putting this, in terms of samples and populations, is that we are determining whether our two groups/samples actually come from the same population or from two different ones
Basic Terminology
Group A (Treated) and Group B (Untreated) are sampled from different populations (the treatment worked):
  Group A: Population of Well People
  Group B: Population of Sick People
Basic Terminology
Group A and Group B are sampled from the same population (the treatment didn’t work):
  Group A and Group B: Population of Sick People
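One way to make the "same population or two different ones" question concrete is an exact permutation test. The slides do not name this technique; it is brought in here as an illustration. The sketch reuses Group of numbers #1 and #2 from the earlier slide purely as stand-in data for a treated and an untreated group, and the function name `permutation_p_value` is hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def mean(values):
    """Exact arithmetic mean."""
    return Fraction(sum(values), len(values))

def permutation_p_value(group_a, group_b):
    """Share of all possible regroupings of the pooled values whose
    difference in means is at least as large as the observed one."""
    pooled = group_a + group_b
    size_a = len(group_a)
    observed = abs(mean(group_a) - mean(group_b))

    extreme = 0
    total = 0
    # Enumerate every way to pick which pooled values form "group A".
    for chosen in combinations(range(len(pooled)), size_a):
        chosen_set = set(chosen)
        new_a = [pooled[i] for i in chosen_set]
        new_b = [pooled[i] for i in range(len(pooled)) if i not in chosen_set]
        if abs(mean(new_a) - mean(new_b)) >= observed:
            extreme += 1
        total += 1
    return Fraction(extreme, total)

# Stand-in data: the two groups of numbers from the earlier slide.
group_a = [6, 1, 8, 3, 5, 4, 9]
group_b = [8, 3, 4, 2, 7, 1, 4]

observed_difference = mean(group_a) - mean(group_b)
p_value = permutation_p_value(group_a, group_b)

print("Observed difference in means:", observed_difference)
print("Share of regroupings at least this different:", float(p_value))
```

If a large share of arbitrary regroupings produce a difference as big as the observed one, the data give little reason to think the two groups come from different populations.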
Basic Terminology
What if Group A (who received the Tx) were sicker than Group B (who did not receive the Tx) prior to treatment? What would their scores look like after Tx?
• The inability to attribute changes in the variable of interest to the manipulation – poor internal validity
  I.e., we can’t say for sure if our experiment worked or not
Basic Terminology
Quantitative Data
• Dimensional/Measurement Data versus Categorical/Frequency Count Data
Dimensional
• When quantities of something are measured on a continuum
• Answers “how much” questions
• E.g., scores on a test, measures of weight, etc.
Basic Terminology
Categorical
• When numbers of discrete entities have to be counted
  Gender is an example of a discrete entity – you can be either male or female, and nothing else – speaking of “degree of maleness” makes little sense
• Answers “how many” questions
• E.g., number of men and women, percentage of people with a given hair color
Basic Terminology
A dimensional variable can be converted into a categorical one
• Convert scores on a test (0-100) into “Low”, “Medium”, and “High” groups – 0-33 = Low, 34-66 = Medium, and 67-100 = High
  The groups are discrete categories (hence “categorical”), and you would now count how many people fall into each category
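The conversion described above can be sketched as follows. The cutoffs come from the slide; the list of scores is made up for illustration, and `categorize` is a hypothetical helper name.

```python
from collections import Counter

def categorize(score):
    """Map a 0-100 test score to the slide's Low/Medium/High groups."""
    if not 0 <= score <= 100:
        raise ValueError(f"score out of range: {score}")
    if score <= 33:
        return "Low"
    if score <= 66:
        return "Medium"
    return "High"

# Hypothetical scores, chosen to include the boundary values.
scores = [12, 33, 34, 50, 66, 67, 88, 100]

# Dimensional data (the scores) becomes categorical data (counts per group).
counts = Counter(categorize(score) for score in scores)

for group in ("Low", "Medium", "High"):
    print(group, counts[group])
```

After the conversion the question changes from "how much did each person score" to "how many people fall into each group", which is the dimensional versus categorical distinction.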
Basic Concepts
Scales of Measurement:
• Nominal
  labeling/classifying objects
  e.g., your last name, names on jerseys, social security number, etc.
  not technically a scale of measurement, since nothing is measured
• Ordinal
  labels that imply rank
  e.g., place in a race, military rank – 1st > 2nd > 3rd and General > Lieutenant > Private
  doesn’t say how much more one is than the other
Basic Concepts
Interval
• provides labels that imply exactly how much different one label is than another
• e.g., temperature – 15 °F is 5 °F more than 10 °F
• lacks a true zero point – 0 °F does not represent the complete absence of heat, because we have negative values of °F
Ratio
• has all of the above, plus a true zero point
• e.g., height, weight, temperature in kelvins – 0 lbs represents a true lack of weight
• can talk about 16 K being four times 4 K, which is a proportion/ratio, hence the name of the scale – x = 4y
• often very difficult to identify in practice if a true zero point exists
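The difference between interval and ratio scales can be checked numerically by converting Fahrenheit readings to kelvins, which do have a true zero. This is a small sketch; the example temperatures are chosen for illustration.

```python
def fahrenheit_to_kelvin(deg_f):
    """Convert a temperature from degrees Fahrenheit to kelvins."""
    return (deg_f - 32) * 5 / 9 + 273.15

# Differences are meaningful on an interval scale:
# a 5 °F gap is the same amount of heat anywhere on the scale.
gap_low = fahrenheit_to_kelvin(15) - fahrenheit_to_kelvin(10)
gap_high = fahrenheit_to_kelvin(75) - fahrenheit_to_kelvin(70)

# Ratios are not: 16 °F looks like "four times" 4 °F, but on a scale
# with a true zero the two temperatures are nearly equal.
ratio_fahrenheit = 16 / 4
ratio_kelvin = fahrenheit_to_kelvin(16) / fahrenheit_to_kelvin(4)

print(f"5 °F gap in kelvins: {gap_low:.3f} and {gap_high:.3f}")
print(f"Ratio of the raw °F numbers: {ratio_fahrenheit}")
print(f"Ratio of the same temperatures in kelvins: {ratio_kelvin:.4f}")
```

The "four times" statement only holds when zero means a complete absence of the quantity, which is why it is valid for kelvins, height, or weight but not for °F.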
Basic Concepts
Scales of Measurement
• Nominal
• Ordinal
• Interval
• Ratio
• Qualitative
• Quantitative
Basic Concepts
Variables
• Discrete versus Continuous Variables
  same as Categorical versus Dimensional variables
• Not to be confused with “discreet” variables, which people simply do not think should be talked about
Basic Concepts
Constant Variable
Qualitative Quantitative
Categorical/Discrete
Dimensional/Continuous
Nominal Ordinal Interval Ratio
Basic Concepts
Variables versus Constants
• A constant has only one possible value that it can assume
  π = 3.1415926535…
• A variable can assume many possible values
  X = ?
Independent Variables (IVs) versus Dependent Variables (DVs)
• IV manipulated, DV measured
• Whether a variable is a DV or IV depends upon the design of the experiment
Basic Concepts
Variables
• In true experiments, one variable (the IV) is manipulated to see its effects on another variable (the DV)
• All factors other than the IV are kept constant so that we can attribute the change to the IV and not to something else
• Example: Influence of direct heat on the temperature of water
  IV = presence or absence of heat
  DV = temperature of water