Lecture 6: Descriptive Statistics: Probability, Distribution, Univariate Data.

Date posted: 22-Dec-2015


Page 1: Lecture 6: Descriptive Statistics: Probability, Distribution, Univariate Data.

Lecture 6:

Descriptive Statistics: Probability, Distribution, Univariate Data

Page 2: Lecture 6: Descriptive Statistics: Probability, Distribution, Univariate Data.

Agenda

Wrap-up of experimental methods

Intro to probability

Examining data through univariate statistics

Page 3: Lecture 6: Descriptive Statistics: Probability, Distribution, Univariate Data.

Generalizability (external validity) in Experiments

Threats to external validity always involve an interaction of the treatment group with some other factor.

Threats usually fall into 3 types: Setting, Population, History

Page 4: Lecture 6: Descriptive Statistics: Probability, Distribution, Univariate Data.

Three threats to generalizability in experiments

Setting: Physical and social context of the experiment.

Population: Is there something specific about the sample that interacts with the treatment?

History: Is there something about the time that interacts with the treatment?

Page 5: Lecture 6: Descriptive Statistics: Probability, Distribution, Univariate Data.

Why Generalizability is not always a problem

Experiments often try to isolate specific causes and effects in controlled settings. Thus, they may not even claim to be generalizable to specific settings.

Experimental findings can provide a theoretical basis for real-world tests.

It is often a balancing act for research: true causation versus large-scale associational and comparative testing.

Page 6: Lecture 6: Descriptive Statistics: Probability, Distribution, Univariate Data.

Considerations before using experiments

Cost and Effort: Is the effort worth it to test the concepts you are interested in?

Manipulation and Control: Will you actually be able to manipulate the key concept(s)?

Importance of Generalizability: Are you testing theory, or trying to establish a real-world test?

Page 7: Lecture 6: Descriptive Statistics: Probability, Distribution, Univariate Data.


Page 8: Lecture 6: Descriptive Statistics: Probability, Distribution, Univariate Data.

Probability

Are the things that we observe different from what would be expected by chance?

Coin Example
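The slide does not spell out the coin example, so the following is only a sketch of one way to ask the question numerically. It assumes a fair coin and a hypothetical observation of 8 heads in 10 flips (neither number is from the lecture), and computes how likely a result at least that extreme would be by chance alone.

```python
from math import comb


def prob_at_least(k: int, n: int, p: float = 0.5) -> float:
    """Probability of k or more heads in n independent flips with P(heads) = p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical observation (not from the lecture): 8 heads in 10 flips.
p_chance = prob_at_least(8, 10)
print(f"P(at least 8 heads in 10 fair flips) = {p_chance:.4f}")
```

A small value here means the observation would be unusual if only chance were at work.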

Page 9: Lecture 6: Descriptive Statistics: Probability, Distribution, Univariate Data.

Probability Concepts

Basic Concepts in Probability

Basic Probability Rules

Special Types of Probability: Joint Probabilities, Probabilities of Unions of Events, Conditional Probabilities

Page 10: Lecture 6: Descriptive Statistics: Probability, Distribution, Univariate Data.

Basic Concepts in Elementary Probability

Random Selection: Every possibility has an equal chance of being chosen.

Independence: The probability of a response on one trial does not depend on the outcome of any other trial.

Elementary Event: A single possible outcome of a probability experiment. E.g., heads on one coin toss.

Sample Space: The complete set of elementary events. E.g., {heads, tails} for a single coin toss.

Page 11: Lecture 6: Descriptive Statistics: Probability, Distribution, Univariate Data.

Mutually exclusive, exhaustive events

Mutually exclusive events: Two or more events that cannot occur at the same time.

Exhaustive events: A set of events that accounts for all of the elementary events in the sample space.
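As an illustration of the two definitions, events can be modelled as sets of elementary events. The die and the particular events below are assumptions chosen for the sketch, not examples from the lecture.

```python
# Sample space for one roll of a six-sided die (illustrative choice).
sample_space = {1, 2, 3, 4, 5, 6}
even = {2, 4, 6}
odd = {1, 3, 5}
at_least_four = {4, 5, 6}


def mutually_exclusive(a: set, b: set) -> bool:
    """Two events are mutually exclusive if they share no elementary event."""
    return a.isdisjoint(b)


def exhaustive(events: list, space: set) -> bool:
    """A set of events is exhaustive if together they cover the sample space."""
    return set().union(*events) == space


print(mutually_exclusive(even, odd))            # even and odd never co-occur
print(mutually_exclusive(even, at_least_four))  # 4 and 6 belong to both
print(exhaustive([even, odd], sample_space))
print(exhaustive([even, at_least_four], sample_space))
```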

Page 12: Lecture 6: Descriptive Statistics: Probability, Distribution, Univariate Data.

Basic rules of probability

Multiplication Rule: For independent events, we can multiply the probabilities together to get the probability of all of the events occurring.

Example: Probability of rolling a die twice and getting 6 on both rolls.

But what happens if the events are not independent?

Example: Probability of selecting a club from a deck of cards, then selecting another club (without replacement)?

Page 13: Lecture 6: Descriptive Statistics: Probability, Distribution, Univariate Data.

Multiplication Rule

When two or more events occur together and the events are independent, the special rule of multiplication is used to find the joint probability:

P(X and Y) = P(X) x P(Y)

When two or more events occur together and the events are dependent, the general rule of multiplication is used to find the joint probability:

P(X and Y) = P(X) x P(Y|X)
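The two forms of the rule can be checked on the die and card examples from the previous slide. This is a small sketch using exact fractions; it assumes a fair six-sided die and a standard 52-card deck with 13 clubs.

```python
from fractions import Fraction

# Special rule (independent events): two sixes on two rolls of a fair die.
p_six = Fraction(1, 6)
p_two_sixes = p_six * p_six

# General rule (dependent events): two clubs drawn without replacement.
p_first_club = Fraction(13, 52)
p_second_club_given_first = Fraction(12, 51)  # one club already removed
p_two_clubs = p_first_club * p_second_club_given_first

print(p_two_sixes)  # 1/36
print(p_two_clubs)  # 1/17
```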

Page 14: Lecture 6: Descriptive Statistics: Probability, Distribution, Univariate Data.

Basic rules of probability (continued)

The addition rule: For mutually exclusive events, we can add the probabilities to get the probability of either event occurring.

Example: Rolling a die and getting a 4 or a 6.

Again, what happens if the events are not mutually exclusive?

Page 15: Lecture 6: Descriptive Statistics: Probability, Distribution, Univariate Data.

Addition Rule

When we want the probability that either of two events occurs, and the events are mutually exclusive, then:

P(X or Y) = P(X) + P(Y)

When we want the probability that either of two events occurs, and the events are not mutually exclusive, then:

P(X or Y) = P(X) + P(Y) - P(X and Y)

For example, what is the probability that a card chosen at random from a deck of cards will be either a king or a heart?

P(King or Heart) = P(X or Y) = 4/52 + 13/52 - 1/52 = 16/52 = 30.77%
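The king-or-heart figure can be verified by enumerating a deck and comparing the addition rule against a direct count. This is a sketch that assumes a standard 52-card deck; the rank and suit labels are arbitrary.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["clubs", "diamonds", "hearts", "spades"]
deck = list(product(ranks, suits))  # 52 (rank, suit) pairs

kings = {card for card in deck if card[0] == "K"}
hearts = {card for card in deck if card[1] == "hearts"}


def prob(event: set) -> Fraction:
    """Probability of an event when every card is equally likely."""
    return Fraction(len(event), len(deck))


# Addition rule for events that are not mutually exclusive.
by_rule = prob(kings) + prob(hearts) - prob(kings & hearts)
# Direct count of the union.
by_count = prob(kings | hearts)

print(by_rule, by_count, f"{float(by_rule):.4f}")  # 4/13 4/13 0.3077
```

Subtracting P(X and Y) removes the king of hearts, which would otherwise be counted twice.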

Page 16: Lecture 6: Descriptive Statistics: Probability, Distribution, Univariate Data.

Special Types of Probability

Joint Probabilities

Probabilities of Unions of Events

Conditional Probabilities

Page 17: Lecture 6: Descriptive Statistics: Probability, Distribution, Univariate Data.

Example: Female and Download Music from Class Survey

Page 18: Lecture 6: Descriptive Statistics: Probability, Distribution, Univariate Data.

Joint Probabilities

Probability of obtaining a particular combination of events.

E.g., probability of flipping a coin twice and getting heads both times. Just use the multiplication rule!

P(A and B) = n(A and B) / n(S)

What about non-independent events? E.g., probability of a given respondent in the class survey being female and having downloaded music before.

P(A and B) = P(A|B) x P(B) = (9/11)(.579) = .474

Page 19: Lecture 6: Descriptive Statistics: Probability, Distribution, Univariate Data.

Union Probabilities

A union of two events consists of all the elementary events belonging to either of them.

Examples:

Probability of flipping a coin and it being heads or tails. (mutually exclusive union)

Non-independent events: Probability of being female or having downloaded music before.

P(E1 or E2) = P(E1) + P(E2) - P(E1 and E2) = (.579) + (.842) - (.474) = .947
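The survey table itself is not in the transcript. The counts below are an assumption reconstructed from the reported figures (19 respondents, 11 female, 16 who downloaded music, 9 who are both), chosen because they reproduce 9/11, .579, .842, .474 and .947. Under that assumption, the joint and union probabilities work out as follows.

```python
from fractions import Fraction

# Assumed counts, inferred from the reported probabilities (not given in the slides).
n_total = 19
n_female = 11
n_download = 16
n_female_and_download = 9

p_female = Fraction(n_female, n_total)                             # about .579
p_download = Fraction(n_download, n_total)                         # about .842
p_download_given_female = Fraction(n_female_and_download, n_female)  # 9/11

# Joint probability via the general multiplication rule.
p_joint = p_download_given_female * p_female
# Union probability via the addition rule.
p_union = p_female + p_download - p_joint

print(round(float(p_joint), 3))  # 0.474
print(round(float(p_union), 3))  # 0.947
```

Note that the joint probability also equals the direct count n(A and B) / n(S) = 9/19, matching the counting formula on the Joint Probabilities slide.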

Page 20: Lecture 6: Descriptive Statistics: Probability, Distribution, Univariate Data.

Conditional Probability

Probability of an event occurring given that another event has occurred.

Example: probability of an outcome, given that something else has occurred.

3 Doors Problem
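The slide gives no details, so this sketch assumes the "3 Doors Problem" is the classic Monty Hall setup: a prize behind one of three doors, a contestant picks one, and the host (who knows where the prize is) opens a different door with no prize before offering a switch. Enumerating every case with exact weights gives the chance of winning for each strategy.

```python
from fractions import Fraction
from itertools import product

doors = (0, 1, 2)
stay_wins = Fraction(0)
switch_wins = Fraction(0)

# Prize location and first pick are independent and uniform: 9 equally likely pairs.
for prize, pick in product(doors, repeat=2):
    # The host may open any door that is neither the pick nor the prize.
    host_options = [d for d in doors if d != pick and d != prize]
    for opened in host_options:
        weight = Fraction(1, 9) * Fraction(1, len(host_options))
        # Switching means taking the one door that is neither picked nor opened.
        switched_to = next(d for d in doors if d != pick and d != opened)
        if pick == prize:
            stay_wins += weight
        if switched_to == prize:
            switch_wins += weight

print(stay_wins, switch_wins)  # 1/3 2/3
```

The host's action depends on where the prize is, which is why conditioning on the opened door changes the probabilities.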

Page 21: Lecture 6: Descriptive Statistics: Probability, Distribution, Univariate Data.

Probability and Statistics

Statistics deals with what we observe and how it compares to what might be expected by chance.

A set of probabilities corresponding to each possible value of some variable, X, creates a probability distribution.

Common examples include the normal (Gaussian), Poisson, exponential, and binomial distributions.
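As a concrete case of "a probability for each possible value of X", the sketch below builds the distribution of the sum of two fair dice by counting outcomes. The choice of variable is illustrative and not taken from the lecture.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# X = sum of two fair six-sided dice; all 36 ordered outcomes are equally likely.
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))
distribution = {total: Fraction(n, 36) for total, n in sorted(counts.items())}

for total, p in distribution.items():
    print(f"P(X = {total:2d}) = {str(p):>5}  {'#' * counts[total]}")

# The probabilities over all possible values must sum to 1.
total_probability = sum(distribution.values())
```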

Page 22: Lecture 6: Descriptive Statistics: Probability, Distribution, Univariate Data.

The Normal Curve

Page 23: Lecture 6: Descriptive Statistics: Probability, Distribution, Univariate Data.


Page 24: Lecture 6: Descriptive Statistics: Probability, Distribution, Univariate Data.

For now, we will just deal with describing or characterizing the distribution of a single variable.

Page 25: Lecture 6: Descriptive Statistics: Probability, Distribution, Univariate Data.

Describing Simple Distributions of Data

Central Tendency: Some way of “typifying” a distribution of values, scores, etc.

Mean (sum of scores divided by number of scores)

Median (middle score, as found by rank)

Mode (most common value from set of values)

In a normal distribution, all 3 measures are equal.

Example: Class stats knowledge
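The class data are not included in the transcript, so the scores below are hypothetical. The sketch shows the three measures computed with Python's standard `statistics` module.

```python
import statistics

# Hypothetical scores (the class survey data are not in the transcript).
scores = [2, 3, 3, 4, 5, 5, 5, 6, 9]

mean = statistics.mean(scores)      # sum of scores / number of scores = 42 / 9
median = statistics.median(scores)  # middle score after sorting
mode = statistics.mode(scores)      # most common value

print(f"mean = {mean:.3f}, median = {median}, mode = {mode}")
```

Here the mean is pulled below the median by the cluster of low scores, so the three measures differ, unlike in a normal distribution.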

Page 26: Lecture 6: Descriptive Statistics: Probability, Distribution, Univariate Data.

Dispersion

Range: Difference between the highest value and the lowest value.

Standard Deviation: A statistic that describes how tightly the values are clustered around the mean.

Variance: A measure of how much spread a distribution has. Computed as the average squared deviation of each value from its mean.
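A short sketch of the three measures on hypothetical scores. It follows the slide's wording and divides by n (the "average squared deviation"), which is what `statistics.pvariance` and `statistics.pstdev` compute; many texts instead divide by n - 1 for samples, shown here for comparison.

```python
import statistics

# Hypothetical scores chosen so the results are round numbers.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

value_range = max(scores) - min(scores)
variance = statistics.pvariance(scores)  # average squared deviation (divide by n)
std_dev = statistics.pstdev(scores)      # square root of that variance

# For comparison: the sample variance divides by n - 1 instead of n.
sample_variance = statistics.variance(scores)

print(value_range, variance, std_dev)  # 7 4 2.0
```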

Page 27: Lecture 6: Descriptive Statistics: Probability, Distribution, Univariate Data.

Properties of Standard Deviation

Variance is just the square of the S.D.

If a constant is added to all scores, it has no impact on the S.D.

If all scores are multiplied by a constant, the dispersion changes: the S.D. is multiplied by the absolute value of the constant and the variance by its square.

S = standard deviation
X = individual score
M = mean of all scores
n = sample size (number of scores)
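These properties can be checked numerically. The sketch uses hypothetical scores and the divide-by-n standard deviation (`statistics.pstdev`), which is an assumption since the slide's formula image is not in the transcript; the same properties hold for the n - 1 version.

```python
import statistics

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical scores
sd = statistics.pstdev(scores)
var = statistics.pvariance(scores)

shifted = [x + 10 for x in scores]  # add a constant to every score
scaled = [3 * x for x in scores]    # multiply every score by a constant

sd_shifted = statistics.pstdev(shifted)
sd_scaled = statistics.pstdev(scaled)
var_scaled = statistics.pvariance(scaled)

print(sd, sd_shifted)   # adding a constant leaves the S.D. unchanged
print(sd_scaled)        # multiplying by 3 triples the S.D.
print(var, var_scaled)  # and multiplies the variance by 9
```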

Page 28: Lecture 6: Descriptive Statistics: Probability, Distribution, Univariate Data.


Page 29: Lecture 6: Descriptive Statistics: Probability, Distribution, Univariate Data.

Common Data Representations

Histograms: Simple graphs of the frequency of groups of scores.

Stem-and-Leaf Displays: Another way of displaying dispersion, particularly useful when you do not have large amounts of data.

Box Plots: Yet another way of displaying dispersion. Boxes show the range from the 25th to the 75th percentile, the line within the box shows the median, and “whiskers” show the range of values (min and max).
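The numbers behind each display can be computed without a plotting library. This is a text-only sketch on hypothetical scores; the bin width of 10 and the "inclusive" quartile method are assumptions, and other conventions give slightly different quartiles.

```python
import statistics
from collections import Counter, defaultdict

scores = [12, 15, 21, 23, 23, 27, 31, 34, 38, 45, 47]  # hypothetical

# Histogram: frequency of scores in bins of width 10 (10-19, 20-29, ...).
bins = Counter((x // 10) * 10 for x in scores)
for start in sorted(bins):
    print(f"{start:2d}-{start + 9:2d} | {'#' * bins[start]}")

# Stem-and-leaf: tens digit is the stem, ones digit is the leaf.
stems = defaultdict(list)
for x in sorted(scores):
    stems[x // 10].append(x % 10)
for stem in sorted(stems):
    print(f"{stem} | {' '.join(str(leaf) for leaf in stems[stem])}")

# Box plot ingredients: min, 25th percentile, median, 75th percentile, max.
q1, med, q3 = statistics.quantiles(scores, n=4, method="inclusive")
five_number = (min(scores), q1, med, q3, max(scores))
print(five_number)
```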

Page 30: Lecture 6: Descriptive Statistics: Probability, Distribution, Univariate Data.
