Hypothesis Testing


Transcript of Hypothesis Testing

Page 1: Hypothesis Testing

Hypothesis Testing

CJ 526

Page 2: Hypothesis Testing

Probability

Review

P = number of times an event can occur / total number of possible events

Bounding rule of probability

Minimum value is 0

Maximum value is 1

Page 3: Hypothesis Testing

Probability

Probability of an event NOT occurring is the complement of the event

Probability of an illness = .2

Probability that illness will not occur = 1 - probability of event, or 1 - .2 = .8

Odds of an event is the ratio of the probability that it occurs to the probability that it does not occur

Odds of illness = .2/.8, or 1 to 4; odds of not getting ill are 4 to 1
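The complement and odds arithmetic on this slide can be checked with a short Python sketch. The function names complement and odds are illustrative choices, not something the slides define.

```python
def complement(p):
    """Probability that an event with probability p does NOT occur."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("a probability must lie between 0 and 1")
    return 1.0 - p


def odds(p):
    """Odds in favor of an event: P(event) / P(not event).

    Undefined for p = 1, where the denominator is zero.
    """
    return p / complement(p)


p_ill = 0.2                      # value used on the slide
p_well = complement(p_ill)

print(round(p_well, 2))          # 0.8
print(round(odds(p_ill), 2))     # 0.25, i.e. 1 to 4
print(round(odds(p_well), 2))    # 4.0, i.e. 4 to 1
```

Rounding is only there to hide floating-point noise in the last digits.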

Page 4: Hypothesis Testing

Addition rule of p

What is the probability of either one event OR another occurring?

If the events are mutually exclusive, simply add the probabilities (Venn diagram)

What is the p of having a boy or a girl?

P = .5 + .5 = 1
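A minimal sketch of the addition rule for mutually exclusive events, using the boy-or-girl figures from the slide. The helper name p_either is an assumption made for illustration; the rule in this simple form does not hold for events that can overlap.

```python
def p_either(p_a, p_b):
    """Addition rule for mutually exclusive events: P(A or B) = P(A) + P(B)."""
    total = p_a + p_b
    # Mutually exclusive events can never have a combined probability above 1.
    if total > 1.0 + 1e-12:
        raise ValueError("probabilities of mutually exclusive events cannot sum past 1")
    return total


p_boy = 0.5
p_girl = 0.5
print(p_either(p_boy, p_girl))  # 1.0
```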

Page 5: Hypothesis Testing

Multiplication rule

What is the probability of A and B occurring?

If the events are independent of one another, their probabilities can be multiplied

What is the p of having both schizophrenia and epilepsy?
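The slide poses the question without giving numbers, so the sketch below uses made-up placeholder rates purely to show the arithmetic. They are not real prevalence figures, and the calculation only applies if the two conditions really are independent.

```python
def p_both(p_a, p_b):
    """Multiplication rule for independent events: P(A and B) = P(A) * P(B)."""
    return p_a * p_b


# Hypothetical rates, chosen only for illustration.
p_schizophrenia = 0.01
p_epilepsy = 0.005

print(f"{p_both(p_schizophrenia, p_epilepsy):.6f}")  # 0.000050

# Two independent coin flips both landing heads.
print(p_both(0.5, 0.5))  # 0.25
```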

Page 6: Hypothesis Testing

Probability distributions

A probability distribution is theoretical: we expect it based on the laws of probability

That is different from an empirical distribution, which is one we actually observe
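One way to see the contrast is to simulate rolls of a fair die and set the observed proportions beside the expected ones. The die, the number of rolls, and the seed are all assumptions chosen for illustration.

```python
import random
from collections import Counter

random.seed(526)  # arbitrary seed so the run is repeatable

SIDES = 6
N_ROLLS = 60_000

# Theoretical distribution: what the laws of probability lead us to expect.
theoretical = {face: 1 / SIDES for face in range(1, SIDES + 1)}

# Empirical distribution: what we actually observe in a set of rolls.
rolls = [random.randint(1, SIDES) for _ in range(N_ROLLS)]
counts = Counter(rolls)
empirical = {face: counts[face] / N_ROLLS for face in theoretical}

for face in theoretical:
    print(f"face {face}: expected {theoretical[face]:.4f}, observed {empirical[face]:.4f}")
```

The observed proportions land close to 1/6 but not exactly on it, which is the gap between the two kinds of distribution.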

Page 7: Hypothesis Testing

Normal probability distribution

Probability distribution for continuous events

Probability of an event occurring is higher in the center of the curve

Declines for events at each of the two ends (tails) of the distribution

Neither of the tails touches the x axis (they extend to infinity)

Page 8: Hypothesis Testing

Normal distribution

Theoretical probability distribution

Unimodal, symmetrical, bell-shaped curve

Symmetrical: draw a line down the center, and the left and right halves would be mirror images

Can be expressed as a mathematical formula (p. 220)
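The formula on p. 220 is not reproduced in the slides. The standard form of the normal density, with mean \mu and standard deviation \sigma, is presumably what is meant:

```latex
f(x) = \frac{1}{\sigma\sqrt{2\pi}}
       \exp\!\left( -\frac{(x-\mu)^{2}}{2\sigma^{2}} \right),
\qquad -\infty < x < \infty
```

The squared term makes the curve symmetrical about \mu, and because the exponential is never exactly zero, the tails approach the x axis without touching it.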

Page 9: Hypothesis Testing

Normal distribution

Family of normal distributions

Dependent on mean and SD

(Illustrate)

More spread out: larger SD

Narrower: smaller SD

Page 10: Hypothesis Testing

Variations

Skewness

Skewed to the right or the left, as opposed to symmetry

Kurtosis: degree of “peakedness” or “flatness”

Page 11: Hypothesis Testing

Area under the normal curve

Remember that for any continuous distribution there is a mean and SD

Example: Mean = 10 and SD = 2

If the distribution is normal (not skewed), the majority (about 2/3) of scores will be from 8 to 12

8 and 12 are each one SD from the mean

See p. 225

Page 12: Hypothesis Testing

Area under the normal curve

If a distribution is normal, we can express scores in standard deviation units, called z scores

A z score = (a score - the mean)/SD

If we convert all our raw scores to z scores, then we get what is called the standard normal distribution

It STANDARDIZES our scores

Page 13: Hypothesis Testing

Standard normal distribution

Then distributions of different measures can be compared against one another

The standard normal distribution has a mean of 0 and an SD of 1

If you use the formula for z scores, all the scores can be converted

If a distribution has a mean of 10, the z score for 10 will be (10-10)/SD = 0

Page 14: Hypothesis Testing

Standard normal distribution

If a distribution has a mean of 10 and an SD of 2, the z score for 12 would be z = (12-10)/2 = 1

The z score for 8 would be z = (8-10)/2 = -1

The negative and positive signs have meaning: a + sign means a score is above the mean

Page 15: Hypothesis Testing

Standard normal distribution

A minus sign means the score is less than the mean

The z score also tells about magnitude: the larger the z score (ignoring its sign), the further from the mean, and the smaller the z score, the closer to the mean
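The z score formula and the worked values from the preceding slides (mean = 10, SD = 2) can be reproduced in a few lines. The function name z_score and the small list of raw scores are assumptions added for illustration.

```python
from statistics import mean, pstdev


def z_score(score, mu, sd):
    """z = (score - mean) / SD: distance from the mean in SD units."""
    if sd <= 0:
        raise ValueError("SD must be positive")
    return (score - mu) / sd


# Worked values from the slides: mean = 10, SD = 2.
print(z_score(12, 10, 2))  # 1.0  (one SD above the mean)
print(z_score(8, 10, 2))   # -1.0 (one SD below the mean)
print(z_score(10, 10, 2))  # 0.0  (exactly at the mean)

# Converting a whole set of raw scores (hypothetical data) standardizes it:
# the z scores have a mean of 0 and an SD of 1.
raw = [6, 8, 10, 12, 14]
mu, sd = mean(raw), pstdev(raw)
zs = [z_score(x, mu, sd) for x in raw]
print([round(z, 3) for z in zs])
```

Standardizing changes the scale, not the shape: the z scores follow the standard normal distribution only if the raw scores were normally distributed to begin with.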

Page 16: Hypothesis Testing

Standard normal distribution

We can also make statements about where an individual score is in relation to the rest of the distribution

.3413 (or 34.13%) of scores will fall between the mean and +1 SD

.3413 (or 34.13%) of scores will fall between the mean and -1 SD

Page 17: Hypothesis Testing

Standard normal distribution

.6826 (or 68.26%) of scores will be between -1 and +1 SD on a normal distribution

Thus, when we see a mean and SD, if the measure is normally distributed, about 2/3 of the scores will fall between the mean - the SD and the mean + the SD

Page 18: Hypothesis Testing

Standard normal distribution

50% of the scores will be above the mean

50% of the scores will be below the mean

.1359 (13.59%) will fall between -1 and -2 SD, and the same proportion between +1 and +2 SD

.0215 (2.15%) will fall between -2 and -3 SD, and the same proportion between +2 and +3 SD

See p. 223, illustrate
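These areas can be computed directly instead of being read from a table. The sketch below uses NormalDist from Python's standard library; the helper area_between is an illustrative name. Computed values can differ from table-based figures in the last digit, because tables subtract entries that were already rounded to four places.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0.0, sigma=1.0)  # standard normal: mean 0, SD 1


def area_between(z_low, z_high):
    """Proportion of scores falling between two z scores."""
    return std_normal.cdf(z_high) - std_normal.cdf(z_low)


print(f"{area_between(0, 1):.4f}")    # 0.3413  mean to +1 SD
print(f"{area_between(-1, 0):.4f}")   # 0.3413  mean to -1 SD
print(f"{area_between(-1, 1):.4f}")   # 0.6827  within 1 SD of the mean
print(f"{area_between(1, 2):.4f}")    # 0.1359  +1 to +2 SD
print(f"{area_between(2, 3):.4f}")    # 0.0214  +2 to +3 SD
print(f"{1 - std_normal.cdf(0):.4f}") # 0.5000  above the mean
```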

Page 19: Hypothesis Testing

Standardized normal distribution

Tells us about any normal distribution

Example of IQ scores, mean = 100, SD = 15

About 2/3 between 85 and 115

Fewer (about 13.5% each) between 115 and 130, and between 70 and 85

About 2% each between 130 and 145, and between 55 and 70
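The IQ bands can be checked the same way by working on the raw-score scale. This is a sketch that assumes IQ is exactly normal with mean 100 and SD 15, as the slide does; share_between is an illustrative helper name.

```python
from statistics import NormalDist

iq = NormalDist(mu=100, sigma=15)  # figures from the slide


def share_between(dist, low, high):
    """Proportion of a normal distribution lying between two raw scores."""
    return dist.cdf(high) - dist.cdf(low)


bands = [(85, 115), (115, 130), (70, 85), (130, 145), (55, 70)]
for low, high in bands:
    print(f"IQ {low}-{high}: {share_between(iq, low, high):.1%}")
# IQ 85-115: 68.3%
# IQ 115-130: 13.6%
# IQ 70-85: 13.6%
# IQ 130-145: 2.1%
# IQ 55-70: 2.1%
```

Each band is one SD wide, so the shares match the standard normal areas for the corresponding z scores.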

Page 20: Hypothesis Testing

Standardized normal

SAT scores, mean = 500, SD = 100

Illustrate

Use of z table, p. 724

Reading the table
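The table on p. 724 is not shown in the slides. Many z tables list the area between the mean and a given z, so the sketch below builds a few rows in that layout (an assumption) and then applies it to an SAT-style score. The score of 650 is a hypothetical example.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, SD 1


def area_mean_to_z(z):
    """Area between the mean and z, the quantity many z tables list."""
    return std_normal.cdf(abs(z)) - 0.5


# A few rows of a z table, in steps of 0.5.
print("  z    area")
for tenths in range(0, 31, 5):
    z = tenths / 10
    print(f"{z:.1f}  {area_mean_to_z(z):.4f}")

# SAT figures from the slide; the score itself is hypothetical.
sat_mean, sat_sd = 500, 100
score = 650
z = (score - sat_mean) / sat_sd          # 1.5
share_below = 0.5 + area_mean_to_z(z)    # lower half plus mean-to-z area
print(f"z = {z}, share scoring below {score}: {share_below:.4f}")  # 0.9332
```

For a score below the mean, the lookup uses the absolute value of z and the area is subtracted from .5 instead of added.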

Page 21: Hypothesis Testing

Utility of the normal distribution

Use of the normal distribution underlies many statistical tests

Many variables are not normally distributed

However, the normal distribution is useful anyway because of the apparent validity of the Central Limit Theorem

Page 22: Hypothesis Testing

Sampling distributions

To understand the Central Limit Theorem, we need to understand sampling distributions

Say we draw many samples, and calculate a statistic for each sample, such as a mean

When we draw the samples, the mean will not be the same each time: there will be variation

Page 23: Hypothesis Testing

Sampling distributions

If you were to obtain some measure on several samples of patients with the same disorder, there would be variation in the mean of the measure for each sample.

There is an actual mean for the entire population of patients that have the disorder, but that is not known, because we don’t have measures for the whole population

Page 24: Hypothesis Testing

Sampling distributions

However, we could obtain means based on a large number of samples

Central limit theorem: if an infinite number of random samples of size n are drawn from a population, the sampling distribution of the sample means will itself approach a normal distribution as n increases (even if the measure is not itself normally distributed)
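A small simulation can make the theorem concrete. The sketch below draws a finite number of samples (not an infinite number) from a strongly skewed population and checks whether the sample means behave like a normal distribution. The exponential population, the sample size of 100, the number of samples, and the seed are all assumptions chosen for illustration.

```python
import random
from statistics import mean, pstdev

random.seed(526)  # arbitrary seed so the run is repeatable

SAMPLE_SIZE = 100   # n: observations per sample
N_SAMPLES = 5_000   # how many samples we draw


def sample_mean(n):
    """Mean of one random sample of size n from a skewed population.

    The population is exponential with mean 1, which is far from normal.
    """
    return sum(random.expovariate(1.0) for _ in range(n)) / n


sample_means = [sample_mean(SAMPLE_SIZE) for _ in range(N_SAMPLES)]

grand_mean = mean(sample_means)     # should sit near the population mean of 1
sd_of_means = pstdev(sample_means)  # spread of the sampling distribution

# For a normal distribution, about 68% of values lie within 1 SD of the mean.
within_1sd = sum(abs(m - grand_mean) <= sd_of_means for m in sample_means) / N_SAMPLES

print(f"mean of sample means: {grand_mean:.3f}")
print(f"SD of sample means:   {sd_of_means:.3f}")
print(f"share within 1 SD:    {within_1sd:.3f}")
```

Individual observations from this population are heavily skewed, yet the means cluster symmetrically around the population mean, with roughly two thirds of them within one SD.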

Page 25: Hypothesis Testing

Number of subjects

With sample sizes greater than 100, the Central Limit Theorem can be used

If the measure is not terribly skewed, then samples could be around 50

With sample sizes of less than 50, the central limit theorem probably should not be used.

Application of the central limit theorem (ex)
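The effect of sample size can be sketched by measuring how skewed the sampling distribution of the mean still is at different values of n. The exponential population, the particular sample sizes, and the seed are assumptions for illustration; the slides' cutoffs of 50 and 100 are rules of thumb, and this simulation does not derive them.

```python
import random
from statistics import mean, pstdev

random.seed(526)  # arbitrary seed so the run is repeatable


def skewness(values):
    """Average cubed z score: 0 for a symmetrical distribution, positive for right skew."""
    mu, sd = mean(values), pstdev(values)
    return mean(((v - mu) / sd) ** 3 for v in values)


def sample_means(n, n_samples=4_000):
    """Means of n_samples random samples of size n from a skewed population."""
    return [sum(random.expovariate(1.0) for _ in range(n)) / n
            for _ in range(n_samples)]


skew_by_n = {n: skewness(sample_means(n)) for n in (10, 50, 100)}

for n, skew in skew_by_n.items():
    print(f"n = {n:3d}: skewness of sample means = {skew:.2f}")
```

With this population the skewness of the sample means shrinks toward 0 as n grows, which is consistent with the advice to be cautious about the normal approximation when samples are small and the measure is skewed.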