Introducing z-scores & the normal distribution

Page 1: Introducing  z -scores & the normal distribution

Statistics 1 1

Introducing z-scores & the normal distribution

Z-scores & the normal distribution are used with data that is:

Continuous: use real limits.

Interval or ratio level: use M, SS, σ, & σ².

Page 2: Introducing  z -scores & the normal distribution

Frequency distributions & relative frequencies:

X = levels or values of the variable.

f = observed frequency at each level (the observed data).

relative f = f/n.

Relative frequency: the proportion of observations in a given X interval. For example, .36 or 36% of the observations are in the X=3 interval (2.5-3.5).

X    Real limits    f    Relative f
5    4.5 – 5.5      2    .1818
4    3.5 – 4.5      2    .1818
3    2.5 – 3.5      4    .3636
2    1.5 – 2.5      2    .1818
1    0.5 – 1.5      1    .0909
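
As a minimal sketch, the relative frequencies in this table can be recomputed from the observed frequencies. The dictionary name `freq` and the choice of Python are assumptions made for illustration; only the X values and f counts come from the table.

```python
# Observed frequencies (f) for each X value, copied from the table.
freq = {5: 2, 4: 2, 3: 4, 2: 2, 1: 1}

# n is the total number of observations.
n = sum(freq.values())

# relative f = f / n, rounded to 4 decimal places as in the table.
rel_freq = {x: round(f / n, 4) for x, f in freq.items()}

# Print one row per X value: X, real limits, f, relative f.
for x in sorted(freq, reverse=True):
    print(x, f"{x - 0.5} - {x + 0.5}", freq[x], rel_freq[x])
```

The real limits are taken to be half a unit on either side of each X value, which matches the intervals listed in the table.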

Page 3: Introducing  z -scores & the normal distribution

Relative frequency: the probability of observing X in a given interval.

p(3) = .36 and p(2.5<X<3.5) = .36.

Probability of observing an X value = its relative frequency.

Probability of observing an X above/below a particular score = the sum of the probabilities above/below the interval that contains X.
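
The summing rule can be illustrated with the attitude data from the frequency table. This is only a sketch: the helper names `p`, `p_above`, and `p_below` are hypothetical, and exact fractions are used so that no rounding error creeps in.

```python
from fractions import Fraction

# Observed frequencies for each X value (from the frequency table).
freq = {1: 1, 2: 2, 3: 4, 4: 2, 5: 2}
n = sum(freq.values())


def p(x):
    """Probability of observing X = x, i.e. its relative frequency f/n."""
    return Fraction(freq[x], n)


def p_above(x):
    """Sum of the probabilities of all intervals above the one containing x."""
    return sum((p(v) for v in freq if v > x), Fraction(0))


def p_below(x):
    """Sum of the probabilities of all intervals below the one containing x."""
    return sum((p(v) for v in freq if v < x), Fraction(0))


print(p(3), p_above(3), p_below(3))
```

For a score of 3, the three pieces (below, at, and above) account for every observation, so they sum to 1.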

Figure 1. Attitudes toward the death penalty. [Histogram of frequency by rating: 1=strongly oppose; 3=neutral; 5=strongly support. Bars span the real limits .5-1.5, 1.5-2.5, 2.5-3.5, 3.5-4.5, and 4.5-5.5, and are labeled with relative frequencies .0909, .1818, .3636, .1818, and .1818 for ratings 1 through 5.]

Page 4: Introducing  z -scores & the normal distribution

Normal Distributions

Family of symmetrical, unimodal distributions with different μ & σ².

Describe many of the variables of psychological interest.

IQ is a normally distributed variable with μ=100 & σ=15.

Height is a normally distributed variable with μ=68" & σ=6".

Area under the curve corresponds to proportion/probability.

Page 5: Introducing  z -scores & the normal distribution

Any normal distribution can be transformed into the unit normal distribution.

Values range from -∞ to +∞, with μ=0 & σ=1.

The z-score transformation is used to turn distributions of raw scores into standardized scores or z-scores.

Used to estimate probabilities of outcomes & set critical values.

Page 6: Introducing  z -scores & the normal distribution

When the z-score transformation is used with a normally distributed set of scores, you get the unit normal distribution.

Each raw score has been transformed into a z-score using this formula:

$$z = \frac{X - \mu}{\sigma}$$

Page 8: Introducing  z -scores & the normal distribution

More about z-scores

X is the raw score of interest.

μ is the mean of the raw score population.

σ is the standard deviation of the raw scores.

Z-scores express the distance between the observation & the mean in standard deviation units. If the observation is 1 standard deviation above the mean, z = +1.

For SAT scores, with μ=500 & σ=100:

If raw SAT score=400, z=____ Hint: 400 is 1 standard deviation below the mean.

$$z = \frac{X - \mu}{\sigma} = \frac{400 - 500}{100} = \frac{-100}{100} = -1$$

Page 10: Introducing  z -scores & the normal distribution

If raw SAT score=650, z=____

$$z = \frac{X - \mu}{\sigma} = \frac{650 - 500}{100} = \frac{150}{100} = +1.5$$

The sign (+ or -) indicates direction above or below the mean.

Absolute value indicates distance from the mean.

The mean has a z-score of 0.
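
The SAT calculations can be sketched in a few lines of Python. The function name `z_score` and the constant names are assumptions for illustration; the formula and the values μ=500 & σ=100 are the ones used in the slides.

```python
def z_score(x, mu, sigma):
    """Distance of x from the mean mu, in standard deviation (sigma) units."""
    return (x - mu) / sigma


# SAT parameters as given in the slides.
SAT_MU, SAT_SIGMA = 500, 100

print(z_score(400, SAT_MU, SAT_SIGMA))  # 1 standard deviation below the mean
print(z_score(650, SAT_MU, SAT_SIGMA))  # 1.5 standard deviations above the mean
print(z_score(500, SAT_MU, SAT_SIGMA))  # the mean itself
```

The sign of each result gives the direction from the mean, and the absolute value gives the distance.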

Page 11: Introducing  z -scores & the normal distribution

Z-scores allow us to make statements about the relative location of a score in a distribution.

For distribution (a), a test score of 76 is higher than most of the other scores.

For distribution (b), a test score of 76 is only slightly above the mean.

Page 13: Introducing  z -scores & the normal distribution

Z-scores allow us to make comparisons between scores from different distributions.

Which score is more impressive: an IQ=130 or an SAT=650?

IQ=130 (z = +2) is more impressive than an SAT=650 (z = +1.5).
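
A short sketch of this comparison, assuming the parameters given elsewhere in the slides (IQ: μ=100 & σ=15; SAT: μ=500 & σ=100). The dictionary names `scales` and `scores` are hypothetical.

```python
# (mu, sigma) for each scale, as given in the slides.
scales = {"IQ": (100, 15), "SAT": (500, 100)}

# The two raw scores being compared.
scores = {"IQ": 130, "SAT": 650}

# Convert each raw score to a z-score on its own scale.
z = {name: (scores[name] - mu) / sigma for name, (mu, sigma) in scales.items()}

# The score with the larger z is further above its own mean.
more_impressive = max(z, key=z.get)

print(z)
print(more_impressive)
```

Because both scores are now expressed in standard deviation units, they can be compared directly even though the raw scales differ.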

Page 14: Introducing  z -scores & the normal distribution

When you transform an entire distribution into z-scores, you have a z-score distribution or a standardized distribution, with μ=0 & σ=1 & the same shape as the original distribution.

Converting frequency distribution (a) to a z-score distribution (b) has not changed the shape of the distribution.
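
The effect of standardizing a whole distribution can be checked on a small data set. The scores below are hypothetical (they are not the data behind the slide's figure), and the population standard deviation is used because the slides work with σ.

```python
from statistics import mean, pstdev

# Hypothetical raw scores, treated as a complete population.
raw = [2, 4, 4, 4, 5, 5, 7, 9]

mu = mean(raw)        # population mean
sigma = pstdev(raw)   # population standard deviation

# Apply z = (X - mu) / sigma to every raw score.
z = [(x - mu) / sigma for x in raw]

print(mu, sigma)
print(z)
print(mean(z), pstdev(z))
```

Every score is shifted by the same amount and divided by the same constant, so the ordering and relative spacing of the scores are preserved; only the mean and standard deviation change.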

Page 15: Introducing  z -scores & the normal distribution

If the raw score distribution is normal, standardization produces the unit normal distribution.

For the normal distribution, the relative frequencies/probabilities/proportions of the area under the curve marked by z-scores are known.

Page 16: Introducing  z -scores & the normal distribution

Mean = 3, z = 0.

Page 17: Introducing  z -scores & the normal distribution

34.13% of the normal distribution lies between the mean & a z-score of +1.

2.28% of the normal distribution lies below a z-score of -2.

Page 18: Introducing  z -scores & the normal distribution

z-scores & the unit normal distribution

We will use the unit normal distribution to find probabilities associated with observations & to set critical values.

.025 of the unit normal distribution lies above z = +1.96;

.025 of the unit normal distribution lies below z = -1.96.

Page 19: Introducing  z -scores & the normal distribution

The unit normal table (B1) lists proportions of the normal distribution for each z-score value.

Proportions under the curve are relative frequencies/probabilities.
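
The proportions quoted on these slides (34.13%, 2.28%, and .025) can be reproduced without a printed table. This sketch uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later) as a stand-in for the unit normal table; the variable names are assumptions.

```python
from statistics import NormalDist

# The unit normal distribution: mu = 0, sigma = 1.
unit_normal = NormalDist(mu=0, sigma=1)

# cdf(z) is the proportion of the distribution lying below z.
between_mean_and_1 = unit_normal.cdf(1) - unit_normal.cdf(0)
below_minus_2 = unit_normal.cdf(-2)
above_1_96 = 1 - unit_normal.cdf(1.96)
below_minus_1_96 = unit_normal.cdf(-1.96)

print(round(between_mean_and_1, 4))
print(round(below_minus_2, 4))
print(round(above_1_96, 3), round(below_minus_1_96, 3))

# The proportion of IQ scores (mu=100, sigma=15) below 130 matches
# the proportion of the unit normal distribution below z = +2.
iq_below_130 = NormalDist(mu=100, sigma=15).cdf(130)
print(round(iq_below_130, 4), round(unit_normal.cdf(2), 4))
```

The last comparison illustrates why a single table suffices: once a raw score is converted to z, the same proportions apply to any normal distribution.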

Page 20: Introducing  z -scores & the normal distribution

Introduction to probability: The binomial

Descriptive statistics: summarize, organize, & simplify data.

Inferential statistics: use samples to draw conclusions about the population.

If sample is typical of what we would expect, conclude the sample comes from the specified population.

If sample is very unusual or improbable, conclude the sample does NOT come from the specified population.

This logic requires us to quantify 2 things:
– Our expectations about the population &
– What we mean by improbable or unusual.

Page 21: Introducing  z -scores & the normal distribution

Observe 6 rats in a Y maze: 5 turn right, 1 turns left.

If chance alone is operating, 50% of the rats should turn right: p(right turn) = .5

If chance alone is operating, 3 rats should turn right: Xe = p(right turn) * n = .5 * 6 = 3

Page 22: Introducing  z -scores & the normal distribution

Have we observed anything unusual?

Our expectation is what we think should happen based on what we know about the population parameter, p(turning right); Xe=3.

Our observation is what actually happened with our sample; Xo=5.

To specify our expectation (Xe), we needed to know sample size (n) & the parameter, p(turning right).

If 5 rats turn right when we only expect 3, is this evidence that something unusual is going on?

Page 23: Introducing  z -scores & the normal distribution

We also need some criterion for deciding how different Xo has to be from Xe before we can conclude our observation was unusual.

“Unusual” means the rats do NOT come from the population of rats who are equally likely to turn left or right.

How extreme must Xo be for us to conclude that these rats come from some other population, where p(turning right) ≠.5?

Inferential statistics uses probability to set precise criteria for deciding whether or not a sample is likely to have come from a given population.

The binomial distribution is used to calculate probabilities for observations of nominal data with 2 categories.

Page 24: Introducing  z -scores & the normal distribution

Definitions & notation

• Xe: the # of events expected to display the characteristic of interest.
• Xo: the # of events that actually display the characteristic of interest.
• X: a possible value of Xo.
• Null hypothesis or Ho: specifies expectations as population parameters.
• n: the # of events (people, coin flips, items, trials, etc.) observed.
• Probability: likelihood of observing a particular event class.
• P(A): probability of observing an outcome or characteristic belonging to event class A, the one you’re interested in.
• Q(B): probability of observing the only other possible outcome or characteristic, belonging to event class B, the “other one.”
• Note that P+Q=1.00.

Page 25: Introducing  z -scores & the normal distribution

Parameters for the binomial: P & Q

To know what is expected in the population, we must know the parameters; for the binomial these are P & Q.

How do we know what the population parameters are?

Prior knowledge: 90% of people are right handed, so P(right handed)=.90

Definition of the situation/chance: rats could only turn left or right, so P(right turn)=.50

For binomial data, Ho specifies the proportion of the population that belongs to the event class of interest.

Ho: P(right handed)=.90

Ho: P(right turn)=.50

Why not just write Ho in terms of Xe instead of P & Q? Xe changes as n changes, but P & Q apply to all possible n’s.

Page 26: Introducing  z -scores & the normal distribution

Assumptions for using the binomial

Any statistical test is only appropriate when the sample data meet certain requirements. For the binomial, these are:

Random sampling

Every member of the population has an EQUAL chance of being selected.

p(selection) = 1/N

If more than one member is selected, there must be a constant probability for each and every selection.

p(selection) always = 1/N, never 1/(N-1), 1/(N-2)…

Use sampling with replacement for finite populations

Page 27: Introducing  z -scores & the normal distribution

Independence of observations

The probability of an element being in the sample does NOT depend on any other element's inclusion.

Event classes must be mutually exclusive & exhaustive

Mutually exclusive: no element can be a member of both event classes.

Exhaustive: every element drawn can be categorized as one or the other event class.

Page 28: Introducing  z -scores & the normal distribution

Rules for working with binomial probabilities

Additive rule/“OR rule”: To calculate the probability of selecting one event class OR the other, ADD the probabilities together.

P(A OR B) = P(A) + P(B) - P(A & B together)

For binomial data, A & B can never occur together, so P(A & B together) always = 0.

For the sample of 6 rats in the Y-maze, what is the probability that the first animal will turn right OR turn left?

p(right turn by rat #1) = .5

p(left turn by rat #1) = .5

p(right OR left turn by rat #1) = .5 + .5 - 0 = 1.0

Page 29: Introducing  z -scores & the normal distribution

Multiplicative rule/“AND rule”: To calculate the probability of selecting a particular sequence of event classes, MULTIPLY the probabilities together.

P(A & B) = P(A) * P(B)

For the sample of 6 rats in the Y-maze, what is the probability that the first animal & the second animal will both turn right?

p(right turn by rat #1) = .5

p(right turn by rat #2) = .5

p(right turn by rat #1 AND rat #2) = .5 x .5 = .25

Notice that when the probabilities multiplied together are the same, you can summarize this using an exponent.

p(right turn by rats #1, #2 & #3) = .5³ = .125
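
Both rules can be checked on the rat example. The sketch below assumes p(right turn) = .5 as in the slides; the variable names are hypothetical. The final cross-check lists every possible sequence of three turns, which is only a valid check here because left and right are equally likely.

```python
from fractions import Fraction
from itertools import product

p_right = Fraction(1, 2)
p_left = 1 - p_right

# OR rule: the two event classes are mutually exclusive,
# so P(A & B together) = 0.
p_right_or_left = p_right + p_left - 0

# AND rule: multiply the probabilities of each turn in the sequence.
p_two_rights = p_right * p_right
p_three_rights = p_right ** 3

# Cross-check by enumeration: with p = .5 every sequence of 3 turns
# is equally likely, and exactly one of them is right-right-right.
sequences = list(product("RL", repeat=3))
all_right = sum(seq == ("R", "R", "R") for seq in sequences)
share_all_right = Fraction(all_right, len(sequences))

print(p_right_or_left, p_two_rights, p_three_rights, share_all_right)
```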

Page 30: Introducing  z -scores & the normal distribution

We can use this observation to calculate the probability of any particular sequence containing X # of observations of interest:

P(any sequence with a given X) = $p^X q^{n-X}$

X = # of observations of interest, n = # of events or trials

p = probability of observing the event of interest on any 1 trial, q = probability of observing the other event (“not X”) on any 1 trial

We will use both the “AND rule” & the “OR rule” to specify the exact probability of observing any given X. Then we can decide whether an observation is “unusual.”

The exact probability of a given X will depend on:
– The probability of getting any sequence that contains the X &
– The # of different sequences that contain the X.

Page 31: Introducing  z -scores & the normal distribution

Calculating the probability of a particular sequence

Imagine a multiple choice test with 4 questions where each question has 5 options (a, b, c, d, or e). We are interested in correct answers.

We want to compare our observations to what we would expect based on chance / “guessing.”

P(any sequence with a given X) = $p^X q^{n-X}$

n = # of events or trials = 4

p(correct on any 1 question) = 1/5 = .20

q(incorrect on any 1 question) = 4/5 = .80

X = # of correct answers

The probability of observing a sequence containing X=2 correct answers is:

$.2^2 \times .8^2 = .0256$
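
As a sketch, the sequence formula can be written as a small function and applied to the multiple choice example. The function name `sequence_probability` is an assumption; exact fractions are used so that the result matches .0256 without floating-point noise.

```python
from fractions import Fraction


def sequence_probability(x, n, p):
    """Probability of one particular sequence with x events of interest in n trials."""
    q = 1 - p
    return p ** x * q ** (n - x)


# 4 questions, 5 options each, so p(correct by guessing) = 1/5.
p_correct = Fraction(1, 5)
prob = sequence_probability(2, 4, p_correct)

print(prob, float(prob))
```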

Page 32: Introducing  z -scores & the normal distribution

This is the probability of any one sequence containing 2 correct answers, whatever the order of the answers.

Different combinations of C's (corrects) and I's (incorrects) can give us a total of X=2.

For example:

p(CCII) = .2 × .2 × .8 × .8 = .0256.

p(CICI) = .2 × .8 × .2 × .8 = .0256.

All sequences with X=2, for n=4 & p=.20, have the same probability, .0256.

Page 33: Introducing  z -scores & the normal distribution

Using the formula for combinations to calculate how many different sequences contain a particular Xo

The formula $p^X q^{n-X}$ gives you the probability of ANY sequence with a given X.

You still need to know how many different sequences or combinations of n elements will give you that same Xo.

For n = 4 & X = 2, there are 6 combinations of C's and I's which would give us an Xo = 2 correct answers.

1) CCII
2) CICI
3) IICC
4) ICIC
5) ICCI
6) CIIC
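
The count of 6 can be confirmed by listing the sequences or by using the standard combinations count, n! / (X!(n-X)!), which `math.comb` computes (Python 3.8 or later). The final step, multiplying the number of sequences by the probability of each one, is a sketch of the "exact probability" described earlier in the slides; it relies on the OR rule, since the sequences are mutually exclusive. Variable names are assumptions.

```python
from fractions import Fraction
from itertools import product
from math import comb

n, x = 4, 2            # 4 questions, 2 correct
p = Fraction(1, 5)     # p(correct on any 1 question)
q = 1 - p              # q(incorrect on any 1 question)

# List every sequence of C's and I's with exactly x correct answers.
sequences = ["".join(s) for s in product("CI", repeat=n) if s.count("C") == x]

# Count the same sequences with the combinations formula.
n_sequences = comb(n, x)

# Probability of any one such sequence, then of X = 2 overall.
p_each = p ** x * q ** (n - x)
p_exact = n_sequences * p_each

print(sequences)
print(n_sequences, p_each, p_exact)
```

Enumeration is fine for a 4-question test, but the number of sequences grows quickly with n, which is why the combinations formula is the practical route.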