Psyc 235: Introduction to Statistics


DON’T FORGET TO SIGN IN FOR CREDIT!

http://www.psych.uiuc.edu/~jrfinley/p235/

Independent vs. Dependent Events

• Independent Events: unrelated events that intersect at chance levels given relative probabilities of each event

• Dependent Events: events that are related in some way

• So... how to tell if two events are independent or dependent? Look at the INTERSECTION: P(AB)

• if P(AB) = P(A)*P(B) --> independent

• if P(AB) ≠ P(A)*P(B) --> dependent
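The intersection check can be tried on a tiny sample space. The sketch below assumes two flips of a fair coin with equally likely outcomes; the event names (first_heads, second_heads, at_least_one_head) are made up for illustration and are not from the lecture.

```python
from fractions import Fraction
from itertools import product

# Sample space: two fair coin flips, every outcome equally likely (an assumption).
outcomes = list(product("HT", repeat=2))

def prob(event):
    """Probability of an event (a true/false test on an outcome) by counting."""
    favorable = [o for o in outcomes if event(o)]
    return Fraction(len(favorable), len(outcomes))

def independent(a, b):
    """True when P(AB) equals P(A)*P(B)."""
    p_ab = prob(lambda o: a(o) and b(o))
    return p_ab == prob(a) * prob(b)

# Hypothetical events used only for this illustration.
first_heads = lambda o: o[0] == "H"
second_heads = lambda o: o[1] == "H"
at_least_one_head = lambda o: "H" in o

print(independent(first_heads, second_heads))       # separate flips
print(independent(first_heads, at_least_one_head))  # overlapping events
```

Exact fractions are used so the equality test is not affected by floating-point rounding.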

Random Variables

• Random Variable: variable that takes on a particular numerical value based on the outcome of a random experiment

• Random Experiment (aka Random Phenomenon): trial that will result in one of several possible outcomes
    can’t predict the outcome of any specific trial
    can predict the pattern in the LONG RUN

Random Variables

• Example:

• Random Experiment: flip a coin 3 times

• Random Variable: # of heads
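To make the example concrete, here is a minimal sketch that lists every outcome of the 3-flip experiment and tallies the random variable "# of heads". It assumes a fair coin, so all 8 outcomes are equally likely; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Random experiment: flip a coin 3 times (fair coin assumed).
outcomes = list(product("HT", repeat=3))

# Random variable: number of heads in each outcome.
heads_counts = Counter(outcome.count("H") for outcome in outcomes)

# Probability of each value of the random variable.
distribution = {k: Fraction(c, len(outcomes)) for k, c in sorted(heads_counts.items())}

for k, p in distribution.items():
    print(f"P(# of heads = {k}) = {p}")
```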

Random Variables

• Discrete vs. Continuous: finite (or countable) vs. uncountably infinite # of possible outcomes

• Scales of Measurement: Categorical/Nominal, Ordinal, Interval, Ratio

Data World vs. Theory World

• Theory World: Idealization of reality (idealization of what you might expect from a simple experiment)
    Theoretical probability distribution
    POPULATION
    parameter: a number that describes the population. fixed but usually unknown

• Data World: data that results from an actual simple experiment
    Frequency distribution
    SAMPLE
    statistic: a number that describes the sample (ex: mean, standard deviation, sum, ...)

So far...

• Graphing & summarizing sample distributions (DESCRIPTIVE)

• Counting Rules

• Probability

• Random Variables

• one more key concept is needed to start doing INFERENTIAL statistics: SAMPLING DISTRIBUTION

Binomial Situation

• Bernoulli Trial: a random experiment having exactly two possible outcomes, generically called “Success” and “Failure”
    probability of “Success” = p
    probability of “Failure” = q = (1-p)

Examples:

Coin toss (Heads / Tails): “Success” = Heads, p=.5

Robot Factory (Good Robot / Bad Robot): “Success” = Good Robot, p=.75

Binomial Situation

• Binomial Situation:
    n: # of Bernoulli trials
    trials are independent
    p (probability of “success”) remains constant across trials

• Binomial Random Variable: X = # of the n trials that are “successes”

Binomial Situation: collect data!

Population: Outcomes of all possible coin tosses (for a fair coin)

Success = Heads, p=.5

Let’s do 10 tosses: n=10 (sample size)

Bernoulli Trial: one coin toss

Binomial Random Variable: X = # of the 10 tosses that come up heads (aka Sample Statistic)

Sample: X = ....

Binomial Distribution, p=.5, n=10

[Figure: bar chart; x-axis: # of successes (0 to 10); y-axis: probability]

This is the SAMPLING DISTRIBUTION of X!

Sampling Distribution

• Sampling Distribution: Distribution of values that your sample statistic would take on, if you kept taking samples of the same size, from the same population, FOREVER (infinitely many times).

• Note: this is a THEORETICAL PROBABILITY DISTRIBUTION
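We can't sample forever, but a simulation can hint at what "FOREVER" would look like. The sketch below draws many samples of n=10 fair-coin tosses and records the relative frequency of each value of X. The function name, the number of samples, and the seed are arbitrary choices for illustration; the result only approximates the theoretical distribution.

```python
import random
from collections import Counter

def simulate_sampling_distribution(n, p, num_samples, seed=0):
    """Relative frequency of each X (# of successes) over many simulated samples."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    counts = Counter()
    for _ in range(num_samples):
        # One sample: n Bernoulli trials, each a success with probability p.
        x = sum(rng.random() < p for _ in range(n))
        counts[x] += 1
    return {k: counts[k] / num_samples for k in range(n + 1)}

freqs = simulate_sampling_distribution(n=10, p=0.5, num_samples=20000)
for k, f in freqs.items():
    print(f"X = {k:2d}: {f:.3f}")
```

With more simulated samples, the relative frequencies should settle closer to the theoretical probabilities.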







(aka Sample Statistic) Sample: X = 3, 5, 6, ....

[Figure: binomial distribution bar chart; x-axis: # of successes (0 to 10); y-axis: probability]

(aka Sample Statistic) Sample: X = 3


Binomial Formula

$P(X = k) = P(\text{exactly } k \text{ many successes})$

$$P(X = k) = \binom{n}{k} p^{k} (1-p)^{n-k}$$

where:
    X: Binomial Random Variable
    k: specific # of successes you could get
    $\binom{n}{k} = \frac{n!}{k!\,(n-k)!}$: combination called the Binomial Coefficient
    p: probability of success
    (1 − p): probability of failure
    n − k: specific # of failures
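The formula translates almost directly into code. This is a minimal sketch using Python's standard-library math.comb for the binomial coefficient; the function name binomial_pmf is just a label chosen here.

```python
import math

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p): exactly k successes in n independent trials."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

# Fair coin, 10 tosses: probability of each possible number of heads.
n, p = 10, 0.5
for k in range(n + 1):
    print(f"P(X = {k:2d}) = {binomial_pmf(k, n, p):.4f}")
```

The probabilities over k = 0..n should add up to 1, which is a quick sanity check on any implementation.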

Binomial Formula

[Figure: binomial distribution bar chart with X = 3 labeled; x-axis: # of successes (0 to 10); y-axis: probability]

$P(X=3) = \binom{10}{3}(.5)^{3}(.5)^{7} = \frac{120}{1024} \approx .117$

Remember this idea....

Hmm... what if we had gotten X=0? ... pretty unlikely outcome ... fair coin?

Population: Outcomes of all possible coin tosses (for a fair coin)

p=.5, n=10
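How unlikely is "pretty unlikely"? A quick sketch, assuming the coin really is fair (p=.5) and the 10 tosses are independent: getting X=0 means all 10 tosses are failures.

```python
# Assumed population: fair coin, 10 independent tosses.
p = 0.5
n = 10

# X = 0 means every one of the n tosses is a "failure" (tails).
p_zero_heads = (1 - p) ** n

print(f"P(X = 0) = {p_zero_heads:.6f}")       # about 1 in 1024
print(f"roughly 1 in {round(1 / p_zero_heads)}")
```

A result this rare under the fair-coin assumption is what would make us start to doubt that assumption.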

More on the Binomial Distribution

• X ~ B(n,p)

Expected Value and Variance for X ~ B(n,p)

$\mu_X = np$

$\sigma_X^2 = np(1-p)$

Standard Deviation: $\sigma_X = \sqrt{np(1-p)}$

these are the parameters for the sampling distribution of X

Ex: # heads in 5 tosses of a coin: X ~ B(5, 1/2)

                                 Expectation   Variance   Std. Dev.
# heads in 5 tosses of a coin    2.5           1.25       1.12
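The 5-toss example can be checked in a few lines. This sketch just plugs n and p into the three formulas; binomial_summary is a name made up for illustration.

```python
import math

def binomial_summary(n, p):
    """Mean, variance, and standard deviation of X ~ B(n, p)."""
    mean = n * p
    variance = n * p * (1 - p)
    std_dev = math.sqrt(variance)
    return mean, variance, std_dev

# Number of heads in 5 tosses of a fair coin: X ~ B(5, 1/2)
mean, variance, std_dev = binomial_summary(5, 0.5)
print(mean, variance, round(std_dev, 2))  # 2.5 1.25 1.12
```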

Let’s see some more Binomial Distributions

• What happens if we try doing a different # of trials (n) ?

• That is, try a different sample size...

Binomial Distribution, p=.5, n=5

[Figures: bar charts of binomial distributions at increasing n, starting with n=5; the later panels' x-axes run to 10, 20, 50, and 99; x-axis: # of successes; y-axis: probability]

Whoah.

• Anyone else notice those DISCRETE distributions starting to look smoother as sample size (n) increased?

• Let’s look at a few more binomial distributions, this time with a different probability of success...

Binomial Robot Factory

• 2 possible outcomes:
    Good Robot: 90%
    Bad Robot: 10%

You’d like to know about how many BAD robots you’re likely to get before placing an order...

p = .10 (... “success”)

n = 5, 10, 20, 50, 100


[Figures: bar charts of the binomial distribution for p = .10 at n = 5, 10, 20, 50, 100; x-axis: # of successes; y-axis: probability]
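For the robot order, one rough way to answer "about how many BAD robots" is to look at the expected count and the single most likely count for each order size. The sketch below assumes robots are good or bad independently with p = .10 for "bad"; the variable names are illustrative.

```python
import math

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

p_bad = 0.10  # "success" here means getting a BAD robot
most_likely_bad = {}

for n in (5, 10, 20, 50, 100):
    pmf = [binomial_pmf(k, n, p_bad) for k in range(n + 1)]
    # Most likely number of bad robots: the k with the largest probability.
    mode = max(range(n + 1), key=lambda k: pmf[k])
    most_likely_bad[n] = mode
    print(f"n = {n:3d}: expected bad = {n * p_bad:.1f}, "
          f"most likely = {mode} (probability {pmf[mode]:.3f})")
```

Notice that for small orders the most likely count sits at or near 0, which matches how lopsided the small-n charts look.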

Normal Approximation of the Binomial

If n is large, then

X ~ B(n,p) {Binomial Distribution}

can be approximated by a NORMAL DISTRIBUTION with parameters:

$\mu = np$

$\sigma = \sqrt{np(1-p)}$

[Figure: probability distribution plot; y-axis: probability]
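To see how close the approximation gets, the sketch below compares exact binomial probabilities with the height of the normal curve at the same points. The choice of n = 100 and p = .5 is an example picked here, and the helper names are illustrative.

```python
import math

def binomial_pmf(k, n, p):
    """Exact P(X = k) for X ~ B(n, p)."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def normal_pdf(x, mu, sigma):
    """Height of the normal curve with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

n, p = 100, 0.5                      # example values (an assumption)
mu = n * p                           # mu = np
sigma = math.sqrt(n * p * (1 - p))   # sigma = sqrt(np(1-p))

for k in (40, 45, 50, 55, 60):
    print(f"k = {k}: binomial {binomial_pmf(k, n, p):.4f}, "
          f"normal {normal_pdf(k, mu, sigma):.4f}")

# Largest disagreement over every possible k.
max_gap = max(abs(binomial_pmf(k, n, p) - normal_pdf(k, mu, sigma))
              for k in range(n + 1))
print(f"largest gap: {max_gap:.5f}")
```

Trying a small n or a p far from .5 in the same code shows where the approximation is weaker.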

Normal Distributions

• (aka “Bell Curve”)

• Probability Distributions of a Continuous Random Variable (smooth curve!)

• Class of distributions, all with the same overall shape

• Any specific Normal Distribution is characterized by two parameters:
    mean: μ
    standard deviation: σ

[Figure: normal curves with different means and with different standard deviations]

Standardizing

• “Standardizing” a distribution of values results in re-labeling & stretching/squishing the x-axis

• useful: gets rid of units, puts all distributions on same scale for comparison

• HOWTO: simply convert every value to a Z SCORE:

$z = \frac{x - \mu}{\sigma}$

Standardizing

• Z score: $z = \frac{x - \mu}{\sigma}$

• Conceptual meaning: how many standard deviations from the mean a given score is (in a given distribution)

• Any distribution can be standardized

• Especially useful for Normal Distributions...
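Standardizing is a one-line calculation. In the sketch below the score, mean, and standard deviation are made-up numbers chosen only to show the arithmetic.

```python
def z_score(x, mu, sigma):
    """How many standard deviations x lies from the mean mu."""
    return (x - mu) / sigma

# Hypothetical distribution: mean 100, standard deviation 15.
mu, sigma = 100, 15

for x in (70, 100, 130):
    print(f"x = {x}: z = {z_score(x, mu, sigma):+.1f}")
```

A negative z means the score is below the mean, and z = 0 means it sits exactly at the mean.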

Standard Normal Distribution

• has mean: μ=0

• has standard deviation: σ=1

• ANY Normal Distribution can be converted to the Standard Normal Distribution...

[Figure: Standard Normal Distribution]

Normal Distributions & Probability

• Probability = area under the curve
    intervals
    cumulative probability
    [draw on board]

• For the Standard Normal Distribution: These areas have already been calculated for us (by someone else)
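Besides a printed table, the same areas can be computed from the error function in Python's standard library. This is a sketch of one common way to get cumulative and interval probabilities for the Standard Normal Distribution; the function names are chosen here for illustration.

```python
import math

def standard_normal_cdf(z):
    """Cumulative probability P(Z <= z): area under the curve to the left of z."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def interval_probability(a, b):
    """P(a < Z < b): area under the curve between a and b."""
    return standard_normal_cdf(b) - standard_normal_cdf(a)

print(f"P(Z <= 0)     = {standard_normal_cdf(0):.4f}")
print(f"P(Z <= 1.96)  = {standard_normal_cdf(1.96):.4f}")
print(f"P(-1 < Z < 1) = {interval_probability(-1, 1):.4f}")
```

These values should agree with the entries in a standard normal table to the number of decimals the table shows.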

Standard Normal Distribution

So, if this were a Sampling Distribution, ...

Next Time

• More different types of distributions: Binomial, Normal, t, Chi-square, F

• And then... how will we use these to do inference?

• Remember: biggest new idea today was: SAMPLING DISTRIBUTION
