Biostat. 200 Review slides Week 1-3. Recap: Probability.
ira-campbell -
Biostat. 200 Review slides
Week 1-3
Recap: Probability
Basic Probability
1) Complement: P(A) = 1 - P(Ā)
2) Intersection: P(A ∩ B)
3) Union: P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
Basic Probability
4) Mutual exclusivity (but still dependent)
• Mutual exclusivity gives the additive rule: P(A ∩ B) = 0, so
P(A ∪ B) = P(A) + P(B) - P(A ∩ B) = P(A) + P(B)
Basic Probability
5) Conditional Probability
The probability that an event B will occur given that event A has occurred.
• Use the multiplicative rule
– P(A ∩ B) = P(A) P(B|A)
– P(B|A) = P(A ∩ B) / P(A), defined when P(A) > 0
• Applies to
– Relative risks
– Odds ratios
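A minimal Python sketch of the multiplicative rule and of a relative risk built from two conditional probabilities. The 2x2 counts are hypothetical, invented only for illustration; A is taken to mean "exposed" and B "disease".

```python
from fractions import Fraction

# Hypothetical 2x2 counts (not from the slides): A = exposed, B = disease.
exposed_disease = 30
exposed_no_disease = 70
unexposed_disease = 10
unexposed_no_disease = 90
total = (exposed_disease + exposed_no_disease
         + unexposed_disease + unexposed_no_disease)

# P(A) and P(A ∩ B) as exact fractions of the whole sample.
p_a = Fraction(exposed_disease + exposed_no_disease, total)
p_a_and_b = Fraction(exposed_disease, total)

# Conditional probability: P(B|A) = P(A ∩ B) / P(A).
p_b_given_a = p_a_and_b / p_a

# Same calculation among the unexposed: P(B|Ā) = P(Ā ∩ B) / P(Ā).
p_not_a = 1 - p_a
p_b_given_not_a = Fraction(unexposed_disease, total) / p_not_a

# Relative risk compares the two conditional probabilities.
relative_risk = p_b_given_a / p_b_given_not_a

print("P(B|A) =", p_b_given_a)
print("P(B|not A) =", p_b_given_not_a)
print("Relative risk =", relative_risk)
```

Exact fractions are used so that the multiplicative rule P(A ∩ B) = P(A) P(B|A) holds without rounding error.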
Basic Probability
6) Independence
Note that independence ≠ mutual exclusivity!
• If A and B are independent:
– P(B | A) = P(B | Ā) = P(B)
– P(A | B) = P(A | B̄) = P(A)
– P(A ∩ B) = P(A) P(B) (multiplicative rule)
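To make the difference between independence and mutual exclusivity concrete, here is a small Python sketch. The fair six-sided die and the events chosen are assumptions for illustration, not part of the slides.

```python
from fractions import Fraction

# Sample space: one roll of a fair six-sided die (illustrative assumption).
OUTCOMES = frozenset(range(1, 7))

def prob(event):
    """Probability of an event (a set of outcomes) when all outcomes are equally likely."""
    return Fraction(len(set(event) & OUTCOMES), len(OUTCOMES))

def is_independent(a, b):
    """Independence check: P(A ∩ B) = P(A) P(B)."""
    return prob(set(a) & set(b)) == prob(a) * prob(b)

def is_mutually_exclusive(a, b):
    """Mutual exclusivity check: P(A ∩ B) = 0."""
    return prob(set(a) & set(b)) == 0

even = {2, 4, 6}
at_most_two = {1, 2}
odd = {1, 3, 5}

# Independent but not mutually exclusive: P(even ∩ at_most_two) = 1/6 = (1/2)(1/3).
print(is_independent(even, at_most_two), is_mutually_exclusive(even, at_most_two))

# Mutually exclusive but dependent: P(even ∩ odd) = 0, yet P(even) P(odd) = 1/4.
print(is_independent(even, odd), is_mutually_exclusive(even, odd))
```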
Probability Distributions
Discrete distributions
Continuous distributions
Discrete Variables
• For discrete variables the probability distribution describes the probability of each possible value
Discrete distributions
• Bernoulli distribution
• If a variable can take on one of two values with a constant probability p, then it is a Bernoulli random variable
• Outcomes are either 0 or 1
• Theoretical building block to describe the distribution of more than one trial
Discrete Distributions
Binomial Distribution:
$$P(X = x) = \binom{n}{x} p^{x} (1-p)^{n-x}$$
With:
• p is the probability of “success” in each “trial”
• n is the number of “trials”
• n and p are the parameters of the binomial distribution (they summarize the distribution)
• x is the number of “successes” (outcomes)
• Note that Stata and Table A.1 use the symbol k for x
Binomial Distributions
• Assumes
– Fixed number of trials n, each with one of two mutually exclusive outcomes
– Independent outcomes of the n trials
– Constant probability of success p for each trial
Binomial Distribution
• What is the probability of exactly 2 cases of disease in a sample of n=5 where p=0.15?
• How to calculate the probability?
1) Use the binomial formula
$$P(X = x) = \binom{n}{x} p^{x} (1-p)^{n-x}$$
In Stata, display comb(n,k) gives the binomial coefficient:
. display comb(5,2)
10
– (10)(0.15)^2 (1-0.15)^(5-2)
– (10)(0.0225)(0.614) = 0.138
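As a cross-check outside Stata, the binomial formula can be sketched in Python using only the standard library. The function name `binomial_pmf` is a hypothetical helper, not something from the course materials.

```python
from math import comb

def binomial_pmf(n, k, p):
    """P(X = k) for X ~ Binomial(n, p): C(n, k) * p^k * (1-p)^(n-k)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Slide example: exactly 2 cases in a sample of n=5 with p=0.15.
coefficient = comb(5, 2)                  # same role as Stata's comb(5,2)
prob_two_cases = binomial_pmf(5, 2, 0.15)

print(coefficient)
print(round(prob_two_cases, 4))
```

The result should agree with the hand calculation (0.138) up to rounding.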
Binomial Distribution
• What is the probability of exactly 2 cases of disease in a sample of n=5 where p=0.15?
• How to calculate the probability?
2) Use Table A.1
– Table A.1 gives you P(X=k)
– Look up p=.15, n=5, k=2, answer=.1382
Binomial Distribution
• What is the probability of exactly 2 cases of disease in a sample of n=5 where p=0.15?
• How to calculate the probability?
3) Use Stata
• binomialp(n,k,p)
. display binomialp(5,2,.15)
.13817813
Binomial Distribution
What is the probability of 1 or more cases of disease in a sample of n=5 where p=0.15?
1) Use the binomial formula
• P(X≥1) = 1 - P(X=0)
$$P(X = 0) = \binom{5}{0} (0.15)^{0} (0.85)^{5} = 1 \times 0.85^{5}$$
. di comb(5,0)*0.15^0*0.85^5
.44370531
• So 1 - P(X=0) = 1 - 0.4437 = 0.5563
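The complement rule for "1 or more cases" can be checked with a short Python sketch. `binomial_pmf` and `binomial_tail` are hypothetical helper names; the tail function simply sums the formula from k to n, which is one straightforward way to get P(X ≥ k).

```python
from math import comb

def binomial_pmf(n, k, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binomial_tail(n, k, p):
    """P(X >= k), computed by summing P(X = j) for j = k, ..., n."""
    return sum(binomial_pmf(n, j, p) for j in range(k, n + 1))

# Slide example: n=5, p=0.15.
p_zero = binomial_pmf(5, 0, 0.15)      # P(X = 0) = 0.85^5
p_at_least_one = 1 - p_zero            # complement rule
p_direct = binomial_tail(5, 1, 0.15)   # direct sum over k = 1..5

print(round(p_zero, 4))
print(round(p_at_least_one, 4))
print(round(p_direct, 4))
```

Both routes should give the same probability; the complement is simply less work when only P(X=0) has to be excluded.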
Binomial Distribution
What is the probability of 1 or more cases of disease in a sample of n=5 where p=0.15?
2) Use Table A.1
• P(X≥1) = 1 - P(X=0)
• Looking up P(X=0) we get 0.4437
– So 1 - P(X=0) = 1 - 0.4437 = 0.5563
Binomial Distribution
• What is the probability of 1 or more cases of disease in a sample of n=5 where p=0.15?
• How to calculate the probability?
3) Use Stata
– binomialtail(n,k,p) gives P(X≥k)
. display binomialtail(5,1,.15)
.55629469
Binomial Distribution
• Binomial mean = np
• Binomial variance = np(1-p)
– Variance is largest when p=0.5, smaller when p is closer to 0 or 1
– The distribution is symmetric when p=0.5
– The distribution is a mirror image for 1-p (i.e. the distribution for p=0.05 is the mirror image of the one for p=0.95)
[Figure: Binomial distribution n=20 p=.05. x-axis: n successes (0 to 20); y-axis: binomial probability]
[Figure: Binomial distribution n=20 p=.95. x-axis: n successes (0 to 20); y-axis: binomial probability]
[Figure: Binomial distribution n=20 p=.5. x-axis: n successes (0 to 20); y-axis: binomial probability]
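The mean, variance, and mirror-image properties of the binomial distribution can be verified numerically for n=20 and p = .05, .5, .95, the three cases plotted. This is a rough Python sketch; the helper names are hypothetical, and the mean and variance are computed from the probabilities themselves rather than from np and np(1-p), so the comparison is a genuine check.

```python
from math import comb, isclose

def binomial_pmf(n, k, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def mean_and_variance_from_pmf(n, p):
    """Mean and variance computed as probability-weighted sums over k = 0..n."""
    probs = [binomial_pmf(n, k, p) for k in range(n + 1)]
    mean = sum(k * q for k, q in enumerate(probs))
    variance = sum((k - mean) ** 2 * q for k, q in enumerate(probs))
    return mean, variance

n = 20
results = {p: mean_and_variance_from_pmf(n, p) for p in (0.05, 0.5, 0.95)}
for p, (mean, variance) in results.items():
    # Compare the summed values with the closed forms np and np(1-p).
    print(p, round(mean, 4), n * p, round(variance, 4), round(n * p * (1 - p), 4))

# Mirror image: P(X = k) under p=.05 equals P(X = n-k) under p=.95.
is_mirror = all(
    isclose(binomial_pmf(n, k, 0.05), binomial_pmf(n, n - k, 0.95))
    for k in range(n + 1)
)
print(is_mirror)
```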
Continuous distributions
Normal distribution
Continuous Distribution
• For continuous variables, the distribution describes the probability of a range of values
Normal distribution
• The probability density function is
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}, \quad \text{where } -\infty < x < \infty$$
• μ is the mean and σ is the standard deviation of a normally distributed random variable
– They are the parameters of the normal distribution
– π is the constant that is approximately 3.14159
[Figure: Several normal distributions (Mean 0, SD 1; Mean 0, SD 3; Mean 4, SD 1). x-axis: x (-10 to 10)]
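A short Python sketch of the normal density, evaluated for the three (mean, SD) pairs named in the figure. `normal_pdf` is a hypothetical helper written from the formula; Python's `statistics.NormalDist` is used only as an independent check.

```python
from math import exp, pi, sqrt
from statistics import NormalDist

def normal_pdf(x, mu, sigma):
    """Normal density: exp(-0.5 * ((x - mu) / sigma)^2) / (sigma * sqrt(2 * pi))."""
    return exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * sqrt(2 * pi))

# (mean, SD) pairs from the figure legend.
curves = [(0, 1), (0, 3), (4, 1)]

for mu, sigma in curves:
    # The density peaks at x = mu, with height 1 / (sigma * sqrt(2 * pi)).
    peak = normal_pdf(mu, mu, sigma)
    library_peak = NormalDist(mu, sigma).pdf(mu)
    print(mu, sigma, round(peak, 4), round(library_peak, 4))
```

A larger SD lowers and widens the curve, while changing the mean only shifts it along the x-axis.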
The Standard Normal Distribution
• μ and σ can take on an infinite number of values
• The standard normal is the curve with
– μ = 0
– σ = 1 (and variance σ² = 1)
• Denoted N(0,1)
$$f(x) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}x^{2}}, \quad \text{where } -\infty < x < \infty$$
The Standard Normal Distribution
• If X is a normally distributed random variable with mean μ and standard deviation σ then
Z= (X – μ)/σ
is a standard normal random variable
• That is, a normally distributed random variable with its mean subtracted off, divided by its standard deviation, is a normal random variable with mean=0 and standard deviation=1
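Standardization can be illustrated with a brief Python sketch. The values of μ, σ, and x below are hypothetical, chosen only to make the arithmetic easy; the point is that a probability for X matches the probability for the corresponding Z.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Standardize x: Z = (X - mu) / sigma."""
    return (x - mu) / sigma

# Hypothetical values for illustration only (not from the slides).
mu, sigma, x = 120.0, 10.0, 135.0

z = z_score(x, mu, sigma)

# P(X <= x) under N(mu, sigma) and P(Z <= z) under N(0, 1).
p_original = NormalDist(mu, sigma).cdf(x)
p_standard = NormalDist(0, 1).cdf(z)

print(z)
print(round(p_original, 4), round(p_standard, 4))
```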
For Z ~ N(0,1), P(Z≥0) = 0.50
[Figure: Standard normal distribution, Z from -5 to 5]
Zero is the mean and median for a standard normal distribution
For Z ~ N(0,1), P(Z≥1.96) = 0.025
[Figure: Standard normal distribution, Z from -5 to 5]
The probability of observing a value of 1.96 or greater is 0.025
P(µ-2σ ≤ Z ≤ µ+2σ)
Remember µ=0 and σ=1, so this is
P(-2 < Z < 2) = 0.954
Therefore, approximately 95.4% of the area of the standard normal is within 2 SD of the mean.
[Figure: Standard normal distribution, Z from -5 to 5, with area labels 0.954, 0.023, and 0.023]
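The area within k standard deviations of the mean can also be written with the error function, since P(-k < Z < k) = erf(k/√2) for a standard normal Z. A minimal Python sketch; `area_within` is a hypothetical helper name.

```python
from math import erf, sqrt

def area_within(k):
    """P(-k < Z < k) for Z ~ N(0,1), using the identity erf(k / sqrt(2))."""
    return erf(k / sqrt(2))

central = area_within(2)          # area within 2 SD of the mean
each_tail = (1 - central) / 2     # by symmetry, the remainder splits evenly

print(round(central, 3))
print(round(each_tail, 3))
print(round(area_within(1.96), 3))
```

Exactly 95% of the area corresponds to about 1.96 SD rather than 2 SD, which is why 1.96 appears on the other slides.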
• Stata will calculate standard normal probabilities for you
• In Stata, the left portion of the curve, P(Z<z), is calculated for you.
. display normal(1.96)
.9750021
• If you want the right-hand portion of the curve, P(Z>z), subtract your answer from 1
. display 1-normal(1.96)
.0249979
• If you want the middle:
. display normal(1.96) - normal(-1.96)
.95000421
[Figure: Standard normal distribution, Z from -5 to 5, Prob Z<1.96 highlighted]
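For readers without Stata at hand, the same three areas can be reproduced in Python. This sketch assumes `statistics.NormalDist().cdf` plays the role of Stata's normal(); it is a cross-check, not part of the course workflow.

```python
from statistics import NormalDist

# Standard normal distribution, N(0, 1).
Z = NormalDist(mu=0, sigma=1)

left = Z.cdf(1.96)                   # P(Z < 1.96), like normal(1.96)
right = 1 - Z.cdf(1.96)              # P(Z > 1.96), like 1-normal(1.96)
middle = Z.cdf(1.96) - Z.cdf(-1.96)  # P(-1.96 < Z < 1.96)

print(round(left, 7))
print(round(right, 7))
print(round(middle, 8))
```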
Finding z values for probabilities in Stata
• To get the z value for P(Z<z) = p use
display invnormal(p)
• To get the z value for P(Z>z) = p use
display invnormal(1-p)
E.g. what is the z value for P(Z≤z) = 0.025?
. display invnormal(0.025)
-1.959964
E.g. what is the z value for P(Z>z) = 0.025?
. display invnormal(1-.025)
1.959964
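The inverse lookups can be sketched in Python as well, assuming `statistics.NormalDist().inv_cdf` plays the role of Stata's invnormal(). The two wrapper names are hypothetical.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal, mean 0 and SD 1

def z_for_lower_tail(p):
    """z such that P(Z < z) = p, like invnormal(p)."""
    return Z.inv_cdf(p)

def z_for_upper_tail(p):
    """z such that P(Z > z) = p, like invnormal(1-p)."""
    return Z.inv_cdf(1 - p)

lower = z_for_lower_tail(0.025)
upper = z_for_upper_tail(0.025)

print(round(lower, 6))
print(round(upper, 6))
```

By symmetry of the standard normal, the two answers differ only in sign.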
Finding z values for probabilities using Table A.3
• To get the z value for P(Z>z) = p
– find p in the table and read the corresponding z
• To get the z value for P(Z<z) = p
– find p and use -1 × the corresponding z
E.g. what is the z value for P(Z≤z) = 0.025?
For p=0.025 the table value is 1.96, so the answer is -1.96
E.g. what is the z value for P(Z>z) = 0.025?
For p=0.025 the table value is 1.96, so the answer is 1.96