Probability distributions, sampling distributions and central limit theorem

Probability distributions
Dr. S. A. Rizwan, M.D.
Public Health Specialist
SBCM, Joint Program – Riyadh
Ministry of Health, Kingdom of Saudi Arabia


Learning objectives

Demystifying statistics! – Lecture 2, SBCM, Joint Program – Riyadh

• Define probability distributions
• Describe the common types of probability distributions
• Describe sampling distribution
• Understand the central limit theorem

Probability distribution

• Probability distribution is a mathematical function that can be thought of as providing the probability of occurrence of different possible outcomes in an experiment.

• The distribution of a statistical data set (or a population) is a listing or function showing all the possible values (or intervals) of the data and how often they occur.
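
As a small illustration of the second definition, the sketch below lists the values in a data set and how often each occurs, then converts the counts to relative frequencies. The blood-group values are hypothetical and are not taken from the lecture.

```python
from collections import Counter

# Hypothetical data set: blood groups of 10 patients (illustrative only).
blood_groups = ["A", "O", "B", "O", "A", "O", "AB", "O", "A", "B"]

counts = Counter(blood_groups)   # how often each value occurs
n = len(blood_groups)

# Relative frequencies: the empirical distribution of this data set.
distribution = {group: count / n for group, count in counts.items()}

for group, prob in sorted(distribution.items()):
    print(f"{group}: {prob:.1f}")
```

The relative frequencies add up to 1, as the probabilities in any distribution must.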


Section 1: Binomial distribution

Binomial distribution

• The following conditions need to be satisfied for a binomial experiment/distribution:
  • There is a fixed number of n trials carried out.
  • The outcome of a given trial is either a “success” or a “failure”.
  • The probability of success (p) remains constant from trial to trial.
  • The trials are independent: the outcome of a trial is not affected by the outcome of any other trial.
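
For reference, when these four conditions hold, the probability of exactly k successes in n trials is given by the standard binomial formula. It is not shown on the slide and is added here only as a reminder; k is a symbol introduced for this illustration.

```latex
P(X = k) = \binom{n}{k}\, p^{k} (1 - p)^{\,n-k}, \qquad k = 0, 1, \dots, n
```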

Binomial distribution – example

• Suppose we have n = 40 patients who will be receiving an experimental therapy which is believed to be better than current treatments, which historically have had a 5-year survival rate of 20%, i.e. the probability of 5-year survival is p = 0.20.

• Thus the number of patients out of 40 in our study surviving at least 5 years has a binomial distribution, i.e. X ~ BIN(40, 0.20).

Binomial distribution – example

• Suppose that using the new treatment we find that 16 out of the 40 patients survive at least 5 years past diagnosis.

• Q: Does this result suggest that the new therapy has a better 5-year survival rate than the current one, i.e. is the probability that a patient survives at least 5 years greater than 0.20 (a 20% chance) when treated using the new therapy?

Binomial distribution – example

• We essentially ask ourselves the following:

• If we assume that the new therapy is no better than the current one, what is the probability we would see these results by chance variation alone?

• More specifically, what is the probability of seeing 16 or more successes out of 40 if the success rate of the new therapy is also 0.20 (20%)?

Binomial distribution – example

• This is a binomial experiment situation.

• There are n = 40 patients and we are counting the number of patients that survive 5 or more years. The individual patient outcomes are independent and IF WE ASSUME the new method is NOT better, then the probability of success is p = 0.20 (20%) for all patients.

• So X = # of “successes” in the clinical trial is binomial with n = 40 and p = 0.20, i.e. X ~ BIN(40, 0.20).

Binomial distribution – example

• X ~ BIN(40, 0.20): find the probability that 16 or more patients survive at least 5 years.

[Binomial calculator screenshot: enter n = sample size, x = observed # of “successes”, p = probability of “success”. Probabilities are computed automatically for greater than or equal to and less than or equal to x.]

Binomial distribution – example

• The chance that we would see 16 or more patients out of 40 surviving at least 5 years if the new method has the same chance of success as the current methods (20%) is VERY SMALL, 0.0029.
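
The 0.0029 figure can be checked without a calculator page. The sketch below is one way to do it, using only the Python standard library; the function names are illustrative and not part of the lecture. It sums the binomial probabilities for 16, 17, ..., 40 survivors under the assumption p = 0.20.

```python
from math import comb

def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ BIN(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def binomial_upper_tail(x: int, n: int, p: float) -> float:
    """P(X >= x): add up the probabilities of x, x + 1, ..., n successes."""
    return sum(binomial_pmf(k, n, p) for k in range(x, n + 1))

# Values from the example: n = 40 patients, p = 0.20, 16 observed survivors.
p_value = binomial_upper_tail(16, 40, 0.20)
print(f"P(X >= 16) = {p_value:.4f}")  # about 0.0029, as on the slide
```

Because this probability is so small, seeing 16 survivors would be very unusual if the new therapy were truly no better than the current treatments.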

Section 2: Normal distribution

Normal distribution

• The normal distribution is a descriptive model that describes real world situations.

• It is defined as a continuous frequency distribution of infinite range (can take any value).

• This is the most important probability distribution in statistics and an important tool in the analysis of epidemiological data.

Normal distribution - properties

• The normal distribution is defined by two parameters, μ and σ.
• You can draw a normal distribution for any μ and σ combination.
• There is one normal distribution, Z, that is special.
• It has μ = 0 and σ = 1.
• Also called the standard normal distribution.
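
The link between any normal distribution and the standard normal Z is the usual standardisation step. It is not written on the slide and is added here as a reminder: a value x is expressed as the number of standard deviations it lies from the mean.

```latex
z = \frac{x - \mu}{\sigma}, \qquad X \sim N(\mu, \sigma^{2}) \;\Longrightarrow\; Z = \frac{X - \mu}{\sigma} \sim N(0, 1)
```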

Normal distribution - properties

• Mean = Median = Mode
• Spread determined by SD
• Bell-shaped
• Symmetry about the center
• 50% of values less than the mean and 50% greater than the mean
• It approaches the horizontal axis asymptotically: −∞ < X < +∞
• Area under the curve is 1


Normal distribution – example

• Assume the heart rate (H.R.) in normal healthy individuals is normally distributed with Mean = 70 and Standard Deviation = 10 beats/min.

• Q1. What area under the curve is above 80 beats/min?
• Q2. What area of the curve is above 90 beats/min?
• Q3. What area of the curve is between 50 and 90 beats/min?
• Q4. What area of the curve is above 100 beats/min?
• Q5. What area of the curve is below 40 beats/min or above 100 beats/min?
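
The five questions above can be answered numerically. The sketch below is one possible approach using `statistics.NormalDist` from the Python standard library (Python 3.8+); the helper names `area_above` and `area_between` are made up for this illustration.

```python
from statistics import NormalDist

# Model from the example: heart rate ~ Normal(mean = 70, SD = 10) beats/min.
heart_rate = NormalDist(mu=70, sigma=10)

def area_above(x: float) -> float:
    """Area under the curve to the right of x."""
    return 1 - heart_rate.cdf(x)

def area_between(low: float, high: float) -> float:
    """Area under the curve between low and high."""
    return heart_rate.cdf(high) - heart_rate.cdf(low)

q1 = area_above(80)                         # above mean + 1 SD
q2 = area_above(90)                         # above mean + 2 SD
q3 = area_between(50, 90)                   # within mean +/- 2 SD
q4 = area_above(100)                        # above mean + 3 SD
q5 = heart_rate.cdf(40) + area_above(100)   # outside mean +/- 3 SD

for label, area in [("Q1", q1), ("Q2", q2), ("Q3", q3), ("Q4", q4), ("Q5", q5)]:
    print(f"{label}: {area:.4f}")
```

The answers are roughly 0.159, 0.023, 0.954, 0.0013 and 0.0027, in line with the familiar rule that about 68%, 95% and 99.7% of values lie within 1, 2 and 3 standard deviations of the mean.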


Section 3: Sampling distribution

Sampling distribution

• Sampling distribution of the mean – A theoretical probability distribution of sample means that would be obtained by drawing from the population all possible samples of the same size.
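
For a very small population, "all possible samples of the same size" can actually be listed. The sketch below does this for a hypothetical population of five values (invented for illustration), drawing samples of size 2 without replacement and collecting their means.

```python
from itertools import combinations

# Hypothetical tiny population (illustrative values only).
population = [2, 4, 6, 8, 10]
sample_size = 2

# Every possible sample of the chosen size, drawn without replacement.
samples = list(combinations(population, sample_size))

# The sampling distribution of the mean: one mean per possible sample.
sample_means = [sum(s) / sample_size for s in samples]

population_mean = sum(population) / len(population)
mean_of_sample_means = sum(sample_means) / len(sample_means)

print(len(samples), "possible samples")
print(sorted(sample_means))
print(mean_of_sample_means, population_mean)
```

The mean of all the sample means equals the population mean, which is why the sample mean is a sensible estimate of it.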


Central Limit Theorem

• No matter what we are measuring, the distribution of a sample mean (or percentage) across all possible samples we could take approximates a normal distribution, as long as the number of cases in each sample is about 30 or larger.

• If we repeatedly drew samples from a population and calculated the mean of a variable or a percentage, those sample means or percentages would be approximately normally distributed.

• It enables us to calculate the standard error from a single sample.
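
A short simulation can make the theorem concrete. The sketch below is an illustration under stated assumptions, not part of the lecture: the population is taken to be exponential with mean 1 (strongly right-skewed), and the sample size of 30, the 5000 repetitions and the seed are arbitrary choices.

```python
import random
from math import sqrt
from statistics import fmean, stdev

random.seed(42)       # fixed seed so the run is reproducible

SAMPLE_SIZE = 30      # the "about 30 or larger" rule of thumb
N_SAMPLES = 5000      # number of repeated samples (arbitrary choice)

def draw_sample(n):
    # Assumed population: exponential with mean 1 and SD 1 (right-skewed).
    return [random.expovariate(1.0) for _ in range(n)]

# Repeatedly draw samples and keep each sample's mean.
sample_means = [fmean(draw_sample(SAMPLE_SIZE)) for _ in range(N_SAMPLES)]

center = fmean(sample_means)
spread = stdev(sample_means)   # empirical standard error of the mean

# Share of sample means within 1.96 SDs of their center (about 95% if normal).
share = sum(abs(m - center) <= 1.96 * spread for m in sample_means) / N_SAMPLES

# Standard error estimated from ONE sample: s / sqrt(n).
one_sample = draw_sample(SAMPLE_SIZE)
se_single = stdev(one_sample) / sqrt(SAMPLE_SIZE)

print(f"mean of sample means: {center:.3f} (population mean is 1)")
print(f"SD of sample means:   {spread:.3f} (theory: {1 / sqrt(SAMPLE_SIZE):.3f})")
print(f"share within 1.96 SD: {share:.3f}")
print(f"SE from one sample:   {se_single:.3f}")
```

Even though individual values come from a skewed population, the sample means cluster symmetrically around the population mean, with a spread close to σ/√n. The last line shows the practical payoff: the same spread can be estimated from a single sample as s/√n.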

Section 4: Percentiles

Percentiles

• Value below which a percentage of data falls.
• For example: if 80% of people are shorter than you, that means you are at the 80th percentile. If your height is 1.85 m then "1.85 m" is the 80th percentile height in that group.
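
The height example can be reproduced with a few lines of code. The sketch below uses a hypothetical group of ten heights (invented so that 1.85 m lands on the 80th percentile) and a helper, `percentile_rank`, written for this illustration.

```python
def percentile_rank(values, x):
    """Percentage of values that fall strictly below x."""
    below = sum(1 for v in values if v < x)
    return 100 * below / len(values)

# Hypothetical heights in metres (illustrative only, not from the lecture).
heights = [1.60, 1.65, 1.68, 1.70, 1.72, 1.75, 1.78, 1.80, 1.85, 1.90]

rank = percentile_rank(heights, 1.85)
print(f"1.85 m is at the {rank:.0f}th percentile of this group")
```

Eight of the ten people are shorter than 1.85 m, so that height sits at the 80th percentile of the group.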


Percentiles

• Quantiles are cut points dividing the range of a probability distribution into contiguous intervals with equal probabilities

• Median, tertiles, quartiles, quintiles, sextiles, septiles, octiles, deciles, percentiles or centiles

• Inter-quartile range
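
Quartiles, deciles and the inter-quartile range can be computed with `statistics.quantiles` from the Python standard library (Python 3.8+). The sketch below uses made-up data, the integers 1 to 11, and the function's default "exclusive" method; other software may use slightly different interpolation rules and give slightly different cut points.

```python
from statistics import median, quantiles

# Hypothetical data: the integers 1 to 11 (illustrative only).
data = list(range(1, 12))

# n=4 asks for quartiles: three cut points that split the data into four parts.
q1, q2, q3 = quantiles(data, n=4)
iqr = q3 - q1                      # inter-quartile range

# n=10 asks for deciles: nine cut points.
deciles = quantiles(data, n=10)

print(f"Q1 = {q1}, median = {q2}, Q3 = {q3}, IQR = {iqr}")
print(f"{len(deciles)} decile cut points")
```

The second quartile is the median, and the inter-quartile range covers the middle half of the data.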

Take home messages

• Understanding the distributions lets us understand the inferential statistics better