Sequential Multiple Decision Procedures (SMDP) For PGRN mini-GAW Data
Q. Zhang and M.A. Province
Division of Biostatistics, Washington University School of Medicine
PGRN AWSII, Chicago, April 28, 2005
Brief History of SMDP
Sequential Probability Ratio Test (SPRT), Wald 1947
SMDP, Bechhofer, Kiefer, Sobel, 1968
SMDP Haseman-Elston Method (ASP), Province 2000
Idea 1: Sequential
Start from a small sample size n0
Increase sample size one by one: n0+1, n0+2, …, n0+i, …
Use sequential information
Do analysis at each time
Stop when conclusion is reached
Plan experiments in next stage and save resources
Use residual/extra data to do validation
…
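The sequential idea above can be illustrated with Wald's SPRT, the oldest method on the history slide. This is a minimal sketch for 0/1 observations, not the SMDP itself: the function name `sprt_bernoulli` and the values of `p0`, `p1`, `alpha`, and `beta` are assumptions chosen only for illustration.

```python
import math

def sprt_bernoulli(observations, p0=0.5, p1=0.7, alpha=0.05, beta=0.05):
    """Wald's SPRT for H0: p = p0 against H1: p = p1 on 0/1 observations.

    All parameter values are illustrative assumptions, not taken from the slides.
    """
    upper = math.log((1 - beta) / alpha)   # at or above this bound: accept H1
    lower = math.log(beta / (1 - alpha))   # at or below this bound: accept H0
    llr = 0.0                              # running log-likelihood ratio
    n = 0
    for n, x in enumerate(observations, start=1):
        # Add one observation at a time and update the evidence.
        if x == 1:
            llr += math.log(p1 / p0)
        else:
            llr += math.log((1 - p1) / (1 - p0))
        # Stop as soon as a conclusion is reached.
        if llr >= upper:
            return "accept H1", n
        if llr <= lower:
            return "accept H0", n
    return "undecided", n
```

Calling `sprt_bernoulli(data)` returns the decision together with the number of observations used; any observations left over are untouched and could serve for validation, as the slide suggests.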
Idea 2: Multiple Decision
Multiple Decision: SNP1, SNP2, SNP3, SNP4, SNP5, SNP6, …, SNPn considered together
Independent Test: SNP1 (Test 1), SNP2 (Test 2), SNP3 (Test 3), SNP4 (Test 4), SNP5 (Test 5), SNP6 (Test 6), …, SNPn (Test n)
Tradeoff between false positive rate and false negative rate.
SMDP divides populations into two groups and guarantees that there is a real difference (D) between the two groups with probability P*.
Model: Regression Model
SNP Genotypes: 11 => 0, 12 => 1, 22 => 2
Phenotypes: NPUBS, NDRIVEL, PCTDRIVEL, RIVALSIDE, GOTGRANTS
Y = α + βX + ε
SNP x Drug/Treatment: genotype code (11 => 0, 12 => 1, 22 => 2) X (ALCOHOL or IVY)
SNP Main effects, SNP-Drug interactions
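The coding and model on this slide can be sketched with ordinary least squares. This is a hedged illustration, not the authors' code: the function name `fit_snp_model` is hypothetical, and including a treatment main effect alongside the SNP x treatment product is an assumption the slide does not confirm.

```python
import numpy as np

# Genotype coding from the slide: 11 => 0, 12 => 1, 22 => 2
GENOTYPE_CODE = {"11": 0, "12": 1, "22": 2}

def fit_snp_model(genotypes, phenotype, treatment=None):
    """Least-squares fit of Y = alpha + beta * X (+ treatment terms).

    Returns [alpha, beta] for the main-effect model, or
    [alpha, beta, treatment effect, interaction effect] when a 0/1
    treatment indicator (e.g. ALCOHOL or IVY) is supplied.
    """
    x = np.array([GENOTYPE_CODE[g] for g in genotypes], dtype=float)
    y = np.asarray(phenotype, dtype=float)
    columns = [np.ones_like(x), x]
    if treatment is not None:
        t = np.asarray(treatment, dtype=float)
        # Assumption: treatment main effect plus SNP x treatment product.
        columns += [t, x * t]
    design = np.column_stack(columns)
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef
```

For example, `fit_snp_model(genotypes, pctdrivel, treatment=alcohol)` would estimate the SNP main effect and the SNP-ALCOHOL interaction for one SNP, where `pctdrivel` and `alcohol` are placeholder arrays for that phenotype and treatment.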
Methods
Fixed sample regression (entire data)
Regular P value
Bonferroni corrected P value
FDR corrected P value
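The two corrections listed for the fixed sample analysis can be sketched as follows. The slide does not say which FDR procedure was used; the Benjamini-Hochberg step-up adjustment is assumed here, and both function names are illustrative.

```python
import numpy as np

def bonferroni(pvalues):
    """Bonferroni adjustment: multiply each p-value by the number of tests, cap at 1."""
    p = np.asarray(pvalues, dtype=float)
    return np.minimum(p * p.size, 1.0)

def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values (assumed FDR procedure)."""
    p = np.asarray(pvalues, dtype=float)
    m = p.size
    order = np.argsort(p)
    # Scale the k-th smallest p-value by m / k.
    scaled = p[order] * m / np.arange(1, m + 1)
    # Enforce monotonicity, working down from the largest p-value.
    adjusted_sorted = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(m)
    adjusted[order] = np.minimum(adjusted_sorted, 1.0)
    return adjusted
```

With 188 SNPs, Bonferroni multiplies every raw p-value by 188, while the Benjamini-Hochberg adjustment depends on each p-value's rank, which is why the two can flag different sets of SNPs.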
SMDP (sequential)
Sample size: n0, n0+1, n0+2, …, n0+h, …
Q[h]1, Q[h]2, Q[h]3, Q[h]4, Q[h]5, …: sequential sum of squares of regression errors
Stopping rule
P: probability that a real difference (D) between two groups exists
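One reading of the "sequential sum of squares of regression errors" is the residual sum of squares of the regression, refit each time a subject is added. The sketch below computes that quantity for a single SNP under this assumption; the function name `sequential_sse` is hypothetical, and the stopping rule itself is not reproduced because the slides do not give its formula.

```python
import numpy as np

def sequential_sse(x, y, n0):
    """Residual sum of squares of y = a + b*x, refit on the first
    n0, n0+1, ..., N subjects (one value per sample size)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    sse = []
    for n in range(n0, len(y) + 1):
        # Refit the regression using only the first n subjects.
        design = np.column_stack([np.ones(n), x[:n]])
        coef, *_ = np.linalg.lstsq(design, y[:n], rcond=None)
        residuals = y[:n] - design @ coef
        sse.append(float(residuals @ residuals))
    return sse
```

Running this once per SNP gives one sequence per SNP, which a stopping rule could then compare across SNPs at each step.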
Region 3 (86 SNPs), GOTGRANTS
Region 2 (28 SNPs, IVY-SNP), RIVALSIDE
Region 1 (74 SNPs, ALCOHOL-SNP), PCTDRIVEL
SMDP Summary
Test, identify all signals simultaneously: no multiple comparisons
Tight control of statistical errors (Type I, II)
Efficient, sequential information: "minimal" N to find significant signals
Reliable, save rest of N for validation
Thanks!
Fixed sample regression: Region 3 (86 SNPs), GOTGRANTS
Fixed sample regression: Region 2 (28 SNPs, IVY-SNP), RIVALSIDE
Fixed sample regression: Region 1 (74 SNPs, ALCOHOL-SNP), PCTDRIVEL
Fixed sample regression of 5 phenotypes on genotypes of 188 SNPs