1
A Rorschach Test
S. Stanley Young, NISS
Jessie Q. Xia, NISS
Banff, Canada
Dec 15, 2011
Variable Importance in Environmental Studies
Current Challenges in Statistical Learning
1. Statistical methods
2. Data quality
3. Invalid claims
a. Multiple testing
b. Multiple modeling
c. Bias
Great Smog of '52 or Big Smoke
12,000 estimated deaths
Pope et al. 2009
6
Studied Variables
Life expectancy (life-table methods)
Per capita income (in thousands of $)
Lung Cancer (Age standardized death rate)
COPD (Age standardized death rate)
High-school graduates (proportion of population)
PM2.5 (μg/m3)
Black population (proportion of population)
Population (in hundreds of thousands)
5-Year in-migration (proportion of population)
Hispanic population (proportion of population)
Urban residence (proportion of population)
7
First Analysis, Regression
Variable       SS First   SS Last
Income             31.8      15.6
Lung Cancer        22.4       5.1
COPD               21.5       4.1
High School        15.9       0.0
Population          9.4       5.2
PM2.5               9.4       5.8
Hispanic            4.3       2.4
Black               3.1       1.7
Urban               1.4       0.8
Migration           0.0       0.8
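A minimal sketch of how a first/last comparison like this can be computed, assuming "SS First" means the sum of squares a variable explains when it enters the regression alone and "SS Last" means what it adds after all other variables are already in the model (the slide does not define the terms). The data are synthetic and the names x1, x2, and y are hypothetical; nothing here reproduces the study's numbers. It illustrates why a variable correlated with the others, like High School above, can look important first and negligible last.

```python
import random

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def rss(y, predictors):
    """Residual sum of squares of an OLS fit of y on an intercept plus predictors."""
    rows = [[1.0] + [p[i] for p in predictors] for i in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[a] * r[b] for r in rows) for b in range(k)] for a in range(k)]
    xty = [sum(r[a] * yi for r, yi in zip(rows, y)) for a in range(k)]
    beta = solve(xtx, xty)
    return sum((yi - sum(bj * rj for bj, rj in zip(beta, r))) ** 2
               for r, yi in zip(rows, y))

def ss_first_last(y, predictors, j):
    """Return (SS first, SS last) for predictor j under the assumed definitions."""
    others = [p for i, p in enumerate(predictors) if i != j]
    ss_first = rss(y, []) - rss(y, [predictors[j]])   # entered alone
    ss_last = rss(y, others) - rss(y, predictors)     # entered after the rest
    return ss_first, ss_last

# Synthetic data: x2 is correlated with x1 but has no effect of its own on y.
random.seed(0)
n = 200
x1 = [random.gauss(0, 1) for _ in range(n)]
x2 = [0.8 * v + 0.6 * random.gauss(0, 1) for v in x1]
y = [v + random.gauss(0, 1) for v in x1]

first_x2, last_x2 = ss_first_last(y, [x1, x2], 1)
print(f"x2: SS first = {first_x2:.1f}, SS last = {last_x2:.1f}")
```

Here x2 borrows its apparent importance from x1 when entered first and loses most of it when entered last.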
8
Recursive Partitioning
9
Variable Importance
Variable       Regression       RP
Income             0.3390   0.2108
COPD               0.1621   0.1199
Lung Cancer        0.1768   0.1467
PM2.5              0.0732   0.1302
High School        0.0997   0.1066
%Black             0.0537   0.0319
Pop Density        0.0418   0.0793
%Hispanic          0.0177   0.0136
Migration          0.0228   0.0202
Urban              0.0133   0.0105
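The two importance columns are easier to compare as ranks. The sketch below copies the values from the table and ranks each column from largest to smallest; the dictionary layout and function name are illustrative choices, not part of the original analysis.

```python
# Importance values copied from the slide: (regression, recursive partitioning).
importance = {
    "Income": (0.3390, 0.2108),
    "COPD": (0.1621, 0.1199),
    "Lung Cancer": (0.1768, 0.1467),
    "PM2.5": (0.0732, 0.1302),
    "High School": (0.0997, 0.1066),
    "%Black": (0.0537, 0.0319),
    "Pop Density": (0.0418, 0.0793),
    "%Hispanic": (0.0177, 0.0136),
    "Migration": (0.0228, 0.0202),
    "Urban": (0.0133, 0.0105),
}

def ranks(column):
    """Rank variables by one importance column, 1 = most important."""
    ordered = sorted(importance, key=lambda v: importance[v][column], reverse=True)
    return {name: position for position, name in enumerate(ordered, start=1)}

regression_rank = ranks(0)
rp_rank = ranks(1)

for name in importance:
    print(f"{name:12s} regression #{regression_rank[name]:<2d} RP #{rp_rank[name]}")
```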
10
East versus West
Krewski et al. 2000 Health Effects In.
Enstrom 2005 Inhalation Toxicology
Bell et al. 2007 Env Health Pers
Smith et al. 2009 Inhalation Toxicology
Jerrett 2010 CARB workshop
11
Fine particles and Mortality
Pope co-author, 2000.
12
Ozone and Mortality
13
Variable Importance
Regression Recursive Partitioning
14
Longevity versus PM2.5
East: gray; West: red
15
Longevity versus Income
16
Hans Rosling's 200 Countries, 200 Years
http://www.youtube.com/watch?v=jbkSRLYSojo
17
Summary to this point
Income is very important.
PM2.5 is 4th or 5th in importance.
PM2.5 is not important in the West.
Pope knew or should have known about the East/West heterogeneity.
18
E1: Breakfast cereal and boy babies
19
P-value plot
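A minimal sketch of how the points of such a plot can be built, assuming the usual construction: sort the p-values and plot each against its rank. If no effect is real, the p-values are roughly uniform and the points fall near a straight line from 0 to 1; a cluster of small p-values bends away from that line. The p-values below are made up for illustration and are not from any study cited here.

```python
def p_value_plot_points(p_values):
    """Return (rank, sorted p-value, expected uniform quantile) triples."""
    ordered = sorted(p_values)
    n = len(ordered)
    return [(rank, p, rank / (n + 1)) for rank, p in enumerate(ordered, start=1)]

# Hypothetical p-values, for illustration only.
example_p_values = [0.62, 0.04, 0.91, 0.33, 0.18, 0.75, 0.49, 0.27]

points = p_value_plot_points(example_p_values)
for rank, p, expected in points:
    print(f"rank {rank}: p = {p:.2f}, expected under the null = {expected:.2f}")
```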
E2: Peto, NEJM, statins and cancer
Hypothesis: The (SEAS) trial has raised the hypothesis that adding ezetimibe to statin therapy might increase the incidence of cancer.
The claim fails to replicate.
The confidence interval for the relative risk is wide (95% CI, 1.13 to 2.12; 99% CI, 1.02 to 2.33; uncorrected P = 0.006), before any allowance is made for this being the hypothesis-generating result. NB: 16 x 0.006 = 0.096.
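The NB can be read as a Bonferroni-style adjustment: multiply the uncorrected p-value by the number of comparisons and cap the result at 1. The sketch below assumes the factor 16 is a count of comparisons; the slide does not say what it counts.

```python
p_uncorrected = 0.006
n_comparisons = 16  # factor taken from the slide; assumed to be a number of comparisons

# Bonferroni adjustment: scale the p-value by the number of comparisons, cap at 1.
p_adjusted = min(1.0, n_comparisons * p_uncorrected)

print(round(p_adjusted, 3))
print("significant at 0.05 after adjustment:", p_adjusted < 0.05)
```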
SEAS New Studies
E3: A multiple testing and modeling train wreck
1. 275 chemicals
2. 32 medical outcomes
3. 10 demographic covariates
275 x 32 = 8,800 chemical-outcome pairs; 8,800 x 2^10 = ~9 million possible models
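The count can be checked directly. The sketch below assumes each of the 10 covariates is either included in or left out of a model, which gives 2^10 covariate subsets per chemical-outcome pair. The last line is an added illustration, not a figure from the slides: the number of nominally significant pairs expected by chance if each pair were tested once at 0.05 and no association were real.

```python
chemicals = 275
outcomes = 32
covariates = 10

pairs = chemicals * outcomes           # chemical-outcome questions
covariate_subsets = 2 ** covariates    # each covariate in or out of the model
models = pairs * covariate_subsets     # total candidate models

# Assumption for illustration: one test per pair at alpha = 0.05, all nulls true.
expected_false_positives = pairs * 0.05

print(pairs, covariate_subsets, models)
print(round(expected_false_positives))
```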
A CDC “systems” train wreck in progress!
JAMA
Author Interpretation
There exists an increased risk of myocardial infarction in patients exposed to abacavir and didanosine within the preceding 6 months.
E4: Bias Example: Lancet DAD study
First drug use (Text, page 1422, and Table 3)
25
E4: BMJ versus JAMA (1)
Conclusion: The risk of oesophageal cancer increased with 10 or more prescriptions for oral bisphosphonates and with prescriptions over about a five year period.
BMJ 2010; 341:c4444
26
E4: BMJ versus JAMA (2)
Conclusion: Oral bisphosphonate use was not significantly associated with incident esophageal or gastric cancer.
JAMA 2010; 304(6): 657-663
27
A Rorschach Test
With large, complex data sets, there is enough flexibility to get what you want/need.
28
Consumer Wishes
Honest science
Valid claims
Claims in context
+ and – of data and methods
29
What do we have? (Deming)
A systems failure.
Essentially no process control.
Journals operating by “quality by inspection”.
Workers are happy.
Management failure.
30
What to do?
Funding agencies need to require data access on publication.
Editors need to:
give up quality by inspection
require a split-sample strategy
require a statement of the number of claims at issue.
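A minimal sketch of a split-sample strategy, assuming it means setting aside a random part of the data before any analysis, exploring freely on one part, and testing only the resulting claims on the other. The function name, the 50/50 split, and the seed are illustrative assumptions.

```python
import random

def split_sample(records, seed, holdout_fraction=0.5):
    """Randomly split records into an exploration set and a confirmation set."""
    rng = random.Random(seed)      # fixed seed so the split can be reproduced
    shuffled = list(records)
    rng.shuffle(shuffled)
    cut = int(round(len(shuffled) * (1 - holdout_fraction)))
    return shuffled[:cut], shuffled[cut:]

# Stand-in record IDs; a real data set would supply its own rows.
records = list(range(100))
explore, confirm = split_sample(records, seed=2011)

print(len(explore), len(confirm))
```

The confirmation set is touched once, after the claims from the exploration set have been fixed, so the count of claims tested on it is known in advance.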
31
Statisticians
Eventually society will figure it out:
Scientific claims are (most) often wrong.
Essentially all claims are supported by statistics.
Society will ask, “Where were the statisticians?”