Post on 01-Jan-2016
Estimating Signal with Next Generation Affymetrix Software
Earl Hubbell, Ph.D.
Principal Statistician, Applied Research
Quick Review of AvgDiff
• Operates on PM-MM
• Removes largest & smallest values
• Removes >3 standard deviation values
[Figure: PM-MM intensity for probe pairs 1-16, with lower and upper limits marked. x-axis: Probe Pairs; y-axis: Intensity.]
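The three AvgDiff steps listed on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the MAS 4.0 implementation: the function name `avg_diff_sketch` is hypothetical, and computing the 3-standard-deviation limits from the already-trimmed differences is an assumption the slide does not confirm. The example data are made up to show that the result can be negative.

```python
from statistics import mean, stdev

def avg_diff_sketch(pm, mm, n_sd=3.0):
    """Trimmed mean of PM-MM differences (illustrative only)."""
    diffs = sorted(p - m for p, m in zip(pm, mm))
    if len(diffs) < 4:
        raise ValueError("need at least four probe pairs")
    # Step 1: drop the single largest and single smallest difference.
    trimmed = diffs[1:-1]
    # Step 2: drop values more than n_sd standard deviations from the mean
    # of the trimmed set (assumed reference set for the limits).
    center, spread = mean(trimmed), stdev(trimmed)
    kept = [d for d in trimmed if abs(d - center) <= n_sd * spread]
    # Step 3: average what is left.
    return mean(kept)

# Made-up probe set in which MM is brighter than PM for every pair.
pm = [100, 120, 90, 110, 105, 95]
mm = [150, 160, 140, 130, 170, 145]
print(avg_diff_sketch(pm, mm))  # -47.5
```

A negative value like this cannot be a concentration or an intensity, and it has no logarithm, which is the weakness the following slides address.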
Areas for improvement
• AvgDiff Minimally Robust against Minority Probes
• Negative Values Impossible for Concentration or Intensity
• Negative Values Indicate Bias Is Larger than True Effect
• Incompatible with Standard Log-Transformation
Desirable Properties
• Robust against minority probes
• Doesn’t yield unphysical results for signal
• Reasonable predictor of concentration
A simple model for intensity
• PM Intensity = Real Signal + Stray Signal
• Real, Stray, PM all non-negative
• log(Real) = log(Affinity) + log(Concentration) + e
• (multiplicative error model)
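To make the model above concrete, here is a minimal simulation sketch. The function name `simulate_pm`, the Gaussian form of the error term e, and every numeric value (affinity, stray level, `sigma`) are assumptions chosen for illustration; the slides specify only the structure of the model.

```python
import math
import random

def simulate_pm(concentration, affinity, stray, sigma=0.2, rng=random):
    """One PM intensity under the model PM = Real + Stray (illustrative)."""
    if concentration == 0:
        real = 0.0  # no target present, so only stray signal remains
    else:
        # Multiplicative error: log(Real) = log(Affinity) + log(Concentration) + e
        e = rng.gauss(0.0, sigma)
        real = math.exp(math.log(affinity) + math.log(concentration) + e)
    return real + stray

rng = random.Random(0)  # fixed seed so repeated runs agree
for conc in (0.0, 1.0, 2.0):
    print(conc, simulate_pm(conc, affinity=150.0, stray=80.0, rng=rng))
```

Because the error enters on the log scale, Real is always positive, so PM never falls below the stray level in this sketch.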
AvgDiff (MAS 4.0)
• PM
• Stray Estimate = MM
• Super-Olympic Scoring on PM-MM (mean-like statistic)
Making an estimate of signal
- observe PM
- adjust PM for stray signal
- value = statistic(adjusted PM)
Signal (MAS 5.0)
• PM
• Stray Estimate = CT [best of two estimates]
• Tukey Biweight on log(PM-CT) (median-like)
Handling stray signal
• PM intensities have stray signal component (intensity not due to real signal)
• Many MM have similar stray signal to PM
• But some MM are not useful for estimation of stray signal
• Anomalous MM values can be handled with imputation
At zero concentration PM has non-zero intensity
As concentration increases, intensity increases
[Figure: PM intensity by probe (1-16) at 0 pM, 1 pM, and 2 pM. x-axis: Probe; y-axis: Intensity.]
Some mismatches don’t tell us about stray signal
[Figure: PM and MM intensity by probe (1-16). x-axis: Probe; y-axis: Intensity.]
Model-violating MM values censor real signal information
- Impute typical stray signal for such PM probes
[Figure: log(PM-MM) and log(PM-Proportion) by probe pair (1-16). x-axis: probe pair; y-axis: log(intensity).]
Removal of stray signal estimate leaves positive values
[Figure: intensity by probe (1-16) at 0, 1, and 2 picomolar. x-axis: probe; y-axis: intensity.]
Signal calculation (equations)
• Signal = Tukeybiweight(log(Adjusted PM))
• Stray = MM (if physically possible) or
• log(Stray) = log(PM)-log(Stray proportion) (if impossible)
• Stray proportion = max(SB, positive)
• SB = Tukeybiweight(log(PM)-log(MM)) (“typical” log-ratio)
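The equations above can be sketched as follows. Several details are assumptions rather than statements from the slides: the one-step form of the Tukey biweight with tuning constants `c` and `eps`, the use of base-2 logarithms, the floor value `min_log_ratio`, and the names `tukey_biweight` and `signal_sketch`. The sketch also reads "Stray proportion = max(SB, positive)" as flooring the typical log-ratio at a small positive value, since SB is already on the log scale. Treat it as an illustration of the idea, not the MAS 5.0 implementation.

```python
import math
from statistics import median

def tukey_biweight(values, c=5.0, eps=1e-4):
    """One-step Tukey biweight: a weighted mean that downweights outliers."""
    values = list(values)
    m = median(values)
    mad = median([abs(v - m) for v in values])  # median absolute deviation
    weights = []
    for v in values:
        u = (v - m) / (c * mad + eps)           # scaled distance from the median
        weights.append((1.0 - u * u) ** 2 if abs(u) < 1.0 else 0.0)
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)

def signal_sketch(pm, mm, min_log_ratio=0.03):
    """Log-scale signal for one probe set; pm and mm must be positive."""
    # SB: the "typical" log-ratio of PM to MM across the probe set.
    sb = tukey_biweight([math.log2(p / m) for p, m in zip(pm, mm)])
    # Floor the log stray proportion so the imputed stray is below PM
    # (the floor value is an assumption).
    log_prop = max(sb, min_log_ratio)
    adjusted = []
    for p, m in zip(pm, mm):
        if m < p:
            stray = m                    # MM is physically possible
        else:
            stray = p / 2.0 ** log_prop  # log(Stray) = log(PM) - log proportion
        adjusted.append(p - stray)       # always positive, so the log exists
    return tukey_biweight([math.log2(a) for a in adjusted])

# Made-up probe set: the fourth pair has MM > PM and gets an imputed stray value.
pm = [200, 220, 180, 210, 190, 205]
mm = [100, 110, 90, 400, 95, 100]
print(signal_sketch(pm, mm))
```

Because every adjusted PM value is strictly positive, the log transform is always defined, which is what makes a log-scale, median-like summary possible here where it was not for PM-MM.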
Is signal a reasonable predictor of concentration?
• Near linear behavior
• Stabilized variance
Average Signal for 12 human spiked transcripts (3x replicate)
[Figure: x-axis: log(conc); y-axis: log(signal).]
Signal is near-linear and has stabilized variance in the middle range of concentrations
[Figure: x-axis: log(concentration); y-axis: Signal.]
Resistance to outliers
• Introduce 10% artificial outliers to check robustness
• Nonparametric correlation to handle both log-scale and linear-scale data
• Verify data against known spike concentration
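The reason a nonparametric correlation suits this comparison is that it depends only on rank order, so it gives the same answer for linear-scale and log-scale values. The sketch below implements Kendall's tau (the tau-a variant, with no tie correction) to show that property. The function name `kendall_tau` and all data values are made up for illustration; this shows the metric only, not the outlier-injection procedure used for the slides.

```python
import math
from itertools import combinations

def kendall_tau(x, y):
    """Kendall tau-a: (concordant - discordant) / number of pairs."""
    score = 0
    pairs = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        prod = (x1 - x2) * (y1 - y2)
        score += (prod > 0) - (prod < 0)  # +1 concordant, -1 discordant, 0 tie
        pairs += 1
    return score / pairs

conc = [0.5, 1, 2, 4, 8, 16]                  # known spike concentrations
signal = [120, 210, 390, 800, 1500, 3100]     # made-up values, not from the slides
log_signal = [math.log(s) for s in signal]
spoiled = signal[:2] + [5000] + signal[3:]    # one value replaced by an outlier

print(kendall_tau(conc, signal), kendall_tau(conc, log_signal))  # 1.0 1.0
print(kendall_tau(conc, spoiled))  # 0.6
```

A summary statistic that resists outliers should keep this correlation with the known concentrations close to its outlier-free value.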
Superior performance against outliers
[Figure: Kendall correlation for Signal and AvgDiff at 0% and 10% outliers. y-axis: Kendall correlation.]
MAS 5.0 more robust against outliers in biological samples
[Figure: AvgDiff and Signal values across experiments 1-9 for probe set 1535_at from Hu95A; tissues: Adrenal, Kidney, Pancreas. x-axis: Experiment; y-axis: Value.]
Summary
• MAS 5.0 Signal is a reasonable predictor of concentration
• Tukey biweight resists outliers
• AvgDiff insufficiently robust in biological samples
• Log-scale transformation now possible
• Continued algorithm development underway...
Acknowledgements
• Wei-Min Liu
• Fred Christians
• Tom Ryder
• Suzanne Dee
• Steve Smeekens
• Paul Kaplan
• Rui Mei
• Teresa Webster
• Xiaojun Di
• Ming-hsiu Ho
• Jyoti Baid
• Chris Harrington
• Tarif Awad