August 20, 2009
NEMO Year 1: From Theory to Application
Ontology-based analysis of ERP data
http://nemo.nic.uoregon.edu
Overview Agenda
• ICBO highlights (5 mins)
• Logistics (5 mins)
• ERP pattern analysis methods (20 mins)
• ERP measure generation (10 mins)
• Linking measures to ontology (10 mins)
• Data annotation (deep, ontology-based) (10 mins)
Action items highlighted in lime green!
First International Conference on Biomedical Ontologies (ICBO’09)
http://precedings.nature.com/collections/icbo-2009
• High-level issues and "best practices" for ontology development
• Tools that may be of use for NEMO
• Potential collaborations
• Practical Questions/Issues to resolve
NEMO “to do” items
• Identify a "point person" at each site who will be responsible for contributing feedback on the NEMO wiki and ontologies, and for uploading data and testing MATLAB-based tools for data markup
– Please provide name & contact info for this person in an email
• Bookmark the NEMO website & explore links under “Collaboration” (more to come next time on how specifically you can contribute)
ERP Pattern Analysis
• An embarrassment of riches
– A wealth of data
– A plethora of methods
• A lack of integration
– How to compare patterns across studies, labs?
– How to do valid meta-analyses in ERP research?
• A need for robust pattern classification
– Bottom-up (data-driven) methods
– Top-down (science-driven) methods
Ontologies for high-level, explicit representation of domain knowledge: theoretical integration
Ontologies to support principled mark-up of data (inc. ERP patterns): practical integration
NEMO principles that inform our pattern analysis strategies
• Current challenges (motivations)
– Tracking what we know
  • Ontologies
– Integrating knowledge to achieve high-level understanding of brain–functional mappings
  • Meta-analyses
• Important considerations (desiderata)
– Stay true to data
  • Bottom-up (data-driven) methods
– Achieve high-level understanding
  • Top-down (hypothesis-driven) methods
Top-down vs. Bottom-up
        Top-Down                          Bottom-Up
PROS    • Familiar                        • Formalized
        • Science-driven (integrative)    • Data-driven (robust)
CONS    • Informal                        • Unfamiliar
        • Paradigm-affirming?             • Study-specific results?
Combining Top-Down & Bottom-Up
TOP-DOWN
Traditional approach to bio-ontology development
Encode knowledge of concepts (=> classes, relations, & axioms that involve classes & relations) in a formal ontology (e.g., OWL/RDF)
NEMO OWL ontologies are being developed & version-tracked on SourceForge (the main topic of our last meeting)
TOP-DOWN
NEMO top-down approach
NEMO emphasis on pattern rules/descriptions: a way to enforce rigorous definitions of complex concepts (patterns or “components”) that are central to ERP research
Superposition of ERP Patterns
What do we know about ERP patterns?
Observed Pattern = “P100” iff
  Event type is visual stimulus AND
  Peak latency is between 70 and 160 ms AND
  Scalp region of interest (ROI) is occipital AND
  Polarity over ROI is positive (>0)
FUNCTION TIME SPACE
?
Why does it matter?
Robust pattern rules are a good foundation for:
• Development of ERP ontologies
• Labeling of ERP data based on pattern rules
• Cross-experiment, cross-lab meta-analyses
BOTTOM-UP
Two classes of methods for ERP pattern analysis
• Pattern decomposition
– Temporal factor analysis (tPCA, tICA)
– Spatial factor analysis (sPCA, sICA)
• Windowing/segmentation
– Microstate analysis (use global field “maps”; compute “global field dissimilarity” between adjacent maps to determine where there are significant shifts in topography)
Focus today (already implemented & almost ready for YOU to test)
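The dissimilarity step in microstate analysis can be sketched as follows. This is a minimal illustration, assuming the commonly used definition (each map is average-referenced and scaled by its global field power before the maps are compared); the function names are hypothetical and this is not the NEMO implementation.

```python
import numpy as np

def global_field_power(topo):
    """Spatial standard deviation of one scalp map (1-D array over channels)."""
    centered = topo - topo.mean()
    return np.sqrt(np.mean(centered ** 2))

def global_dissimilarity(map_a, map_b):
    """Dissimilarity between two maps after average-referencing and GFP scaling.

    0 means identical topography (regardless of strength); 2 means inverted.
    Assumes neither map is flat (GFP > 0).
    """
    a = (map_a - map_a.mean()) / global_field_power(map_a)
    b = (map_b - map_b.mean()) / global_field_power(map_b)
    return np.sqrt(np.mean((a - b) ** 2))

def dissimilarity_series(erp):
    """erp: array (channels, samples). Dissimilarity between adjacent maps."""
    n_samples = erp.shape[1]
    return np.array([global_dissimilarity(erp[:, t], erp[:, t + 1])
                     for t in range(n_samples - 1)])
```

Peaks in the returned series mark candidate segment boundaries, i.e. time points where the topography changes shape rather than just strength.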
Decomposition approach
PCA, ICA, dipoles etc.
multiple methods for principled separation of patterns using factor-analytic approach
[Figure: patterns P100, N100, fP2, P1r/N3, P1r/MFN, P300; time markers 100 ms, 170 ms, 200 ms, 280 ms, 400 ms, 600 ms]
Windowing/segmentation approach
[Figure: patterns P100, N100, fP2, P1r/N3, P1r/MFN, P300; time markers 100 ms, 170 ms, 200 ms, 280 ms, 400 ms, 600 ms]
Michel, et al., 2004; Koenig, 1995; Lehmann & Skrandies, 1985
Advantages over factor-analytic/decomposition methods:
• Familiarity: closer to what most ERP researchers do (manually)
• Fewer (or at least different!) concerns regarding misallocation of variance
• Robustness to latency differences across subjects, conditions
What we’ve done (to date…)
• Implemented sPCA, tPCA, sICA, & microstate analysis
• Tested & evaluated sPCA, tPCA & sICA (following Dien, Khoe, & Mangun, 2008) using simulated ERP data
• Explored two different approaches to pattern classification & labeling (the step AFTER decomposition)
1. Data preprocessing
1. filter & segment data
2. detect & reject artifacts
3. interpolate bad channels
4. average across trials w/in subjects
5. manual detection of bad channels
6. interpolate bad channels
7. re-reference montage (PARE)
8. baseline-correct (200ms)
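Three of the steps above (trial averaging, re-referencing, baseline correction) can be sketched in a few lines. This is an illustration under stated assumptions: a plain average reference stands in for the PARE-corrected reference named on the slide, each epoch is assumed to start at the beginning of the 200 ms baseline, and all function names are hypothetical.

```python
import numpy as np

def average_trials(epochs):
    """epochs: (trials, channels, samples) -> subject ERP (channels, samples)."""
    return epochs.mean(axis=0)

def average_reference(erp):
    """Subtract the mean over channels at every sample.

    Assumption: plain average reference, without the PARE correction.
    """
    return erp - erp.mean(axis=0, keepdims=True)

def baseline_correct(erp, sfreq, baseline_ms=200.0):
    """Subtract each channel's mean over the first `baseline_ms` of the epoch."""
    n_baseline = int(round(baseline_ms / 1000.0 * sfreq))
    return erp - erp[:, :n_baseline].mean(axis=1, keepdims=True)
```

A typical call chain would be `baseline_correct(average_reference(average_trials(epochs)), sfreq=250.0)`, with the sampling rate set to whatever the recording used.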
2. Component Analysis
Our current practice (NOT set in stone!)
- Step 1: Apply eigenvalue decomposition method (e.g., tPCA)
- Step 2: Rotate ALL latent factors (unrestricted PCA)
- Step 3: Retain fairly large number of factors based on log of scree
- Step 4: Let ontology-based labeling (next slide) help determine which factors to keep and analyze!
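Step 1 can be illustrated with a bare-bones temporal PCA. This sketch assumes the data are arranged as observations (channels x subjects x conditions, stacked) by time points and uses an eigendecomposition of the covariance matrix; the rotation in Step 2 and the scree-based retention in Step 3 are deliberately left out, and `temporal_pca` is a hypothetical name.

```python
import numpy as np

def temporal_pca(data, n_keep):
    """Unrotated temporal PCA.

    data: array (observations, time points).
    Returns (loadings, scores, eigenvalues):
      loadings    (time points, n_keep)  temporal patterns scaled by sqrt(eigenvalue)
      scores      (observations, n_keep) projection of each observation
      eigenvalues all eigenvalues, largest first (useful for a scree plot)
    """
    centered = data - data.mean(axis=0, keepdims=True)
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)      # ascending order
    order = np.argsort(eigvals)[::-1]           # re-sort, largest first
    eigvals = np.clip(eigvals[order], 0.0, None)
    eigvecs = eigvecs[:, order]
    loadings = eigvecs[:, :n_keep] * np.sqrt(eigvals[:n_keep])
    scores = centered @ eigvecs[:, :n_keep]
    return loadings, scores, eigvals
```

The eigenvalues returned here are what a scree plot (or its log) would be drawn from when deciding how many factors to retain.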
3. Component Labeling
NEXT MAJOR CHALLENGE: How to tune pattern rules (particularly TI-max begin and end) to fit each individual dataset. Data mining on results from different component analyses? (Note: mining of tPCA data won’t help to refine temporal criteria.)
4. Meta-analysis (next milestone!!)
• Apply pattern decomposition & labeling to NEMO consortium datasets
• Identify one experimental contrast for each analysis
• Compute Effect Size (ES) estimates for each study
• Run mixed effects analysis:
– test homogeneity of variance across studies
– if rejected, then test effects of variables that differ across studies, laboratories (e.g., nature of stimuli, task, subjects)
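The effect-size and homogeneity steps might look roughly like this. The sketch assumes a standardized mean difference (Cohen's d) with the usual large-sample variance for two independent groups, and Cochran's Q as the homogeneity statistic; within-subject ERP contrasts would need a different variance formula, and none of these choices is confirmed by the slides.

```python
import math

def cohens_d(mean_a, mean_b, sd_a, sd_b, n_a, n_b):
    """Standardized mean difference and its approximate sampling variance.

    Assumption: two independent groups with a pooled standard deviation.
    """
    pooled_sd = math.sqrt(((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2)
                          / (n_a + n_b - 2))
    d = (mean_a - mean_b) / pooled_sd
    var = (n_a + n_b) / (n_a * n_b) + d ** 2 / (2 * (n_a + n_b))
    return d, var

def homogeneity_q(effects, variances):
    """Inverse-variance pooled effect, Cochran's Q, and its degrees of freedom.

    Under homogeneity, Q is approximately chi-square with k - 1 df.
    """
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    return pooled, q, len(effects) - 1
```

A large Q relative to its degrees of freedom would be the cue to move on to moderator variables such as stimulus type, task, or subject population.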
ERP Meta-analysis goals
1. Demonstrate working NEMO consortium
2. Demonstrate application of BrainMap-like taxonomy for classification of functional (experimental) contrasts
3. Show that ERP component analysis, measure generation, and component labeling tools can be used on a large scale
4. ** Show that combination of bottom-up and top-down methods for refining pattern rules can be used to tune rules for detecting target ERP patterns across different datasets
5. ** Show that we can (semi-)automatically identify analogous patterns across datasets (follows from 4), enabling us to carry out statistical meta-analyses
** harder problems to discuss…
A Case Study with real data (CIN’07 paper)
1. Real 128-channel ERP data
2. Temporal PCA used for pattern analysis
3. Spatial & temporal metrics for labeling of discrete patterns
4. Revision of pattern rules based on mining of labeled data
Example: Rule for “P100”
• For any n, FAn = PT1 iff
– temp criterion #1: 70 ms < TI-max (FAn) < 170 ms AND
– spat criterion #1: SP-r (FAn, SP(PT1)) > .7 AND
– func criterion #1: EVENT (FAn) = stimon AND
– func criterion #2: MODAL (EV) = visual AND
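The four criteria visible in this rule can be expressed as a predicate. This is a sketch only: it reads SP-r as a Pearson correlation between the factor topography and a P100 template topography, covers just the criteria shown (the rule on the slide continues after the last AND), and the `Factor` container and its field names are hypothetical.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class Factor:
    ti_max_ms: float          # latency of the factor's peak (TI-max)
    topography: np.ndarray    # factor scalp topography (SP), one value per channel
    event: str                # event type, e.g. "stimon"
    modality: str             # stimulus modality, e.g. "visual"

def matches_p100(factor, template_topography, r_min=0.7):
    """True if the factor meets the temporal, spatial and functional criteria."""
    temporal = 70.0 < factor.ti_max_ms < 170.0
    r = np.corrcoef(factor.topography, template_topography)[0, 1]
    spatial = bool(r > r_min)
    functional = factor.event == "stimon" and factor.modality == "visual"
    return temporal and spatial and functional
```

Because the correlation ignores overall amplitude, a factor with the right shape but weak intensity still passes the spatial criterion; only an inverted or unrelated topography fails it.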
Example of output [1]
Values for summary measures (for one subject, one of six experimental conditions)
Example of output [2]
Matches to spatial, temporal & functional criteria for one subject & one of six experimental conditions
Summary results for Rule #1
A Case Study with simulated ERPs (HBM’08 talk)
1. Simulated ERP datasets
2. PCA & ICA methods for spatial & temporal pattern analysis
3. Spatial & temporal metrics for labeling of discrete patterns
4. Revision of pattern rules based on mining of labeled data
Simulated ERPs (n=80)
P100, N100, N3, MFN, P300 + NOISE
Simulated ERP Datasets (in DipSim)
Dipole Simulator (P. Berg)
Simulated ERP data: Creating individual ERPs
Source #    ROI              Intensity (uv / ma)   Latency (ms)   Location Theta   Location Phi   Orientation Theta   Orientation Phi   Eccentricity
1 (P1)      L-Occipital       3.5 |  45            050 : 150      -090.00°          068.20°       -090.00°             053.62°          0.81
2 (P1)      R-Occipital       4.0 |  45            055 : 155       090.00°         -068.20°        090.00°            -060.39°          0.81
3 (N1)      L-Parietal       -5.0 | -70            120 : 240      -100.02°          045.00°       -090.00°             036.44°          0.57
4 (N1)      R-Parietal       -4.0 | -70            130 : 250       100.02°         -045.00°        090.00°            -036.44°          0.57
5 (N1N2)    L-Temporal       -4.0 | -60            160 : 300      -110.59°          035.72°       -129.57°             019.93°          0.42
6 (N1N2)    R-Temporal       -2.0 | -60            170 : 310       114.00°         -033.23°        125.30°            -026.57°          0.40
7 (P2)      Medial-Frontal    2.5 | -30            210 : 390       056.59°          087.82°       -122.09°             083.111°         0.63
• Random jitter in intensity
• NO temporal jitter
• NO spatial jitter
BOTTOM-UP
Pattern Analysis with PCA & ICA (Decomposition approach)
ERP pattern analysis
• Temporal PCA (tPCA)
– Gives invariant temporal patterns (new bases)
– Spatial variability as input to data mining
• Spatial ICA (sICA)
– Gives invariant spatial patterns (new bases)
– Temporal variability as input to data mining
• Spatial PCA (sPCA)
✔
✔
Multiple measures used for evaluation (correlation + L1/L2 norms)
X
New inputs to NEMO
PATTERN DEFINITIONS (Revised)
           TEMPORAL                      SPATIAL
Pattern    1. Peak latency               2. ROI                      3. Intensity
“P100”     70 ms < TI-max ≤ 140 ms       ROI = Occipital             IN-mean (ROI) > 0
“N100”     141 ms < TI-max ≤ 220 ms      ROI = Occipital             IN-mean (ROI) < 0
“N3c”      221 ms < TI-max ≤ 260 ms      ROI = Anterior Temporal     IN-mean (ROI) < 0
“MFN”      261 ms < TI-max ≤ 400 ms      ROI = Mid Frontal           IN-mean (ROI) < 0
“P300”     401 ms < TI-max ≤ 600 ms      ROI = Parietal              IN-mean (ROI) > 0
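The revised definitions lend themselves to a table-driven labeler. The following is a minimal sketch, assuming TI-max and the ROI mean intensities have already been computed for each factor; the function name and the dictionary input are hypothetical.

```python
# (label, lower bound in ms (exclusive), upper bound in ms (inclusive),
#  ROI name, required sign of IN-mean over that ROI)
PATTERN_RULES = [
    ("P100", 70, 140, "Occipital", +1),
    ("N100", 141, 220, "Occipital", -1),
    ("N3c", 221, 260, "Anterior Temporal", -1),
    ("MFN", 261, 400, "Mid Frontal", -1),
    ("P300", 401, 600, "Parietal", +1),
]

def label_pattern(ti_max_ms, roi_means):
    """Return the first pattern label whose three criteria are all met, else None.

    roi_means: dict mapping ROI name -> mean intensity over that ROI (IN-mean).
    """
    for label, lower, upper, roi, sign in PATTERN_RULES:
        in_window = lower < ti_max_ms <= upper
        if in_window and roi in roi_means and sign * roi_means[roi] > 0:
            return label
    return None
```

Taken literally, the strict lower bounds leave a peak at exactly 141, 221, 261 or 401 ms unlabeled, which may be worth keeping in mind when tuning the window edges.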
What we’ve learned (so far…)
• Bottom-up methods result in validation & refinement of top-down pattern rules
– Validation of expert selection of temporal concepts (peak latency)
– Refinement of expert specification of spatial concepts (± centroids)
• Alternative pattern analysis methods (e.g., tPCA & sICA) provide complementary input to bottom-up (data mining) procedures
BOTTOM-UP
Measure Generation
T1 T2 S1 S2
Vector attributes = Input to Data mining (clustering & classification)
CoP
CoN
ROI ± Centroids
Input to data mining: 32 attribute vectors, defined over 80 “individual” ERPs (observations)
BOTTOM-UP
Data mining
• Vectors of spatial & temporal attributes as input
• Clustering observations into patterns (E-M accuracy >97%)
• Attribute selection (“Information gain”)
CoP
CoN
✔
± Centroids
Peak Latency
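Attribute selection by information gain can be illustrated on a single numeric attribute. This sketch assumes a simple binary split at a given threshold; data-mining toolkits typically discretize numeric attributes in their own way, so treat the threshold argument and function names as illustrative.

```python
import numpy as np

def entropy(labels):
    """Shannon entropy (bits) of a sequence of class labels."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def information_gain(values, labels, threshold):
    """Reduction in label entropy after splitting `values` at `threshold`."""
    values = np.asarray(values)
    labels = np.asarray(labels)
    parts = (labels[values <= threshold], labels[values > threshold])
    n = len(labels)
    conditional = sum(len(part) / n * entropy(part) for part in parts if len(part))
    return entropy(labels) - conditional
```

An attribute such as peak latency that separates pattern classes cleanly yields a gain close to the full label entropy, while an attribute unrelated to the classes yields a gain near zero.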
Revised Rule for the “P100”
Pattern = P100v iff
  Event type is visual stimulus AND
  Peak latency is between 76 and 155 ms AND
  Positive centroid is right occipital AND
  Negative centroid is left frontal
SPACE TIME FUNCTION
Simulated ERP Patterns: “P100”, “N100”, “N3”, “MFN”, “P300”
Alternative Spatial Metrics
• Scalp “regions of interest” (ROI)
• Positive and negative “centroids” (topographic source & sink)
CPOS
CNEG
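One way to compute positive and negative centroids is as amplitude-weighted means of electrode positions. This is a sketch assuming that definition (positive values weight the positive centroid, magnitudes of negative values weight the negative one); the slides do not spell out the exact formula, and the coordinate convention below is hypothetical.

```python
import numpy as np

def centroids(topography, positions):
    """Return (positive centroid, negative centroid) of a scalp map.

    topography: amplitudes, one per channel.
    positions:  array (channels, 2 or 3) of electrode coordinates.
    A centroid is None when the map has no values of that sign.
    """
    topography = np.asarray(topography, dtype=float)
    positions = np.asarray(positions, dtype=float)
    pos_weights = np.clip(topography, 0.0, None)
    neg_weights = np.clip(-topography, 0.0, None)

    def weighted_mean(weights):
        total = weights.sum()
        if total == 0:
            return None
        return (weights[:, None] * positions).sum(axis=0) / total

    return weighted_mean(pos_weights), weighted_mean(neg_weights)
```

With x running left to right and y running posterior to anterior, a map that is positive over right posterior sites and negative over left anterior sites gives a positive centroid at right posterior and a negative centroid at left anterior coordinates.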
BOTTOM-UP
Statistical Measure Generation
• Temporal
– Peak latency
– Duration (cf. spectral measures)
• Spatial (topographic)
– Scalp regions-of-interest (ROI)
– Positive & negative centroids
• Functional (experimental)
– Concepts borrowed from BrainMap (Laird et al.) where possible
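Two of the measures above, peak latency and mean intensity over an ROI, could be generated along these lines. This sketch assumes peak latency means the time of greatest absolute amplitude in a factor's time course and that an ROI is simply a named list of channels; the channel names used in any call are placeholders.

```python
import numpy as np

def peak_latency_ms(time_course, times_ms):
    """Latency (ms) at which the absolute amplitude of the time course is largest."""
    index = int(np.argmax(np.abs(time_course)))
    return float(times_ms[index])

def roi_mean(topography, channel_names, roi_channels):
    """Mean amplitude over the channels that make up one region of interest."""
    indices = [channel_names.index(name) for name in roi_channels]
    return float(np.mean(np.asarray(topography, dtype=float)[indices]))
```

The two return values correspond to the TI-max and IN-mean (ROI) quantities that the pattern rules test against.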
Automated ontology-based labeling of ERP data
Pattern Labels = Functional attributes + Temporal attributes + Spatial attributes
Robert M. Frank
Concepts encoded in NEMO_Data.owl
NEMO Data Ontology: Where ontology meets epistemology
Ontology for Biomedical Investigations (OBI) & Information Artifact Ontology (IAO)