Automated Explanation of Gene-Gene Relationships Wacek Kuśnierczyk.
Automated Explanation of Gene-Gene Relationships
Wacek Kuśnierczyk
slide 2
The Motivation
• In various biological studies researchers often come up with a list of (possibly related) genes
• If the relations between these genes are unknown or hypothetical, they have to be confirmed either experimentally or through a database search (or both)
• Manual browsing or searching is a very tedious task; any interpretation of the results requires expert knowledge
slide 3
The Goal
To automate the search in order to
– assist a biologist in forming explanations of actual and hypothetical relationships between sets of genes
– using
  • various types and sources of data, and
  • various similarity assessment tools, and
  • background (domain) knowledge
slide 4
The Field
The most important participating disciplines
• Biology
• Computer Science
• Bioinformatics
slide 5
The Biologist’s Problem
Given a collection of genes, how can we explain the relationships between them, using the available data and knowledge?
– How does gene g1 regulate (activate, inhibit) gene g2?
– What is the functional similarity of gene g3 to gene g4?
– What is the metabolic (signalling) pathway common to gene g5 and g6 in the context of disease d1?
slide 6
The Bioinformatician’s Problem
Given a collection of (biological) objects, which of their properties can we compare and how, and where can we find their values?
– Where do we find the gene sequence (protein structure) data?
– How do we assess the similarity between two gene sequences (protein structures)?
– Where do we find the suitable tools, how do we use them and how do we interpret the results?
slide 7
The Computer Scientist’s Problem
Given a collection of distributed data and tools to link them, how do we build an explanatory path between objects from a query?
A search problem:
– separate, partially overlapping graphs
– coloured nodes
– coloured, weighted, dynamic edges
slide 8
Simplified Search Space
Graph with homogeneous vertices and edges
Task: find (shortest) paths
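As an illustration of the simplified case, shortest paths in a graph with homogeneous vertices and unweighted edges can be found with breadth-first search. The sketch below is only a minimal example of that standard technique; the graph, the vertex names and the function name `shortest_path` are hypothetical and do not come from the slides.

```python
from collections import deque

def shortest_path(graph, start, goal):
    """Breadth-first search over an unweighted graph.

    graph maps each vertex to a list of its neighbours.
    Returns one shortest path as a list of vertices, or None if unreachable.
    """
    if start == goal:
        return [start]
    parent = {start: None}          # also serves as the visited set
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour in parent:
                continue
            parent[neighbour] = node
            if neighbour == goal:
                # Walk the parent links back to the start.
                path = [goal]
                while parent[path[-1]] is not None:
                    path.append(parent[path[-1]])
                return path[::-1]
            queue.append(neighbour)
    return None

# Hypothetical toy graph: g1 and g2 stand for two genes from a query.
toy_graph = {
    "g1": ["a", "b"],
    "a": ["g1", "c"],
    "b": ["g1", "g2"],
    "c": ["a", "g2"],
    "g2": ["b", "c"],
}
print(shortest_path(toy_graph, "g1", "g2"))  # ['g1', 'b', 'g2']
```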
slide 9
More Realistic Search Space
Graph with qualitatively different vertices and qualitatively different edges, weighted with qualitatively different weights
Task: find (plausible) paths
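One possible way to make "plausible" concrete is sketched below. It rests on assumptions that are not in the slides: every edge carries a relation type and a confidence in (0, 1], each relation type has a trust factor (the hypothetical table `RELATION_TRUST`), and the plausibility of a path is the product of its edge plausibilities. Under those assumptions the most plausible path is a shortest path over costs of -log(plausibility), which Dijkstra's algorithm finds.

```python
import heapq
import math

# Hypothetical trust factor per relation type (an assumption for illustration).
RELATION_TRUST = {"is_a": 1.0, "acts_on": 0.9, "similar_to": 0.6}

def most_plausible_path(edges, start, goal):
    """Dijkstra over -log(plausibility).

    edges: iterable of (source, relation, target, confidence), confidence in (0, 1].
    Returns (plausibility, path), where path is a list of (source, relation, target),
    or (0.0, None) if the goal cannot be reached.
    """
    adjacency = {}
    for source, relation, target, confidence in edges:
        plausibility = RELATION_TRUST[relation] * confidence
        adjacency.setdefault(source, []).append((target, relation, -math.log(plausibility)))

    best = {start: 0.0}
    heap = [(0.0, start, [])]
    while heap:
        cost, node, path = heapq.heappop(heap)
        if node == goal:
            return math.exp(-cost), path
        if cost > best.get(node, math.inf):
            continue  # stale heap entry
        for target, relation, step in adjacency.get(node, ()):
            new_cost = cost + step
            if new_cost < best.get(target, math.inf):
                best[target] = new_cost
                heapq.heappush(heap, (new_cost, target, path + [(node, relation, target)]))
    return 0.0, None

# Hypothetical data: a direct but weak link versus a longer, better supported chain.
example_edges = [
    ("g1", "similar_to", "g2", 0.8),
    ("g1", "acts_on", "p1", 0.9),
    ("p1", "acts_on", "g2", 0.9),
]
score, path = most_plausible_path(example_edges, "g1", "g2")
print(round(score, 4), path)
```

In this toy case the two-step chain is preferred over the direct link, which is the difference between a shortest path and a plausible one.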
slide 10
Even More Realistic Search Space
Each node is connected to a multitude of other nodes; this leads to a combinatorial explosion, so an exhaustive search is infeasible
Task: find heuristics to guide the search (generic and specific)
...
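A generic way to guide the search is greedy best-first search with an expansion budget: the frontier is ordered by a heuristic score and the search gives up after a fixed number of node expansions. The sketch below illustrates that general technique only; the heuristic used here (overlap of annotation sets with the goal) and all names and data are hypothetical.

```python
import heapq
from itertools import count

def best_first_search(neighbours, start, goal, heuristic, max_expansions=1000):
    """Greedy best-first search.

    neighbours: function mapping a node to an iterable of adjacent nodes.
    heuristic:  function mapping a node to a score; lower scores are expanded first.
    Returns a path as a list of nodes, or None if the budget runs out
    or the goal is unreachable.
    """
    tie = count()  # tie-breaker so the heap never has to compare nodes or paths
    frontier = [(heuristic(start), next(tie), start, [start])]
    visited = {start}
    expansions = 0
    while frontier and expansions < max_expansions:
        _, _, node, path = heapq.heappop(frontier)
        if node == goal:
            return path
        expansions += 1
        for nxt in neighbours(node):
            if nxt not in visited:
                visited.add(nxt)
                heapq.heappush(frontier, (heuristic(nxt), next(tie), nxt, path + [nxt]))
    return None

# Hypothetical toy data: a hub node with several irrelevant neighbours.
toy_links = {
    "g1": ["hub"],
    "hub": ["x1", "x2", "r1"],
    "x1": [], "x2": [],
    "r1": ["g2"],
    "g2": [],
}
toy_annotations = {
    "g1": {"hormone"},
    "hub": {"protein"},
    "x1": {"kinase"},
    "x2": {"transport"},
    "r1": {"receptor", "gastric"},
    "g2": {"receptor", "gastric", "membrane"},
}

def annotation_overlap(node, goal="g2"):
    # More shared annotations with the goal gives a lower (better) score.
    return -len(toy_annotations[node] & toy_annotations[goal])

print(best_first_search(toy_links.__getitem__, "g1", "g2", annotation_overlap))
# ['g1', 'hub', 'r1', 'g2']
```

The heuristic steers the search away from x1 and x2 without expanding them; a domain-specific heuristic would replace `annotation_overlap`.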
slide 11
A Trivial Example
[Diagram] Step: Input query
Layers: Query
Nodes: CCK, GAS
slide 12
A Trivial Example
[Diagram] Step: Initial mapping
Layers: Background, Query
Nodes: Peptide Hormone, CCK, GAS
slide 13
A Trivial Example
[Diagram] Step: Activation spreading
Layers: Background, Query
Nodes: Peptide Hormone, CCK, GAS, Hormone, Receptor
Edge labels: Acts on, Is a
slide 14
A Trivial Example
[Diagram] Step: Plausible inheritance (inference)
Layers: Background, Query
Nodes: Peptide Hormone, CCK, GAS, Hormone, Receptor
Edge labels: Acts on (x2), Is a
slide 15
A Trivial Example
[Diagram] Step: Activation spreading
Layers: Background, Query
Nodes: Peptide Hormone, Extracellular Receptor, CCK, GAS, Hormone, Receptor
Edge labels: Acts on (x2), Is a (x2)
slide 16
A Trivial Example
[Diagram] Step: Data retrieval and mapping
Layers: Background, Query, Data
Nodes: Peptide Hormone, Extracellular Receptor, CCK, GAS, CCK A-R, GAS-R, Hormone, Receptor
Edge labels: Acts on (x4), Is a (x2)
slide 17
A Trivial Example
[Diagram] Step: Induction
Layers: Background, Query, Data
Nodes: Peptide Hormone, Extracellular Receptor, CCK, GAS, CCK A-R, GAS-R, Hormone, Receptor
Edge labels: Acts on (x5), Is a (x2)
slide 18
A Trivial Example
[Diagram] Step: Activation spreading
Layers: Background, Query, Data
Nodes: Peptide Hormone, Extracellular Receptor, Receptor Family, CCK, GAS, CCK A-R, GAS-R, Hormone, Receptor
Edge labels: Acts on (x5), Is a (x2), Belongs to
slide 19
A Trivial Example
[Diagram] Step: Plausible inheritance
Layers: Background, Query, Data
Nodes: Peptide Hormone, Extracellular Receptor, Receptor Family, CCK, GAS, CCK A-R, GAS-R, Hormone, Receptor
Edge labels: Acts on (x5), Is a (x2), Belongs to (x2)
slide 20
A Trivial Example
[Diagram] Steps: Data retrieval and mapping; Formulation of an explanation
Layers: Background, Query, Data
Nodes: Peptide Hormone, Extracellular Receptor, Receptor Family, CCK, GAS, CCK A-R, GAS-R, CCK-R, Hormone, Receptor
Edge labels: Acts on (x5), Is a (x2), Belongs to (x4)
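The "plausible inheritance" step of the example can be sketched as a simple rule over relation triples: if x is a y and y acts on z, then x plausibly acts on z. This is only an illustration. The slides do not say which nodes each edge connects, so the triples below are assumed, and the function name `inherit` is hypothetical. The inferred triples are candidate hypotheses, not established facts.

```python
def inherit(triples, relation="acts_on"):
    """Apply the rule (x is_a y) and (y relation z) => (x relation z) until nothing new appears.

    triples: iterable of (subject, relation, object).
    Returns only the newly inferred triples.
    """
    given = set(triples)
    facts = set(given)
    changed = True
    while changed:
        changed = False
        for x, r1, y in list(facts):
            if r1 != "is_a":
                continue
            for y2, r2, z in list(facts):
                if y2 == y and r2 == relation and (x, relation, z) not in facts:
                    facts.add((x, relation, z))
                    changed = True
    return facts - given

# Assumed wiring of the background knowledge and the query mapping.
known = [
    ("Peptide Hormone", "is_a", "Hormone"),
    ("Hormone", "acts_on", "Receptor"),
    ("CCK", "is_a", "Peptide Hormone"),
    ("GAS", "is_a", "Peptide Hormone"),
]
for triple in sorted(inherit(known)):
    print(triple)
```

Under these assumptions both CCK and GAS are inferred to act on some receptor, which is the kind of shared statement an explanation of their relationship could start from.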
slide 21
Explanation Schema
[Diagram]
Nodes: Query, Model, Data, Explanation
Operations: map, chain, parse, retrieve, match
slide 22
System Architecture
[Diagram] Components: QI, EI, GDK, CB, CR, HG/U, DI, DB, T
Legend:
CB: case base
CR: core reasoner
DB: databases
EI: explanation interface
GDK: general domain knowledge
HG/U: hypothesis generator, user
QI: query interface
T: tools
slide 23
Related Work
• Basic research in gastric cancer
• Genomic & proteomic data warehouse
• Syntactic & semantic database integration
• Natural language understanding
• Knowledge representation & modelling
• Knowledge-intensive reasoning and learning
slide 24
Concerns
Is it reasonable? (what do biologists say)
Is it possible? (what do bioinformaticians say)
Is it feasible? (what do computer scientists say)
Isn’t it too ambitious (for a PhD study)?
slide 25
Disclaimer
An in silico solution is actually a hypothesis that requires physical (experimental) confirmation.
slide 26
Acknowledgments
Agnar Aamodt, IDI.IME (AI, ML, CBR)
Astrid Lægreid, IKM.DMF (biology, bioinformatics)
Arne Sandvik, IKM.DMF (medicine)
Frode Sørmo, IDI.IME (Creek)