Automated Explanation of Gene-Gene Relationships
Wacek Kuśnierczyk

slide 2

The Motivation

• In various biological studies researchers often come up with a list of (possibly related) genes

• If the relations between these genes are unknown or hypothetical, they have to be confirmed either experimentally or through a database search (or both)

• Manual browsing or searching is a very tedious task; any interpretation of the results requires expert knowledge

slide 3

The Goal

To automate the search in order to

– assist a biologist in forming explanations of actual and hypothetical relationships between sets of genes

– using

  • various types and sources of data,

  • various similarity assessment tools, and

  • background (domain) knowledge

slide 4

The Field

The most important participating disciplines:

• Biology

• Computer Science

• Bioinformatics

slide 5

The Biologist’s Problem

Given a collection of genes, how can we explain the relationships between them, using the available data and knowledge?

– How does gene g1 regulate (activate, inhibit) gene g2?

– What is the functional similarity of gene g3 to gene g4?

– What is the metabolic (signalling) pathway common to gene g5 and g6 in the context of disease d1?

slide 6

The Bioinformatician’s Problem

Given a collection of (biological) objects, which of their properties can we compare and how, and where can we find their values?

– Where do we find the gene sequence (protein structure) data?

– How do we assess the similarity between two gene sequences (protein structures)?

– Where do we find the suitable tools, how do we use them and how do we interpret the results?

slide 7

The Computer Scientist’s Problem

Given a collection of distributed data and tools to link them, how do we build an explanatory path between objects from a query?

A search problem:

– separate, partially overlapping graphs

– coloured nodes

– coloured, weighted, dynamic edges

slide 8

Simplified Search Space

Graph with homogeneous vertices and edges

Task: find (shortest) paths
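
For the simplified case, a plain breadth-first search is enough. The sketch below is only an illustration, not the system described in these slides: the adjacency-dictionary format, the function name `shortest_path` and the node names in `toy_graph` are all assumptions made for the example.

```python
from collections import deque


def shortest_path(graph, start, goal):
    """Breadth-first search on an unweighted graph.

    graph: dict mapping a node to a list of neighbouring nodes (assumed format).
    Returns one shortest path as a list of nodes, or None if goal is unreachable.
    """
    if start == goal:
        return [start]
    parents = {start: None}          # also serves as the visited set
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour in parents:
                continue
            parents[neighbour] = node
            if neighbour == goal:
                # Walk the parent links back to the start.
                path = [goal]
                while parents[path[-1]] is not None:
                    path.append(parents[path[-1]])
                return path[::-1]
            queue.append(neighbour)
    return None


# Hypothetical toy graph: every vertex and edge is of the same kind.
toy_graph = {
    "g1": ["a", "b"],
    "a": ["g1", "c"],
    "b": ["g1", "g2"],
    "c": ["a", "g2"],
    "g2": ["b", "c"],
}

print(shortest_path(toy_graph, "g1", "g2"))
```

Because all edges count the same, the first time the goal is reached the path is guaranteed to be among the shortest.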

slide 9

More Realistic Search Space

Graph with qualitatively different vertices and qualitatively different edges, weighted with qualitatively different weights

Task: find (plausible) paths
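
One possible reading of "plausible path", sketched under strong assumptions: each edge carries a relation type and a plausibility in (0, 1], and the plausibility of a path is the product of its edge plausibilities. Neither the scoring rule nor the relation names below come from the slides; they are placeholders. Maximising a product is the same as minimising the sum of negative logarithms, so Dijkstra's algorithm applies.

```python
import heapq
import math


def most_plausible_path(edges, start, goal):
    """Find the path with the highest product of edge plausibilities.

    edges: list of (source, relation, target, plausibility) tuples, with
           plausibility in (0, 1] (assumed format; 0 is not allowed).
    Returns (plausibility, path), where path is a list of
    (source, relation, target) steps, or (0.0, None) if goal is unreachable.
    """
    adjacency = {}
    for source, relation, target, plausibility in edges:
        adjacency.setdefault(source, []).append((relation, target, plausibility))

    counter = 0                      # tie-breaker so the heap never compares paths
    heap = [(0.0, counter, start, [])]
    settled = set()
    while heap:
        cost, _, node, path = heapq.heappop(heap)
        if node in settled:
            continue
        settled.add(node)
        if node == goal:
            return math.exp(-cost), path
        for relation, target, plausibility in adjacency.get(node, ()):
            if target not in settled:
                counter += 1
                step = (node, relation, target)
                heapq.heappush(
                    heap,
                    (cost - math.log(plausibility), counter, target, path + [step]),
                )
    return 0.0, None


# Hypothetical typed, weighted edges between genes (g*) and proteins (p*).
TOY_EDGES = [
    ("g1", "encodes", "p1", 1.0),
    ("p1", "binds", "p2", 0.6),
    ("p2", "encoded by", "g2", 1.0),
    ("g1", "co-expressed with", "g2", 0.4),
]

score, path = most_plausible_path(TOY_EDGES, "g1", "g2")
print(round(score, 3), [relation for _, relation, _ in path])
```

In this toy data the longer route through the two proteins is preferred over the direct edge because its combined plausibility is higher, which is the main difference from the shortest-path task.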

slide 10

Even More Realistic Search Space

Each node is connected to a multitude of other nodes; the resulting combinatorial explosion makes an exhaustive search infeasible

Task: find heuristics to guide the search (generic and specific)
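
As a rough illustration of heuristic guidance, the sketch below uses greedy best-first search with an expansion budget. The slides do not say which heuristics are intended; the "prefer low-degree nodes, avoid hubs" rule, the function name `guided_search` and the toy graph are assumptions chosen only to show the mechanism.

```python
import heapq


def guided_search(neighbours, start, goal, heuristic, max_expansions=1000):
    """Greedy best-first search.

    neighbours: function returning the nodes adjacent to a node.
    heuristic:  function scoring a node; lower scores are expanded first.
    Gives up (returns None) once max_expansions nodes have been expanded,
    so the search stays bounded even when the graph is huge.
    """
    counter = 0                      # tie-breaker for equal heuristic values
    frontier = [(heuristic(start), counter, start, [start])]
    visited = {start}
    expansions = 0
    while frontier and expansions < max_expansions:
        _, _, node, path = heapq.heappop(frontier)
        if node == goal:
            return path
        expansions += 1
        for nxt in neighbours(node):
            if nxt not in visited:
                visited.add(nxt)
                counter += 1
                heapq.heappush(frontier, (heuristic(nxt), counter, nxt, path + [nxt]))
    return None


# Hypothetical graph: "hub" links to many unrelated nodes, "x" is a specific link.
toy = {"g1": ["hub", "x"], "x": ["g1", "g2"], "g2": ["x"]}
toy["hub"] = ["g1"] + [f"n{i}" for i in range(50)]
for i in range(50):
    toy[f"n{i}"] = ["hub"]


def degree(node):
    """Assumed generic heuristic: nodes with fewer connections are more specific."""
    return len(toy.get(node, ()))


print(guided_search(lambda n: toy.get(n, ()), "g1", "g2", degree))
```

A greedy search of this kind does not guarantee the best path; it trades completeness for a bounded amount of work, which is the point when exhaustive search is out of reach.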

...

slide 11

A Trivial Example

[Diagram. Query layer: CCK, GAS.]

Input query

slide 12

A Trivial Example

[Diagram. Query layer: CCK, GAS. Background layer: Peptide Hormone.]

Initial mapping

slide 13

A Trivial Example

[Diagram. Query layer: CCK, GAS. Background layer: Peptide Hormone, Hormone, Receptor. Edge labels: Acts on ×1, Is a ×1.]

Activation spreading

slide 14

A Trivial Example

[Diagram. Query layer: CCK, GAS. Background layer: Peptide Hormone, Hormone, Receptor. Edge labels: Acts on ×2, Is a ×1.]

Plausible inheritance (inference)

slide 15

A Trivial Example

[Diagram. Query layer: CCK, GAS. Background layer: Peptide Hormone, Extracellular Receptor, Hormone, Receptor. Edge labels: Acts on ×2, Is a ×2.]

Activation spreading

slide 16

A Trivial Example

[Diagram. Query layer: CCK, GAS. Background layer: Peptide Hormone, Extracellular Receptor, Hormone, Receptor. Data layer: CCK A-R, GAS-R. Edge labels: Acts on ×4, Is a ×2.]

Data retrieval and mapping

slide 17

A Trivial Example

[Diagram. Query layer: CCK, GAS. Background layer: Peptide Hormone, Extracellular Receptor, Hormone, Receptor. Data layer: CCK A-R, GAS-R. Edge labels: Acts on ×5, Is a ×2.]

Induction

slide 18

A Trivial Example

[Diagram. Query layer: CCK, GAS. Background layer: Peptide Hormone, Extracellular Receptor, Receptor Family, Hormone, Receptor. Data layer: CCK A-R, GAS-R. Edge labels: Acts on ×5, Is a ×2, Belongs to ×1.]

Activation spreading

slide 19

A Trivial Example

[Diagram. Query layer: CCK, GAS. Background layer: Peptide Hormone, Extracellular Receptor, Receptor Family, Hormone, Receptor. Data layer: CCK A-R, GAS-R. Edge labels: Acts on ×5, Is a ×2, Belongs to ×2.]

Plausible inheritance

slide 20

A Trivial Example

[Diagram. Query layer: CCK, GAS. Background layer: Peptide Hormone, Extracellular Receptor, Receptor Family, Hormone, Receptor. Data layer: CCK A-R, GAS-R, CCK-R. Edge labels: Acts on ×5, Is a ×2, Belongs to ×4.]

Data retrieval and mapping

Formulation of an explanation
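
The steps of the trivial example can be imitated on a small set of triples. This is a sketch only: the transcript does not preserve which nodes each edge connects, so every triple in `TRIPLES` is an assumed reading of the diagram, and the function names are hypothetical. "Plausible inheritance" is modelled here as a single rule: if X is a Y and Y acts on Z, then X plausibly acts on Z.

```python
# Assumed triples; the actual connections in the slides' diagram are not recoverable.
TRIPLES = [
    # background knowledge
    ("Peptide Hormone", "is a", "Hormone"),
    ("Extracellular Receptor", "is a", "Receptor"),
    ("Hormone", "acts on", "Receptor"),
    # initial mapping of the query genes (relation label assumed)
    ("CCK", "is a", "Peptide Hormone"),
    ("GAS", "is a", "Peptide Hormone"),
    # retrieved data (assumed)
    ("CCK", "acts on", "CCK A-R"),
    ("GAS", "acts on", "GAS-R"),
    ("CCK A-R", "belongs to", "CCK-R"),
    ("GAS-R", "belongs to", "CCK-R"),
]


def targets(triples, subject, relation):
    """All objects o such that (subject, relation, o) is a known triple."""
    return {o for s, r, o in triples if s == subject and r == relation}


def plausible_inheritance(triples):
    """One inference step: a child inherits 'acts on' edges from its 'is a' parent."""
    inferred = set()
    for child, relation, parent in triples:
        if relation != "is a":
            continue
        for target in targets(triples, parent, "acts on"):
            inferred.add((child, "acts on", target))
    return inferred - set(triples)


def explain(triples, gene_a, gene_b):
    """Chain 'acts on' and 'belongs to' edges to a family shared by both genes."""
    explanations = []
    for rec_a in sorted(targets(triples, gene_a, "acts on")):
        for rec_b in sorted(targets(triples, gene_b, "acts on")):
            shared = targets(triples, rec_a, "belongs to") & targets(triples, rec_b, "belongs to")
            for family in sorted(shared):
                explanations.append(
                    f"{gene_a} acts on {rec_a} and {gene_b} acts on {rec_b}; "
                    f"both receptors belong to {family}."
                )
    return explanations


print(plausible_inheritance(TRIPLES))
print(explain(TRIPLES, "CCK", "GAS"))
```

The inferred triple only suggests where to look next; in this sketch the explanation itself is built from the retrieved data edges.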

slide 21

Explanation Schema

[Diagram. Elements: Query, Model, Data, Explanation. Operation labels: map, chain, parse, retrieve, match.]

slide 22

System Architecture

[Diagram, built up in stages. Component labels shown: QI, EI, GDK, CB, CR, HG/U, DI, DB, T.]

Legend:

CB: case base

CR: core reasoner

DB: databases

EI: explanation interface

GDK: general domain knowledge

HG/U: hypothesis generator, user

QI: query interface

T: tools

slide 23

Related Work

• Basic research in gastric cancer

• Genomic & proteomic data warehouse

• Syntactic & semantic database integration

• Natural language understanding

• Knowledge representation & modelling

• Knowledge intensive reasoning and learning

slide 24

Concerns

Is it reasonable? (what do biologists say)

Is it possible? (what do bioinformaticians say)

Is it feasible? (what do computer scientists say)

Isn’t it too ambitious (for a PhD study)?

??

slide 25

Disclaimer

An in silico solution is actually a hypothesis that requires physical (experimental) confirmation.

!!

slide 26

Acknowledgments

Agnar Aamodt, IDI.IME (AI, ML, CBR)

Astrid Lægreid, IKM.DMF (biology, bioinformatics)

Arne Sandvik, IKM.DMF (medicine)

Frode Sørmo, IDI.IME (Creek)