Evolutionary Symbolic Discovery for Bioinformatics, Systems and Synthetic Biology

Post on 29-Jun-2015

I gave this talk at the Symbolic Regression Workshop that took place at GECCO 2010.

P. Widera, J. Bacardit, N. Krasnogor, C. Garcia-Martinez, M. Lozano

• Symbolic Regression and Modeling are tightly linked in Bioinformatics, Systems and Synthetic Biology.

• We explore two problems:

1. Synthesis of effective energy functions for protein structure prediction (PSP)

2. Synthesis of effective Systems/Synthetic Biology models

• Not run-of-the-mill Symbolic Regression, however:

1. a symbolic solution is sought

2. it must fit the available data

3. it must be human understandable

Synthesis of effective energy functions for PSP

Synthesis of effective Systems/Synthetic Biology models

• P systems are an executable modeling framework that closely mimics biological reality.

• Can be seen as programs that explicitly mimic the internal behavior of cell systems.

• Cells (and most biologists) don’t do differential calculus!

Motivation

• Learning a program with stochastic behavior vs. learning a P system.

• A cell is a living example of distributed computing.

function f1(p1, p2, p3, p4) {
    if (p1 < p2) and (rand < 0.5)
        print p3
    else
        print p4
}

function f1(p1, p2, p3, p4) {
    if (p1 < p2)    RND
        print p3    RND
    else            RND
        print p4    RND
}

P Systems

• Stochastic P systems are designed for specifying and simulating cellular systems.

• Defined by the tuple:

$\Pi = (O, L, \mu, M_{l_1}, \ldots, M_{l_n}, R_{l_1}, \ldots, R_{l_n})$

• $O$ is the alphabet of molecules.

• $L = \{l_1, \ldots, l_n\}$ is the set of labels representing different compartments.

• $\mu$ is the membrane structure with $n \ge 1$ membranes.

P Systems

• $M_{l_i}$, $1 \le i \le n$, is the initial configuration of membrane $i$, i.e., a multiset of objects over $O$ placed inside the compartment of membrane $l_i$.

• $R_{l_i} = \{r_1^{l_i}, \ldots, r_{k_{l_i}}^{l_i}\}$, $1 \le i \le n$, is a finite set of rules associated with compartment $l_i$. These rules are expressed as:

$o_1 [\, o_2 \,]_l \xrightarrow{c} o'_1 [\, o'_2 \,]_l$

where $o_1, o_2, o'_1, o'_2$ are multisets of objects over $O$ and $l \in L$ is a compartment label.
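The rule form above can be made concrete with a small sketch. The slides do not say how rules are scheduled, so the code below assumes Gillespie-style semantics: the label $c$ on the arrow is read as a stochastic rate constant and propensities follow mass action. As a further simplification, rules only rewrite the objects inside a compartment ($o_2 \to o'_2$) and ignore the outer multisets $o_1, o'_1$. The names `Rule`, `propensity` and `gillespie_step` are hypothetical and do not come from the talk.

```python
import math
import random
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Simplified rule [o2]_l -c-> [o2']_l (outer multisets are ignored)."""
    label: str   # compartment label l
    lhs: tuple   # multiset o2 consumed inside compartment l
    rhs: tuple   # multiset o2' produced inside compartment l
    c: float     # constant c, ASSUMED here to be a stochastic rate


def propensity(rule, contents):
    """Mass-action propensity: c times the number of ways to pick the lhs."""
    a = rule.c
    for obj, k in Counter(rule.lhs).items():
        a *= math.comb(contents[obj], k)
    return a


def gillespie_step(rules, state, t, rng):
    """Fire one rule chosen in proportion to its propensity.

    `state` maps compartment labels to Counters (the multisets M_l).
    Returns the new time, or None when no rule is applicable.
    """
    props = [propensity(r, state[r.label]) for r in rules]
    total = sum(props)
    if total == 0:
        return None
    t += rng.expovariate(total)  # exponentially distributed waiting time
    rule = rng.choices(rules, weights=props, k=1)[0]
    state[rule.label].subtract(Counter(rule.lhs))
    state[rule.label].update(Counter(rule.rhs))
    return t
```

For example, a single degradation rule `Rule("cell", ("protein",), (), 0.1)` applied to `{"cell": Counter({"protein": 5})}` fires exactly five times before `gillespie_step` returns `None`.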

Modular Assembly of P Systems

• Modules: sets of rules representing molecular interactions that occur often.

• Elemental modules: degradation, complexation, unregulated gene expression, negative gene expression, etc.

• Combining basic modules (building blocks) yields more complex modules, allowing modular and hierarchical modeling with P systems.

• Challenge: Explore the large combinatorial space of modules and corresponding parameters.
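One way to picture modular assembly is to treat each module as a function that returns a list of rules, so that composite modules are concatenations of elemental ones. This is a sketch only: the slides name the elemental modules but not their rules, so the rule shapes below (for example, a gene producing RNA while remaining present) are illustrative assumptions, and every function and parameter name is hypothetical.

```python
from typing import NamedTuple


class Rule(NamedTuple):
    label: str   # compartment label
    lhs: tuple   # objects consumed
    rhs: tuple   # objects produced
    c: float     # kinetic constant (a parameter to be optimised)


def degradation(label, molecule, c):
    """Elemental module: the molecule disappears from the compartment."""
    return [Rule(label, (molecule,), (), c)]


def unregulated_expression(label, gene, rna, protein, c_tx, c_tl):
    """Elemental module: constitutive transcription, then translation."""
    return [
        Rule(label, (gene,), (gene, rna), c_tx),
        Rule(label, (rna,), (rna, protein), c_tl),
    ]


def expressed_and_degraded(label, gene, rna, protein,
                           c_tx, c_tl, c_deg_rna, c_deg_prot):
    """Composite module assembled from the elemental ones above."""
    return (
        unregulated_expression(label, gene, rna, protein, c_tx, c_tl)
        + degradation(label, rna, c_deg_rna)
        + degradation(label, protein, c_deg_prot)
    )
```

Under this view the search space has two parts: which modules are instantiated and how they are wired (structure), and the constants passed to each of them (parameters).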

Experimental Setup

• Compare different evolutionary algorithms to find structure & optimise parameters (kinetic constants) in P systems.

• Four test cases of increasing difficulty and dimension:

1. TC1: Pulse generator for different initial conditions (13 parameters).

2. TC2: Same problem as TC1 but with larger parameter ranges.

3. TC3: More general pulse generator: feed-forward loop motif (18 parameters).

4. TC4: Bandwidth detector (34 parameters).

• Unclear which fitness function to use
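To make the fitness question concrete, here is one plausible candidate, not necessarily the one used in these experiments: the root-mean-square error between target and simulated time series, averaged over all series (for instance one per initial condition and molecular species). The function names and the choice of RMSE are assumptions for illustration.

```python
import math


def rmse(target, simulated):
    """Root-mean-square error between two equally sampled time series."""
    if len(target) != len(simulated):
        raise ValueError("series must have the same number of samples")
    squared = sum((a - b) ** 2 for a, b in zip(target, simulated))
    return math.sqrt(squared / len(target))


def fitness(target_runs, simulated_runs):
    """Mean RMSE over all paired series; lower is better."""
    errors = [rmse(t, s) for t, s in zip(target_runs, simulated_runs)]
    return sum(errors) / len(errors)
```

A plain mean like this lets high-concentration species dominate the score, which is one reason the choice of fitness function is not obvious.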

Target Models

Average Model Fit

• Test Case 1

• Test Case 2

Average Model Fit

• Test Case 3

For protein1, all algorithms have similar output to the target.

Average Model Fit

• Test Case 4

Discussions & Conclusions

• Design of effective fitness functions:

1. In both problems there is no “silver bullet” fitness function. What to do?

2. Besides the obvious “fit”, include robustness, sensitivity, parsimony and “semantic fit” terms.

• What space is being searched?

• CPU-hungry problems:

1. Partial evaluations? Lazy evaluation?

2. Grid/Cloud/GPGPU

• Human Understandability & Plausibility

Acknowledgements

•Jonathan Blake

•Claudio Lima

•Francisco Romero-Campero

•Karima Righetti

•Jamie Twycross

Integrated Environment

Machine Learning & Optimisation

Modeling & Model Checking

Molecular Micro-Biology

Stochastic Simulations

Members of my team working on SB2

EP/E017215/1

EP/H024905/1

BB/F01855X/1

BB/D019613/1

University of Nottingham

Prof. M. Camara, Dr. S. Heeb, Dr. G. Rampioni, Prof. P. Williams

Weizmann Institute of Science

Prof. D. Lancet, Prof. I. Pilpel

This workshop's organisers

You for listening!

Any Questions?

www.synbiont.org

Become a member and have access to a large international community of Synthetic Biologists