Programming Languages for Biology
Bor-Yuh Evan Chang
November 25, 2003
OSQ Group Meeting
11/25/2003 2
Biological Perspective
[Figures; image credits: http://www.nocturnalvisions.freeservers.com/page6.html; Matsudaira et al. Molecular Cell Biology 4.0. Freeman, 2000]
Traditional Biological Research
• Experiments must focus on a small, specific piece of a system
– isolate the variable
– feasibility
• Have led to an enormous wealth of (detailed) knowledge but in a fragmented form
[Figure labels: Cell Receptor Expert, Virus Expert]
Systems Biology
• Emerging area of biology
– study of the relationships and interactions between biological components
– many thousands of molecules interact in a complex series of reactions to perform some function (called a pathway)
• e.g., lactose interacting with a receptor triggers a series of actions to create the enzyme capable of breaking it down into a usable form
– “pathways” may overlap
Approaching Systems Biology
• Need a common language for describing/modeling all components of a system
– must be modular, compositional, and provide varying levels of abstraction
• Abstraction is an absolute necessity
– 1 ribosome (eukaryotic) ≈ 82 proteins + rRNA
• 1 protein ≈ hundreds/thousands of amino acids
– 1 membrane ≈ thousands of molecules (lipids, proteins, carbohydrates)
The Biologist’s View
• How do biologists think about or view biological entities (e.g., proteins)?
– an entity can interact with certain other types of entities
– an entity can be in a certain “state”
– interaction causes some action or state change
• Analogous to a system of thousands of concurrent computational processes
– Walter Fontana, a theoretical biologist, examined λ-calculus and linear logic for describing biological systems (≈1995).
Example “Textbook” Description
http://vcell.ndsu.nodak.edu/~christjo/vcell/animationSite/lacOperon/
Our Role
• Finding suitable abstractions for describing computation is our specialty!
• Discovering/proving/checking properties of such descriptions (i.e., programs) is also our specialty!
• Goal:
– Find a mathematical abstraction convenient for describing, reasoning about, and simulating biological systems
• DNA → string over the alphabet {A,C,G,T}
– enables the use of string comparison algorithms
• Cellular Pathways → ?
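As a small illustration of why the DNA-as-string abstraction pays off, the sketch below compares two sequences with Hamming distance. The choice of Hamming distance and the function name `hamming_distance` are assumptions made for illustration; the slides do not name a particular string comparison algorithm.

```python
def hamming_distance(a: str, b: str) -> int:
    """Count the positions at which two equal-length DNA strings differ."""
    alphabet = set("ACGT")
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    if not (set(a) <= alphabet and set(b) <= alphabet):
        raise ValueError("sequences must be strings over {A,C,G,T}")
    # Compare the two strings position by position.
    return sum(x != y for x, y in zip(a, b))
```

Once DNA is a plain string, any such algorithm applies unchanged; the open question on the slide is what plays the same role for cellular pathways.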
Outline
• Why is PL at all related to Biology?
• Previous Abstractions in Biology
• Possible Directions of Work
• PML
• Conclusion
Previous Abstractions
• Chemical kinetic models
– can derive differential equations
– well-studied, with considerable theoretical basis
– variables do not directly correspond with biological entities
– may become difficult to see how multiple equations relate to each other
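To make the kinetic-model view concrete, here is a minimal sketch that integrates mass-action equations for a single enzyme reaction, assuming the scheme E + S ⇌ ES → E + P. The scheme, the forward-Euler integrator, and the names `simulate`, `k_on`, `k_off`, and `k_cat` are illustrative assumptions, not taken from the slides.

```python
def simulate(e, s, es, p, k_on, k_off, k_cat, dt, steps):
    """Forward-Euler integration of mass-action kinetics for E + S <-> ES -> E + P.

    e, s, es, p are concentrations of enzyme, substrate, complex, and product.
    """
    for _ in range(steps):
        # Reaction fluxes, computed from the current concentrations.
        bind = k_on * e * s      # E + S -> ES
        unbind = k_off * es      # ES -> E + S
        cat = k_cat * es         # ES -> E + P
        # One differential equation per species.
        e += dt * (-bind + unbind + cat)
        s += dt * (-bind + unbind)
        es += dt * (bind - unbind - cat)
        p += dt * cat
    return e, s, es, p
```

Note that the state here is four concentrations rather than individual molecules, and each species' equation mixes several reactions, which is the readability concern raised in the bullets above.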
Previous Abstractions
• Pathway Databases (e.g., EcoCyc, KEGG)
– store information in a symbolic form and provide ways to query the database
– behavior of biological entities not directly described
• Petri nets
– directed bipartite multigraph (P,T,E) of places, transitions, and edges; places contain tokens
– place = molecular species, token = molecule, transition = reaction
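The Petri net reading above can be sketched in a few lines. This is a minimal illustration under assumed names (`Transition`, `enabled`, `fire`, and the example species E, S, ES); edge multiplicities of the multigraph are stored as integer weights.

```python
from dataclasses import dataclass


@dataclass
class Transition:
    """A reaction: consumes tokens from input places, produces tokens in output places."""
    name: str
    inputs: dict   # place -> number of tokens consumed (edge multiplicity)
    outputs: dict  # place -> number of tokens produced (edge multiplicity)


def enabled(marking: dict, t: Transition) -> bool:
    """A transition is enabled when every input place holds enough tokens."""
    return all(marking.get(place, 0) >= n for place, n in t.inputs.items())


def fire(marking: dict, t: Transition) -> dict:
    """Return the marking that results from firing t once; the input is not modified."""
    if not enabled(marking, t):
        raise ValueError(f"transition {t.name} is not enabled")
    new = dict(marking)
    for place, n in t.inputs.items():
        new[place] = new.get(place, 0) - n
    for place, n in t.outputs.items():
        new[place] = new.get(place, 0) + n
    return new


# Hypothetical example: one enzyme molecule, two substrate molecules.
marking = {"E": 1, "S": 2, "ES": 0}
bind = Transition("bind", inputs={"E": 1, "S": 1}, outputs={"ES": 1})
```

Firing `bind` once moves one token each out of E and S and one into ES, after which `bind` is no longer enabled because no free enzyme remains.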
Previous Abstractions
• Concurrent computational processes
– each biological entity is a process that may carry some state and interacts with other processes
– each process described by a “program”
– prior proposals based on process algebras, such as the π-calculus [Regev et al. ’01]
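The processes-with-state view can be sketched as follows: each entity is a small state machine, and a reaction happens when one process offers to send on a channel that another offers to receive on. This is only a rough, deterministic imitation of process-algebra synchronization; the names `Process`, `step`, the `release` channel, and the state names are hypothetical, and neither π-calculus nor PML syntax is being reproduced.

```python
from dataclasses import dataclass


@dataclass
class Process:
    """A biological entity: a current state plus the interactions each state offers."""
    name: str
    state: str
    behavior: dict  # state -> list of (channel, "send" | "recv", next_state)


def step(processes):
    """Synchronize the first matching sender/receiver pair; return the channel or None."""
    for sender in processes:
        for chan, direction, next_s in sender.behavior.get(sender.state, []):
            if direction != "send":
                continue
            for receiver in processes:
                if receiver is sender:
                    continue
                for chan_r, dir_r, next_r in receiver.behavior.get(receiver.state, []):
                    if chan_r == chan and dir_r == "recv":
                        # Interaction causes a state change in both participants.
                        sender.state, receiver.state = next_s, next_r
                        return chan
    return None


# Hypothetical enzyme and protein "programs".
enzyme = Process("Enzyme", "free", {
    "free": [("bind_substrate", "send", "bound")],
    "bound": [("release", "send", "free")],
})
protein = Process("Protein", "substrate", {
    "substrate": [("bind_substrate", "recv", "bound")],
    "bound": [("release", "recv", "product")],
})
```

Calling `step([enzyme, protein])` repeatedly binds the pair, then releases the product, then returns `None` because no further interaction is possible. A realistic simulator would choose among enabled interactions stochastically rather than taking the first match.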
Possible Directions of Work
• Biologically-motivated “process calculi”
– finding a suitable machine model to serve as a common basis for describing biological systems
– Cardelli, Danos, Laneve, …
• High-level languages
– find suitable high-level languages to make descriptions closer to informal ones
– [Chang and Sridharan ’03]
• Program analyses, simulation, and other tools
– simulation will likely be insufficient
• Creating models for obtaining results in biology
Modeling in the π-calculus
• The π-calculus is concise and compact, yet powerful [Milner ’90]
– take this as the underlying machine model
– not looking for another machine model
• However, it is far too low-level for direct modeling (ad-hoc structuring)
Informal Graphical Diagrams
[Figure: diagram of Enzyme and Protein with rate constants k, k-1, and kcat; annotations: sites, domains, rules]
PML: Enzyme
[Figure labels: Enzyme, bind_substrate; annotations: parameterized, declared in outer scope, interactions within the complex]
PML: Protein
[Figure labels: Protein, bind_substrate, bind_product]
PML: A Simple System
Larger Models
• Modeled a general description of ER cotranslational translocation
– unclearly or incompletely specified aspects became apparent
• e.g., can the signal sequence and translocon bind without SRP? Yes [Herskovits and Bibi ’00]
• Extended, with minor modifications, to model targeting to the ER membrane
PML: Summary
• Domains
– set of mutually dependent binding sites
– defines at the lowest level the reactions a biological entity can undergo
• Groups
– static structure for controlling namespace
– may represent a large biological entity
• large complex, a system, etc.
• [Compartments]
– special groups that define boundaries
• Semantics defined via a translation to the π-calculus
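One way to picture groups as a static structure for controlling namespace is ordinary block-structured scoping: a name used inside a group is looked up in that group first, then in each enclosing group. The sketch below is not PML syntax or its actual semantics; the class `Group`, the method `resolve`, and the site name `catalyze` are hypothetical and only illustrate the scoping idea.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Group:
    """A named scope that declares binding sites and may sit inside another group."""
    name: str
    parent: Optional["Group"] = None
    sites: set = field(default_factory=set)

    def resolve(self, site: str) -> str:
        """Return the qualified name of a site, searching enclosing groups outward."""
        group = self
        while group is not None:
            if site in group.sites:
                return f"{group.name}.{site}"
            group = group.parent
        raise KeyError(f"site {site!r} is not declared in any enclosing group")


# Hypothetical example: bind_substrate is declared in the outer scope.
system = Group("System", sites={"bind_substrate"})
enzyme = Group("Enzyme", parent=system, sites={"catalyze"})
```

Here `enzyme.resolve("bind_substrate")` finds the declaration in the enclosing `System` group, while `enzyme.resolve("catalyze")` stays local to `Enzyme`.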
PML: Summary
• Benefits
– easier to write and understand because of a more direct biological metaphor
– block structure for controlling namespace and modularity
• Future Work
– naming?
– proximity of molecules
– integrating quantitative information (reaction rates, etc.)
– type-checking PML specifications
– exceptional / higher-level specifications
– graphical and simulation tools
Conclusion
• Systems biology needs a mathematical foundation
– languages for describing concurrent computation seem like a step in the right direction
• Status: all very preliminary
– biologically-motivated process calculi
• BioSPI, BioAmbients, Brane Calculus, …
– high-level languages
• PML
– analyses and tools (emerging)
– creating models for results in biology (emerging)
Conclusion
• Abundance of new challenges for PL
– language design: biologically-motivated operators
– analysis and simulation: dealing with the scale
– …
• How much biology does one need to learn to begin?
Bonus Slides
Compartments
• Critical part of biological pathways
– prevents interactions that would otherwise occur
• Description of the behavior of a molecule should not depend on the compartment
• Regev et al. use “private” channels in the π-calculus for both complexing and compartmentalization
PML: Simple Compartments Example
[Figure labels: MolA, MolB, bind_a, bind_a]
[Figure labels: MolA, MolB, ER, Cytosol, CytERBridge]
[Figure labels: MolB, ER, Cytosol, CytERBridge, MolA]
Semantics of PML
• Defined in terms of the π-calculus via two translations
– from PML to CorePML
• “flattens” compartments, removes bridges
– from CorePML to the π-calculus
Syntax of PML
Example: Cotranslational Translocation
• Ribosome translates mRNA, exposing a signal sequence
• Signal sequence attracts SRP, stopping translation
• SRP receptor (on ER membrane) attracts SRP
• Signal sequence interacts with translocon; SRP disassociates, resuming translation
• Signal peptidase cleaves the signal sequence in the ER lumen; Hsc70 chaperones aid in protein folding