Towards an understanding of diversity in biological and biomedical systems
Data analysis workshop for massive
sequencing data
Igor Zwir
Department of Computer Science and Artificial Intelligence,
University of Granada, Granada, Spain
Howard Hughes Medical Institute
Yale School of Medicine, New Haven, CT, US
Department of Psychiatry, Washington University School of
Medicine, St. Louis, MO, US
e-mail: [email protected]
Towards an understanding of diversity in
biological and biomedical systems
“Some people enjoy reading papers,
juggling possibilities and formulating ideas,
even if they can’t work a pipette”
(“Reasoning for results”, Nature, Bray, D., 2001)
“Some people enjoy reading papers, juggling
possibilities and formulating ideas, even if
they can’t write a line of a computer program”
(“Reasoning for results”, Groisman Lab, 2007)
“…organisms of the most different sorts are
constructed from the very same battery of
genes. The diversity of life forms results from
small changes in the regulatory systems that
govern expression of these genes.”
François Jacob
In Of flies, mice and men
Salmonella: a Gram-negative
pathogen with a varied lifestyle
[Diagram: Signal (low Mg2+) -> Sensor (PhoQ) -> Regulator (PhoP-PO3) -> Effectors (mgtA and mgtB, Mg2+ transport) -> Response]
Signal transduction cascade by
two-component regulatory systems
System      Signal               Function
ArcA/ArcB   Quinones             Anaerobic respiration
OmpR/EnvZ   Osmolarity changes   Osmoadaptation
NtrB/NtrC   Low nitrogen levels  Nitrogen metabolism
PhoP/PhoQ   Low Mg2+             Virulence, growth in low Mg2+
PmrA/PmrB   Fe3+ and Al3+        Resistance to polymyxin B
SsrA/SpiR   Unknown              Virulence
TtrR/TtrS   Tetrathionate        Anaerobic respiration
Two-component systems regulate physiological
and virulence functions
[Diagram: low Mg2+ -> PhoQ -> PhoP-PO3 -> pmrD -> PmrD -> PmrA-PO3; high Fe3+ -> PmrB -> PmrA-PO3 -> pbgP -> LPS modification]
The Salmonella PmrA/PmrB system
responds to Fe3+ and low Mg2+
[Diagram: the same components as in the Salmonella slide (low Mg2+, PhoQ, PhoP-PO3, pmrD, PmrD, high Fe3+, PmrB, PmrA-PO3, pbgP, LPS modification), annotated with 85.4% and 93.3%]
The E. coli PmrA/PmrB system
responds to Fe3+ but not to low Mg2+
[Diagram: PhoQ and PhoP-PO3, annotated with 85.4% and 93.3%]
(the median amino acid identity between Salmonella and E. coli proteins is 90%)
ugd
85.5%
The Salmonella but not the E. coli ugd gene is
regulated by the PhoP protein
[Diagram: PhoQ, PhoP-PO3, ugd]
The PhoP/PhoQ two-component system
regulates 5% of Salmonella genes
Consensus Motif
Salmonella LT2 & E. coli K12
Single motif vs. a family of PhoP
submotifs
Harari et al., PLoS Computational Biology, 2010
[Figure labels: + Sensitivity, + Specificity; 26 binding sites (BS)]
PhoP submotifs improve BS detection
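
The contrast between a single consensus motif and a family of submotifs can be illustrated with a small sketch. This is not the method of Harari et al.; it assumes log-odds position weight matrices against a uniform background, and the sites, family names, and function names below are made up for illustration. The point it shows is that a site typical of one submotif can score higher against that submotif than against a motif pooled from all sites.

```python
import math

BASES = "ACGT"

def build_pwm(sites, pseudocount=1.0):
    """Build a log-odds position weight matrix from aligned, equal-length sites."""
    pwm = []
    for column in zip(*sites):
        counts = {base: pseudocount for base in BASES}
        for base in column:
            counts[base] += 1
        total = sum(counts.values())
        # Log-odds against a uniform background of 0.25 per base (an assumption).
        pwm.append({b: math.log2((counts[b] / total) / 0.25) for b in BASES})
    return pwm

def score(pwm, seq):
    """Sum the per-position log-odds of seq under one matrix."""
    return sum(col[base] for col, base in zip(pwm, seq))

def best_submotif_score(pwms, seq):
    """Score seq against every submotif and keep the best match."""
    return max(score(pwm, seq) for pwm in pwms)

# Hypothetical toy sites, NOT real PhoP binding sites.
family_a = ["TGTTTA", "TGTTTA", "TGTTTT"]
family_b = ["GGATCA", "GGATCA", "GGATCG"]

consensus_pwm = build_pwm(family_a + family_b)             # single pooled motif
submotif_pwms = [build_pwm(family_a), build_pwm(family_b)]  # family of submotifs

candidate = "GGATCA"
consensus_score = score(consensus_pwm, candidate)
submotif_score = best_submotif_score(submotif_pwms, candidate)
```

With these toy sites, `submotif_score` exceeds `consensus_score` because pooling the two families dilutes the positions where they differ.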
Genome wide analysis: custom tiling
arrays and ChIP assays
Evolution of submotifs throughout the
Gamma/Enterobacteria
Perez et al., PLoS Genetics, 2009; Harari et al., PLoS Computational Biology, 2010
S01 S05
Information content
Background (HKY85 Model)
PhoP (Halpern-Bruno)
The submotifs and the PhoP protein evolve at
correlated rates
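
As a reminder of what "information content" measures for a set of aligned binding sites, here is a minimal sketch. It assumes a uniform background of 0.25 per base and does not implement the HKY85 or Halpern-Bruno models named on the slide; the function names and example sites are hypothetical.

```python
import math
from collections import Counter

def column_information(column, background=None):
    """Information content (bits) of one alignment column.

    Computed as the relative entropy sum_b p_b * log2(p_b / q_b), where q is
    the background distribution (uniform by default, which is an assumption).
    """
    background = background or {base: 0.25 for base in "ACGT"}
    counts = Counter(column)
    n = len(column)
    return sum((c / n) * math.log2((c / n) / background[base])
               for base, c in counts.items())

def information_profile(sites):
    """Per-position information content for aligned, equal-length sites."""
    return [column_information(col) for col in zip(*sites)]

# Hypothetical toy alignment: position 1 is fully conserved, position 2 is not.
profile = information_profile(["AC", "AG", "AT", "AA"])
```

A fully conserved DNA position carries 2 bits under a uniform background, and a position where all four bases are equally frequent carries 0 bits.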
In vitro affinities correlate well with the top three
families of submotifs
Zwir et al., PNAS, 2005; Zwir et al., Bioinformatics, 2005;
Harari et al., BMC Bioinformatics, 2009
Submotif & distances from the
RNAP binding site
[Figure: distance classes Close, Medium, Remote; labeled values 45% and 21%]
Harari et al., PLoS Computational Biology, 2010
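
The Close/Medium/Remote classes can be thought of as bins over the distance between a PhoP binding site and the RNAP binding site. The sketch below only illustrates that binning; the cutoffs, function names, and example distances are hypothetical, since the slides do not state the actual thresholds.

```python
from collections import Counter

# Hypothetical bin edges in base pairs; the slides do not give the real cutoffs.
CLOSE_MAX_BP = 30
MEDIUM_MAX_BP = 60

def distance_class(distance_bp):
    """Assign a binding-site-to-RNAP distance to one of three classes."""
    if distance_bp <= CLOSE_MAX_BP:
        return "close"
    if distance_bp <= MEDIUM_MAX_BP:
        return "medium"
    return "remote"

def class_fractions(distances_bp):
    """Fraction of promoters falling in each distance class."""
    counts = Counter(distance_class(d) for d in distances_bp)
    total = len(distances_bp)
    return {name: counts[name] / total for name in ("close", "medium", "remote")}

# Made-up distances for four promoters.
fractions = class_fractions([10, 20, 45, 90])
```

Comparing such per-class fractions between two genomes is one simple way to see whether their promoters favor different arrangements.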
Two closely related species show
distinct promoter preferences
[Figure axis: Close, Medium, Remote]
Submotifs & distances can distinguish
Salmonella & E. coli
Two distantly related species show distinct
promoter architectures
PhoP-activated genes are bound and
transcribed at different times and levels
[Figure labels: ancestral, horizontally-acquired]
Predicting gene binding and transcription of
PhoP regulated targets
Summary
TF affinity for its binding sites determines promoter
timing and levels in naked DNA
Binding and transcription in vivo depend on where
the binding sites sit (promoter architectures)
Cis-acting features in the PhoP-activated promoters
determine non-arbitrary organized architectures
Differences in the regulon across distinct
species depend on the evolution of the binding sites
and promoter architectures
Two paradigms: multiple genes with small
effect, or few genes with large effect
London Metro Boston Metro
de Vries, Nature Medicine, 2009
Phenotypic-genotypic relations describe a risk
surface of Schizophrenia
0.1% of the population affected
Multigenic disease
Non-genetic contributions
Risk: monozygotic twins 50%, dizygotic twins 15%.
Gottesman II, Gould TD. Am J Psychiatry, 2003
R10: 11 affected, 6 relatives
R19: 6 affected, 1 relative
Trios (affected, relatives and controls)
70 clinical attributes
Cognitive
Motor
Behavioral
Structural
SNP chips
Phenotype clusters (axis: Subjects)
Genotype clusters (axis: Subjects)
Uncovering genotype-phenotype relations by
independently clustering both domains
[Figure scale: 1E-10 to 0.01]
Identifying significant genotype-phenotype
relations among inter-domain clusters
Romero-Zaliz et al., Nucleic Acids Research, 2008; Romero-Zaliz et al., IEEE Trans. on
Evol. Computation, 2008; de Erausquin et al., Mol. Psych., in press
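
One way to score whether a phenotype cluster and a genotype cluster share more subjects than expected by chance is a hypergeometric tail test. The slides do not say which statistic was used, so this is an assumption made for illustration; the function names and subject IDs are hypothetical.

```python
from math import comb

def hypergeom_tail(population, successes, draws, observed):
    """P(X >= observed) for X ~ Hypergeometric(population, successes, draws)."""
    upper = min(successes, draws)
    favourable = sum(
        comb(successes, k) * comb(population - successes, draws - k)
        for k in range(observed, upper + 1)
    )
    return favourable / comb(population, draws)

def relation_p_value(phenotype_cluster, genotype_cluster, all_subjects):
    """Chance probability of an overlap at least as large as the one observed."""
    overlap = len(phenotype_cluster & genotype_cluster)
    return hypergeom_tail(
        len(all_subjects), len(phenotype_cluster), len(genotype_cluster), overlap
    )

# Hypothetical subject IDs: 20 subjects, two clusters of 5 that coincide exactly.
subjects = set(range(20))
p_identical = relation_p_value(set(range(5)), set(range(5)), subjects)
# Two disjoint clusters: an overlap of zero or more is certain.
p_disjoint = relation_p_value(set(range(5)), set(range(5, 10)), subjects)
```

Because the two domains are clustered independently, a small p-value for a phenotype/genotype cluster pair indicates a relation worth keeping; when many pairs are tested, the threshold would also need a multiple-testing correction.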
Phenotype relations ≅ Genotype relations
Optimal (multiobjective/multimodal) relations
are hierarchically organized
First degree relatives have
a genetic predisposition
Relations reflect the risk of Schizophrenia
Relation Risk(%) Affected Relative Control
R22 91 10164
10170
R19 88 10155
10192
R05 61 10184
R06 57 10156
R11 32 10181
R30 28 20148
10127
R29 17 10198 10158
10165
R24 9 10193 10151
10166
R25 1 10157
Validation using an independent set of
subjects
Pathway analysis Process for Neurological Disease
Qualitative significance of learned SNPs
Neuronal cell adhesion pathway derived from
the genotype domain of the relations
Novel pathways: oxidative stress and
epigenetic control of gene expression
Summary
We proposed the first data-driven definition of the Schizophrenia risk
function
Concurrent CGWAS provides a panoramic vision of phenotype-
genotype associations, each of which can be used by traditional
GWAS analysis
Four signaling pathways associated with risk of schizophrenia were
identified
Phenotype-genotype relations were sufficient to reliably predict
subject status
This finding opens the door for early detection and preventative
intervention prior to the onset of psychotic symptoms in
high/intermediate risk populations
Acknowledgements
Eduardo Groisman Lab
Howard Hughes Medical Institute
Dongwoo Shin
Christian Perez
Henry Huang Lab
Dept. of Molecular Microbiology
Washington U.
School of Medicine, USA
Gabriel de Erausquin Lab
Departments of Psychiatry and
Neurology
Harvard Med. School
Dept. of Computer Science and
Artificial Intelligence
University of Granada, Spain
Coral del Val
Pat Anders
Javier Arnedo
Luis Miguel Merino
Rocio Romero-Zaliz (U. de Granada)
Cristina Rubio-Escudero (U. Seville)
Christopher Previti (U. Bergen)
Oscar Harari (Washington U.)
Acknowledgments
Francisco Herrera
Coral del Val
Igor Zwir
Mining for Modeling Lab
Kathleen Marchal
Department of Microbial
and Molecular Systems
Katholieke Universiteit Leuven
Department of Psychiatry,
Washington University in St. Louis
Gabriel de Eraúsquin
Department of Molecular Biology,
Washington University in St. Louis
Henry Huang
DECSAI,
University of Granada
DECSAI,
University of Granada
DECSAI,
University of Granada
DECSAI,
University of Granada
HHMI, Department of Molecular Biology,
Washington University in St. Louis
Eduardo Groisman