Network integration of heterogeneous data
Lars Juhl Jensen, EMBL Heidelberg
association networks
STRING
STITCH
373 genomes
model organism databases
Ensembl
Genome Reviews
RefSeq
genomic context methods
phylogenetic profiles
Cell
Cellulosomes
Cellulose
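The phylogenetic-profile idea above can be illustrated with a minimal sketch: genes whose presence/absence pattern across genomes is similar are candidates for a functional association. The Jaccard index used here is an assumption chosen for simplicity, not necessarily the measure used in STRING, and the gene names and profiles are hypothetical.

```python
def profile_similarity(a, b):
    """Jaccard similarity between two presence/absence profiles.

    Each profile is a list of 0/1 flags, one per genome, in the same order.
    """
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 0.0


# Hypothetical profiles over six genomes (1 = gene present, 0 = absent).
profiles = {
    "geneA": [1, 1, 0, 0, 1, 0],
    "geneB": [1, 1, 0, 0, 1, 0],
    "geneC": [0, 1, 1, 1, 0, 1],
}

sim_ab = profile_similarity(profiles["geneA"], profiles["geneB"])
sim_ac = profile_similarity(profiles["geneA"], profiles["geneC"])
```

Here geneA and geneB occur in exactly the same genomes and score highest, whereas geneA and geneC share only one genome.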
conserved neighborhood
operons
bidirectional promoters
gene fusion
primary experimental data
expression profiles
GEO (Gene Expression Omnibus)
expression compendia
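One common way to turn expression profiles into association evidence is to correlate the profiles of two genes across the conditions of a compendium. The sketch below uses the Pearson correlation coefficient; that choice and the toy values are assumptions for illustration, not a description of the pipeline behind these slides.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    # A flat profile has no defined correlation; report 0.0 in that case.
    return cov / (sd_x * sd_y) if sd_x and sd_y else 0.0


# Hypothetical expression values over four conditions.
co_regulated = pearson([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
anti_regulated = pearson([1.0, 2.0, 3.0, 4.0], [8.0, 6.0, 4.0, 2.0])
```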
protein interactions
yeast two-hybrid
affinity purification
genetic interactions
synthetic lethality
BioGRID (General Repository for Interaction Datasets)
IntAct
MINT (Molecular Interactions Database)
DIP (Database of Interacting Proteins)
BIND (Biomolecular Interaction Network Database)
HPRD (Human Protein Reference Database)
literature mining
co-mentioning
statistical methods
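Co-mentioning can be sketched as counting how often two names appear in the same document and comparing that count with what independent occurrence would predict. The observed/expected ratio below is one simple statistic assumed for illustration; the documents are hypothetical, and the names are assumed to have been recognized already.

```python
from collections import Counter
from itertools import combinations


def comention_counts(docs):
    """Count per-name and per-pair document frequencies."""
    single, pair = Counter(), Counter()
    for names in docs:
        unique = sorted(set(names))  # count each name once per document
        single.update(unique)
        pair.update(combinations(unique, 2))
    return single, pair


def observed_over_expected(a, b, single, pair, n_docs):
    """Ratio of observed co-mentions to the count expected by chance."""
    expected = single[a] * single[b] / n_docs
    return pair[tuple(sorted((a, b)))] / expected if expected else 0.0


# Hypothetical abstracts, reduced to the gene names recognized in each.
docs = [
    ["CYC1", "HAP1"],
    ["CYC1", "CYC7", "HAP1"],
    ["CYC7"],
    ["HAP1", "CYC1"],
]
single, pair = comention_counts(docs)
ratio = observed_over_expected("CYC1", "HAP1", single, pair, len(docs))
```

A ratio above 1 means the two names are mentioned together more often than chance would suggest.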
NLP (Natural Language Processing)
Gene and protein names
Cue words for entity recognition
Verbs for relation extraction
[nxexpr The expression of [nxgene the cytochrome genes [nxpg CYC1 and CYC7]]] is controlled by [nxpg HAP1]
MEDLINE
SGD (Saccharomyces Genome Database)
The Interactive Fly
OMIM (Online Mendelian Inheritance in Man)
good synonyms list
manual curation
orthographic variation
disambiguation
curated knowledge
complexes
MIPS (Munich Information Center for Protein Sequences)
Gene Ontology
pathways
KEGG (Kyoto Encyclopedia of Genes and Genomes)
Reactome
PID (NCI-Nature Pathway Interaction Database)
STKE (Signal Transduction Knowledge Environment)
variable reliability
raw quality scores
conservation
reproducibility
not comparable
benchmarking
calibrate vs. gold standard
probabilistic scores
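Calibration against a gold standard can be sketched as follows: bin the raw scores from one evidence type and, for each bin, measure the fraction of scored pairs that the gold standard confirms. That fraction then serves as a probabilistic score comparable across evidence types. The binning scheme, the pair names, and the gold-standard set are all hypothetical; a real benchmark would also need a careful definition of negatives.

```python
def calibrate(scored_pairs, gold_positive, bin_edges):
    """Fraction of gold-standard pairs in each raw-score bin.

    Bins are half-open intervals [edge_i, edge_i+1); empty bins yield None.
    """
    n_bins = len(bin_edges) - 1
    totals, hits = [0] * n_bins, [0] * n_bins
    for pair, raw in scored_pairs:
        for i in range(n_bins):
            if bin_edges[i] <= raw < bin_edges[i + 1]:
                totals[i] += 1
                hits[i] += pair in gold_positive
                break
    return [h / t if t else None for h, t in zip(hits, totals)]


# Hypothetical raw scores and gold standard.
scored_pairs = [
    (("a", "b"), 0.10), (("a", "c"), 0.20),
    (("b", "c"), 0.60), (("c", "d"), 0.70),
    (("a", "d"), 0.90), (("b", "d"), 0.95),
]
gold_positive = {("b", "c"), ("a", "d"), ("b", "d")}
calibrated = calibrate(scored_pairs, gold_positive, [0.0, 0.5, 0.8, 1.01])
```

In this toy data the confirmed fraction rises with the raw score, which is the behaviour a useful raw quality score should show.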
combine all evidence
P = 1 - (1 - P1) · (1 - P2) · (1 - P3) …
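The combination rule can be written as a short function: the combined probability is one minus the probability that every individual line of evidence is wrong, which treats the evidence types as independent. The function name and the input values are hypothetical.

```python
def combine_evidence(probabilities):
    """Combine per-evidence probabilities as 1 - prod(1 - P_i)."""
    all_wrong = 1.0
    for p in probabilities:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"probability out of range: {p}")
        all_wrong *= 1.0 - p
    return 1.0 - all_wrong


# Hypothetical scores from three evidence types for one protein pair.
combined = combine_evidence([0.6, 0.5, 0.2])
```

The combined score is never lower than the strongest single score, so adding weak evidence cannot reduce confidence in an association.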
spread over many species
transfer by orthology
two modes
COG mode
protein mode
signaling network
NetworKIN
NetPhorest
phosphoproteomics
mass spectrometry
in vivo phosphosites
kinases are unknown
computational methods
sequence motifs
kinase families
overprediction
context
localization
expression
co-activators
scaffolders
association networks
the idea
NetworKIN
coverage
69 kinases
benchmarking
small-scale validation
ATM phosphorylates Rad50
Cdk1 phosphorylates 53BP1
high-throughput validation
multiple reaction monitoring
the future
more sequence motifs
NetPhorest
data organization
selection
benchmarking
179 kinases
89 SH2 domains
8 PTB domains
upstream signaling
downstream signaling
signaling pathways
Acknowledgments
STRING & STITCH: Christian von Mering, Michael Kuhn, Manuel Stark, Samuel Chaffron, Philippe Julien, Tobias Doerks, Jan Korbel, Berend Snel, Martijn Huynen, Peer Bork
Literature mining: Evangelos Pafilis, Jasmin Saric, Rossitza Ouzounova, Sean O’Donoghue, Isabel Rojas
NetworKIN & NetPhorest: Rune Linding, Martin Lee Miller, Gerard Ostheimer, Francesca Diella, Karen Colwill, Jing Jin, Pavel Metalnikov, Vivian Nguyen, Adrian Pasculescu, Jin Gyoon Park, Leona D. Samson, Nikolaj Blom, Rob Russell, Peer Bork, Søren Brunak, Michael Yaffe, Tony Pawson
http://larsjuhljensen.wordpress.com