The Power of Graphs to Analyze Biological Data
![Page 1: The Power of Graphs to Analyze Biological Data](https://reader038.fdocuments.in/reader038/viewer/2022110121/558c3eb1d8b42abe338b45e7/html5/thumbnails/1.jpg)
GraphConnect
![Page 2: The Power of Graphs to Analyze Biological Data](https://reader038.fdocuments.in/reader038/viewer/2022110121/558c3eb1d8b42abe338b45e7/html5/thumbnails/2.jpg)
the power of graphs to analyze biological data
![Page 3: The Power of Graphs to Analyze Biological Data](https://reader038.fdocuments.in/reader038/viewer/2022110121/558c3eb1d8b42abe338b45e7/html5/thumbnails/3.jpg)
about me
who am i ...
Davy Suvee (@DSUVEE)
➡ big data architect @ datablend - continuum
• provide big data and nosql consultancy
• 5 years of hands-on expertise in the pharma/biotech sector
![Page 4: The Power of Graphs to Analyze Biological Data](https://reader038.fdocuments.in/reader038/viewer/2022110121/558c3eb1d8b42abe338b45e7/html5/thumbnails/4.jpg)
massive data
big data in pharma
full genome sequencing
complex data
biological networks
scalable number crunching platform
visual insights-driven platform
graphs!!
![Page 5: The Power of Graphs to Analyze Biological Data](https://reader038.fdocuments.in/reader038/viewer/2022110121/558c3eb1d8b42abe338b45e7/html5/thumbnails/5.jpg)
big data in pharma (2 specific use cases)
➡ outlier detection platform: neo4j, mongodb/cassandra and gephi
➡ euretos - brain: neo4j, mongodb, solr and prefuse
![Page 6: The Power of Graphs to Analyze Biological Data](https://reader038.fdocuments.in/reader038/viewer/2022110121/558c3eb1d8b42abe338b45e7/html5/thumbnails/6.jpg)
gene expression clustering
➡ oncology data set:
★ 4,800 samples
★ 27,000 genes
➡ Question:
★ for a particular subset of samples, which genes are co-expressed?
![Page 7: The Power of Graphs to Analyze Biological Data](https://reader038.fdocuments.in/reader038/viewer/2022110121/558c3eb1d8b42abe338b45e7/html5/thumbnails/7.jpg)
storing gene expressions (mongodb)
{
  "_id" : { "$oid" : "4f1fb64a1695629dd9d916e3" },
  "sample_name" : "122551hp133a21.cel",
  "genomics_id" : 122551,
  "sample_id" : 343981,
  "donor_id" : 143981,
  "sample_type" : "Tissue",
  "sample_site" : "Ascending colon",
  "pathology_category" : "MALIGNANT",
  "pathology_morphology" : "Adenocarcinoma",
  "pathology_type" : "Primary malignant neoplasm of colon",
  "primary_site" : "Colon",
  "expressions" : [
    { "gene" : "X1_at", "expression" : 5.54217719084415 },
    { "gene" : "X10_at", "expression" : 3.92335121981739 },
    { "gene" : "X100_at", "expression" : 7.81638155662255 },
    { "gene" : "X1000_at", "expression" : 5.44318512260619 },
    …
  ]
}
![Page 8: The Power of Graphs to Analyze Biological Data](https://reader038.fdocuments.in/reader038/viewer/2022110121/558c3eb1d8b42abe338b45e7/html5/thumbnails/8.jpg)
correlating samples (mongodb/map-reduce)
pearson correlation
| x  | y  |
|----|----|
| 43 | 99 |
| 21 | 65 |
| 25 | 79 |
| 42 | 75 |
| 57 | 87 |
| 59 | 81 |

correlation: 0.52
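For reference, the Pearson coefficient of the six (x, y) pairs on this slide can be recomputed in a few lines of plain Python. This is a minimal sketch of the textbook formula, not the map-reduce job from the talk, and the function name `pearson` is an assumption.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# The example values from the slide.
x = [43, 21, 25, 42, 57, 59]
y = [99, 65, 79, 75, 87, 81]
r = pearson(x, y)
print(round(r, 4))
```

For these values the coefficient comes out at roughly 0.53, in line with the value shown on the slide, which appears to be truncated rather than rounded.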
![Page 9: The Power of Graphs to Analyze Biological Data](https://reader038.fdocuments.in/reader038/viewer/2022110121/558c3eb1d8b42abe338b45e7/html5/thumbnails/9.jpg)
co-expression graph (neo4j)
➡ create a node for each sample
➡ if correlation between two samples >= 0.8, create an edge between both nodes
[Figure: sample nodes 122551, 122552 and 122553; a "correlated" edge carries value: 0.86]
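The two rules above (one node per sample, one edge per pair at or above the threshold) can be sketched without a database as a plain adjacency map. This is an illustration only, not the neo4j code from the talk. The function name `build_graph` is hypothetical, the slide does not say which pair carries the 0.86 value, and the other two correlation values are made up.

```python
THRESHOLD = 0.8  # cut-off taken from the slide

# Pairwise correlations between samples. Assigning 0.86 to this particular
# pair is an assumption; the other two values are invented for illustration.
correlations = {
    (122551, 122552): 0.86,
    (122551, 122553): 0.41,
    (122552, 122553): 0.35,
}


def build_graph(sample_ids, correlations, threshold=THRESHOLD):
    """Return {sample: {neighbour: correlation}} for pairs >= threshold."""
    graph = {sample: {} for sample in sample_ids}  # a node for each sample
    for (a, b), value in correlations.items():
        if value >= threshold:
            # Undirected "correlated" edge, stored in both directions.
            graph[a][b] = value
            graph[b][a] = value
    return graph


graph = build_graph([122551, 122552, 122553], correlations)
print(graph)
```

Samples without any sufficiently strong correlation stay in the graph as isolated nodes, which is what makes outliers stand out once the graph is visualised.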
![Page 10: The Power of Graphs to Analyze Biological Data](https://reader038.fdocuments.in/reader038/viewer/2022110121/558c3eb1d8b42abe338b45e7/html5/thumbnails/10.jpg)
co-expression visualisation (gephi)
![Page 11: The Power of Graphs to Analyze Biological Data](https://reader038.fdocuments.in/reader038/viewer/2022110121/558c3eb1d8b42abe338b45e7/html5/thumbnails/11.jpg)
euretos - brain
➡ pubmed: 23 million biomedical articles
• 1300 new ones added every day
• google-like search interface
➡ reading an article ...
• malaria is transferred by mosquitoes
![Page 12: The Power of Graphs to Analyze Biological Data](https://reader038.fdocuments.in/reader038/viewer/2022110121/558c3eb1d8b42abe338b45e7/html5/thumbnails/12.jpg)
euretos - brain
authors references
![Page 13: The Power of Graphs to Analyze Biological Data](https://reader038.fdocuments.in/reader038/viewer/2022110121/558c3eb1d8b42abe338b45e7/html5/thumbnails/13.jpg)
euretos - brain
ooooooh crap ...
![Page 14: The Power of Graphs to Analyze Biological Data](https://reader038.fdocuments.in/reader038/viewer/2022110121/558c3eb1d8b42abe338b45e7/html5/thumbnails/14.jpg)
euretos - brain
➡ nanopub (nanopub.org)
• the smallest unit of publishable information
➡ assertion
• subject: malaria
• predicate: transferred by
• object: mosquito
➡ provenance
• how this came to be (meta-data)
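As a rough illustration of the structure described on this slide, a nanopub can be modelled as an assertion triple plus a provenance record. The class and field names below are assumptions made for this sketch and do not follow the official nanopub schema. The provenance keys are hypothetical as well.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Assertion:
    """A subject / predicate / object triple."""
    subject: str
    predicate: str
    obj: str  # named `obj` to avoid shadowing the built-in `object`


@dataclass(frozen=True)
class Nanopub:
    """An assertion together with meta-data on how it came to be."""
    assertion: Assertion
    provenance: dict = field(default_factory=dict)


# The example assertion from the slide; the provenance entry is hypothetical.
nanopub = Nanopub(
    assertion=Assertion("malaria", "transferred by", "mosquito"),
    provenance={"source": "pubmed article (hypothetical)"},
)
print(nanopub.assertion)
```

Keeping the assertion separate from its provenance means the same triple can be backed by several sources without being duplicated.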
![Page 15: The Power of Graphs to Analyze Biological Data](https://reader038.fdocuments.in/reader038/viewer/2022110121/558c3eb1d8b42abe338b45e7/html5/thumbnails/15.jpg)
euretos - brain
➡ unfortunately, malaria is encoded in various ways ...
[Figure: db1, db2 and db3 encode the concept as "malaria", "P22384" and "AQ879"; all three map to the single concept "malaria"]
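One simple way to picture this problem is a lookup table that maps each (database, identifier) pair to a single canonical concept. The sketch below assumes that the three identifiers on the slide belong to db1, db2 and db3 in the order shown. The slide does not state that pairing explicitly, and the names `SYNONYMS` and `resolve` are hypothetical.

```python
# Hypothetical mapping from (database, local identifier) to one concept.
# Pairing each identifier with db1/db2/db3 in slide order is an assumption.
SYNONYMS = {
    ("db1", "malaria"): "malaria",
    ("db2", "P22384"): "malaria",
    ("db3", "AQ879"): "malaria",
}


def resolve(database, identifier):
    """Return the canonical concept, or None if the identifier is unknown."""
    return SYNONYMS.get((database, identifier))


concepts = {resolve(db, ident) for db, ident in SYNONYMS}
print(concepts)
```

Once every source identifier resolves to the same concept, assertions from different databases end up attached to one node instead of three.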
![Page 16: The Power of Graphs to Analyze Biological Data](https://reader038.fdocuments.in/reader038/viewer/2022110121/558c3eb1d8b42abe338b45e7/html5/thumbnails/16.jpg)
euretos - brain
[Figure: malaria → (transferred by) → mosquito]
![Page 17: The Power of Graphs to Analyze Biological Data](https://reader038.fdocuments.in/reader038/viewer/2022110121/558c3eb1d8b42abe338b45e7/html5/thumbnails/17.jpg)
euretos - brain
➡ brain (http://www.euretos.com/brain)
• exploration and analysis platform
• millions of concepts/triples/nanopubs
• pubmed, uniprot, omim, pubchem, ...
➡ architectural stack
• meta-data is stored in mongodb
• graph in neo4j
• swing interface connecting to rest endpoints
![Page 18: The Power of Graphs to Analyze Biological Data](https://reader038.fdocuments.in/reader038/viewer/2022110121/558c3eb1d8b42abe338b45e7/html5/thumbnails/18.jpg)
brain
![Page 19: The Power of Graphs to Analyze Biological Data](https://reader038.fdocuments.in/reader038/viewer/2022110121/558c3eb1d8b42abe338b45e7/html5/thumbnails/19.jpg)
brain
![Page 20: The Power of Graphs to Analyze Biological Data](https://reader038.fdocuments.in/reader038/viewer/2022110121/558c3eb1d8b42abe338b45e7/html5/thumbnails/20.jpg)
brain
![Page 21: The Power of Graphs to Analyze Biological Data](https://reader038.fdocuments.in/reader038/viewer/2022110121/558c3eb1d8b42abe338b45e7/html5/thumbnails/21.jpg)
brain
![Page 22: The Power of Graphs to Analyze Biological Data](https://reader038.fdocuments.in/reader038/viewer/2022110121/558c3eb1d8b42abe338b45e7/html5/thumbnails/22.jpg)
brain
![Page 23: The Power of Graphs to Analyze Biological Data](https://reader038.fdocuments.in/reader038/viewer/2022110121/558c3eb1d8b42abe338b45e7/html5/thumbnails/23.jpg)
brain
![Page 24: The Power of Graphs to Analyze Biological Data](https://reader038.fdocuments.in/reader038/viewer/2022110121/558c3eb1d8b42abe338b45e7/html5/thumbnails/24.jpg)
brain
![Page 25: The Power of Graphs to Analyze Biological Data](https://reader038.fdocuments.in/reader038/viewer/2022110121/558c3eb1d8b42abe338b45e7/html5/thumbnails/25.jpg)
brain
![Page 26: The Power of Graphs to Analyze Biological Data](https://reader038.fdocuments.in/reader038/viewer/2022110121/558c3eb1d8b42abe338b45e7/html5/thumbnails/26.jpg)
Questions?
![Page 27: The Power of Graphs to Analyze Biological Data](https://reader038.fdocuments.in/reader038/viewer/2022110121/558c3eb1d8b42abe338b45e7/html5/thumbnails/27.jpg)
Follow us
twitter.com/data_blend
www.datablend.be
[email protected] 0499/05.00.89
datablend - continuum