www.openrisknet.org
OpenRiskNet: Open e-Infrastructure to Support Data Sharing, Knowledge Integration and in silico Analysis and Modelling in Risk Assessment
Project Number 731075
DataCure
Data curation and creation of pre-reasoned datasets and searching
Noffisat Oki, Tim Dudgeon, Marc Jacobs, Danyel Jennen, Thomas Exner
Case Study objective
● Data curation and merging
● Text mining
CYP450 data curation with Squonk
● Merges multiple datasets from ChEMBL into a single set
● Uses ChEMBL identifiers to identify common structures
● Generates a dataset that can be used for machine learning
● See on GitHub
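
The merge step described above can be sketched in a few lines of Python. This is only an illustration of joining several datasets on a shared ChEMBL identifier, not the Squonk workflow itself; the function name, column names, identifiers and values are all hypothetical placeholders.

```python
from collections import defaultdict


def merge_by_chembl_id(datasets):
    """Merge several record lists into one table keyed by ChEMBL identifier.

    `datasets` maps a column name (for example an assay label) to a list of
    (chembl_id, value) pairs. A compound missing from a dataset gets None
    in that column.
    """
    merged = defaultdict(dict)
    for column, records in datasets.items():
        for chembl_id, value in records:
            merged[chembl_id][column] = value
    columns = list(datasets)
    return {
        chembl_id: {col: row.get(col) for col in columns}
        for chembl_id, row in sorted(merged.items())
    }


# Placeholder data: the identifiers and labels are invented for illustration
# and are not real ChEMBL records.
datasets = {
    "cyp3a4_active": [("CHEMBL_A", 0), ("CHEMBL_B", 1)],
    "cyp2d6_active": [("CHEMBL_B", 0), ("CHEMBL_C", 1)],
}

table = merge_by_chembl_id(datasets)
for chembl_id, row in table.items():
    print(chembl_id, row)
```

Each row of the resulting table holds one structure with one column per source dataset, which is the shape most machine learning tools expect as input.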
Data merging via data APIs
OpenAPI + JSON-LD
Subject or object
Predicate
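
To make the subject, predicate and object roles concrete, here is a minimal sketch of how a JSON-LD `@context` can map the plain keys of an API response to predicate IRIs. It is a deliberately simplified reading of a flat document, not a full JSON-LD processor, and the example document, the `example.org` identifier and the `to_triples` helper are assumptions made for illustration.

```python
def to_triples(doc):
    """Turn a flat JSON-LD document into (subject, predicate, object) triples.

    Simplification: only top-level keys with literal values are handled, and
    the context is assumed to map each key directly to an IRI string.
    """
    context = doc.get("@context", {})
    subject = doc["@id"]
    triples = []
    for key, value in doc.items():
        if key.startswith("@"):
            continue  # skip JSON-LD keywords such as @context and @id
        predicate = context.get(key, key)
        triples.append((subject, predicate, value))
    return triples


# Hypothetical response body annotated with a JSON-LD context.
doc = {
    "@context": {
        "name": "http://schema.org/name",
        "identifier": "http://schema.org/identifier",
    },
    "@id": "http://example.org/compound/1",
    "name": "example compound",
    "identifier": "EX-1",
}

for triple in to_triples(doc):
    print(triple)
```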
Finding datasets
Text mining
OpenRiskNet example workflow
Task:
● Identify the concept of acetaminophen (definition, identifiers, synonyms)
● Find all relevant documents in the context of acetaminophen and carcinogenicity
● What are the most relevant statements?
Technology:
● Semantic index of PubMed/PMC (> 20 terminologies)
● Solr index + OLS index + UIMA pipeline
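
The document search step of this workflow could look roughly like the following query against a Solr index. This sketch only builds the request URL for Solr's standard `/select` handler; the host, the core name `pubmed` and the field name `text` are assumptions, since the slides do not describe the actual index schema.

```python
from urllib.parse import urlencode


def build_solr_query(base_url, core, terms, rows=10):
    """Build a Solr /select URL that requires every term to match.

    `core` and the `text` field are hypothetical; adapt them to the
    schema of the index being queried.
    """
    query = " AND ".join(f'text:"{term}"' for term in terms)
    params = {"q": query, "rows": rows, "wt": "json"}
    return f"{base_url}/solr/{core}/select?{urlencode(params)}"


url = build_solr_query(
    "http://localhost:8983",
    "pubmed",
    ["acetaminophen", "carcinogenicity"],
)
print(url)
```

The URL can then be fetched with any HTTP client. Ranking the most relevant statements would need additional steps that are not shown here.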