Service Composition for Biomedical Applications
doctoral programme in informatics engineering
october 1st, 2012
supervisor
José Luís Guimarães Oliveira, universidade de aveiro
jury
Artur Manuel Soares da Silva, universidade de aveiro
Víctor Maojo García, universidade politécnica de madrid
Rui Pedro Sanches de Castro Lopes, escola superior de tecnologia e gestão do instituto politécnico de bragança
Francisco José Moreira Couto, universidade de lisboa
Carlos Manuel Azevedo Costa, universidade de aveiro
Pedro Lopes
bioinformatics & computational biology
software engineering
SERVICE COMPOSITION
DECREASED DISEASE RISK
ELEVATED DISEASE RISK
DATA EVOLUTION
NAR database list evolution, from 2004 to 2012
year   new   total
2004   162     548
2005   171     719
2006   139     858
2007   110     968
2008   110    1078
2009    95    1170
2010    58    1230
2011    96    1330
2012    92    1380
DATA EVOLUTION
bioinformatics requirements
computer science developments
new software and hardware
more data
more tools
innovation
NEW STRATEGIES TO IMPROVE AND ADD VALUE TO SERVICE COMPOSITION SCENARIOS
workflow-based strategies for service composition
resource integration approaches for service composition
enhancing service composition with the semantic web and rapid application development (rad)
workflow-based strategies for service composition
SCIENTIFIC WORKFLOWS
data out: result analysis; knowledge exchange
data in: integration perspective
distributed composition control
INTEROPERABILITY CHALLENGE
scientific workflows: workflow & service interoperability; parallelization, security, integration
previous work: dynamicflow; taverna is the de facto workbench for scientific workflows
data sharing: knowledge exchange semantics; collaborative and reproducible research
activity execution: combine and evaluate multiple independent activities
deliver service & workflow execution in real-time web-based environment
deploy a scalable architecture to enable service composition between multiple data & services providers?
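The activity execution requirement above (combine and evaluate multiple independent activities) can be pictured with a small sketch. It is only an illustration under stated assumptions: the two activity functions, their scores, and the averaging rule are hypothetical stand-ins for remote services, not the platform's engine.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict

Activity = Callable[[str], float]

# Hypothetical activities: each scores the same signal independently,
# standing in for a call to a remote service.
def literature_evidence(signal: str) -> float:
    return 0.8 if signal == "drug-x/event-y" else 0.1

def pathway_evidence(signal: str) -> float:
    return 0.6 if signal.startswith("drug-x") else 0.2

def run_activities(signal: str, activities: Dict[str, Activity]) -> Dict[str, float]:
    """Run independent activities concurrently and collect their scores."""
    with ThreadPoolExecutor() as pool:
        futures = {name: pool.submit(fn, signal) for name, fn in activities.items()}
        return {name: future.result() for name, future in futures.items()}

def combine(scores: Dict[str, float]) -> float:
    """Combine the scores with a plain average (an assumption, not the platform's rule)."""
    return sum(scores.values()) / len(scores)

scores = run_activities(
    "drug-x/event-y",
    {"literature": literature_evidence, "pathway": pathway_evidence},
)
overall = combine(scores)
```

Because the activities do not depend on each other, they can be submitted together and only the combination step has to wait for all of them.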
WORKFLOW COMPOSITION ARCHITECTURE
5. CLIENT APPLICATIONS
4. APPLICATION ENGINE
3. DATA MANAGEMENT
2. WORKFLOW ENGINE
1. WORKFLOWS
diagram labels: DataCombination Engine; WorkflowExecutionEngine; Taverna; Hibernate; API; web services
interaction xsd: xml schema for service composition; normalize service input and output; enable autonomous data exchanges
workflow engine: new java-based taverna workflow wrapper; enable distributed service orchestration
open knowledge provider hub
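One way to picture the normalized input and output described above, assuming a far simpler envelope than the actual XSD (which these slides do not reproduce): every service consumes and produces the same message shape, so services can be chained without pairwise adapters. The field names and both services are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

@dataclass
class Message:
    """Hypothetical normalized envelope shared by every service."""
    source: str
    records: List[Dict[str, str]] = field(default_factory=list)

Service = Callable[[Message], Message]

def filter_known(msg: Message) -> Message:
    """Keep only records that name a drug."""
    kept = [r for r in msg.records if r.get("drug")]
    return Message(source="filter_known", records=kept)

def annotate(msg: Message) -> Message:
    """Add a marker field to every record."""
    tagged = [{**r, "checked": "yes"} for r in msg.records]
    return Message(source="annotate", records=tagged)

def compose(*services: Service) -> Service:
    """Chain services; each output is the next service's input."""
    def pipeline(msg: Message) -> Message:
        for service in services:
            msg = service(msg)
        return msg
    return pipeline

workflow = compose(filter_known, annotate)
result = workflow(Message("client", [{"drug": "d1"}, {"drug": ""}]))
```

Since both services share the envelope, adding a third one to the chain requires no change to the existing two.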
HIGHLIGHTS
web-based workspace: always-available collaborative environment
custom on-demand data analysis
a new strategy for advanced service composition that enables collaborative & distributed research!
new interoperability standard: communication language to enable automated data exchanges from multiple providers
ease the creation of distributed service composition workflows
distributed scientific workflows: complex interactions between heterogeneous services
distributed signal substantiation tasks
workflow execution engine: wrap taverna execution online
real-time activity processing
RESULTS: EU-ADR WEB PLATFORM
http://bioinformatics.ua.pt/euadr
automatic filtering and substantiation of drug safety signals. A Bauer-Mehren, EM van Mulligen, P Avillach, MC Carrascosa, B Singh, R Garcia-Serna, Pedro Lopes, José Luís Oliveira, G Diallo, J Mestres, E Ahlberg Helgee, S Boyer, F Sanz, JA Kors, LI Furlong. plos computational biology, march 2012
the eu-adr web platform: delivering advanced pharmacovigilance tools. José Luís Oliveira, Pedro Lopes, Tiago Nunes, David Campos, S Boyer, E Ahlberg, EM Van Mulligen, JA Kors, B Singh, LI Furlong, F Sanz, A Bauer-Mehren, MC Carrascosa, J Mestres, P Avillach, G Diallo, C Diaz, J Van der Lei. pharmacoepidemiology and drug safety [second revision], october 2012
a bioinformatics hub for pharmacovigilance knowledge providers. Pedro Lopes, David Campos, Tiago Nunes, José Luís Oliveira. acm transactions on management information systems [ongoing], november 2012
resource integration approaches for service composition
INTEGRATION CHALLENGE
previous work: scientific workflows ideal for service-service interactions; service composition for resource integration
design a service composition strategy to enable agile integration of enriched human variome knowledge?
genetic dataset aggregation: extract data from distributed lsdbs
genotype-to-phenotype integration: enrich human variome data with connections to multiple data types
content accreditation: promote correct authorship, ownership and attribution
UNDERSTAND CHANGES IN OUR GENETIC SEQUENCE
INTEGRATION STRATEGY
diagram labels: gene list; lsdb list; lsdbs (LOVD, UMD, IDB, other); arabella; feed reader; application engine; api
external data: miscellaneous heterogeneous sources
from gene ontology to protein databases
distributed lsdbs: custom variant readers and web crawling
multiple non-standardized formats
deliver knowledge: web application
variome api
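The aggregation idea above, where each lsdb exposes variants in its own non-standardized format, can be sketched with one small reader per source feeding a shared record type. The two input formats, the field names, and the sample values are invented for illustration; real LOVD or UMD exports differ.

```python
import csv
import io
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Dict, List, Tuple

@dataclass(frozen=True)
class Variant:
    gene: str
    change: str
    source: str

def read_tabular(text: str, source: str) -> List[Variant]:
    """Reader for a hypothetical tab-separated export."""
    rows = csv.DictReader(io.StringIO(text), delimiter="\t")
    return [Variant(row["gene"], row["dna_change"], source) for row in rows]

def read_xml(text: str, source: str) -> List[Variant]:
    """Reader for a hypothetical XML export."""
    root = ET.fromstring(text)
    return [Variant(v.get("gene"), v.findtext("name"), source)
            for v in root.iter("variant")]

def aggregate(*batches: List[Variant]) -> List[Variant]:
    """Merge batches, keeping the first record seen for each (gene, change) pair."""
    seen: Dict[Tuple[str, str], Variant] = {}
    for batch in batches:
        for variant in batch:
            seen.setdefault((variant.gene, variant.change), variant)
    return list(seen.values())

tab_text = "gene\tdna_change\nGENE1\tc.1A>G\nGENE1\tc.5del\n"
xml_text = (
    "<variants>"
    '<variant gene="GENE1"><name>c.1A&gt;G</name></variant>'
    '<variant gene="GENE2"><name>c.7C&gt;T</name></variant>'
    "</variants>"
)
merged = aggregate(read_tabular(tab_text, "lsdb-a"), read_xml(xml_text, "lsdb-b"))
```

Keeping the `source` field on every record is one simple way to preserve attribution when the same variant is reported by several databases.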
INTEGRATION ARCHITECTURE
5. CLIENT APPLICATIONS
4. APPLICATION ENGINE
3. BUILD ENGINE
2. INTEGRATION MIDDLEWARE
1. CONFIGURATION
diagram labels: API; CSV, XML, SQL, REST; build engine
api: rest api for external access, enables service composition scenarios
extensible data model: core (gene + variant) plus extensions; lightweight link-based connections
advanced integration engine: service composition for intelligent lsdb data extraction; variation dataset enrichment
data gathering wrappers: configurable resources
load data from csv, xml, sql and rest sources
flexible configuration: single resource setup file
innovative service composition description schema
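The configurable wrappers described above suggest a dispatch on the declared source type. The sketch below covers CSV, XML and SQL with the standard library only (REST is left out to keep it self-contained and offline). The configuration keys and the sample resources are assumptions, not the platform's setup file format.

```python
import csv
import io
import sqlite3
import xml.etree.ElementTree as ET
from typing import Callable, Dict, List

Config = Dict[str, str]

def load_csv(cfg: Config) -> List[str]:
    """Read one named column from CSV text."""
    return [row[cfg["column"]] for row in csv.DictReader(io.StringIO(cfg["data"]))]

def load_xml(cfg: Config) -> List[str]:
    """Read the text of every element with the configured tag."""
    return [el.text for el in ET.fromstring(cfg["data"]).iter(cfg["tag"])]

def load_sql(cfg: Config) -> List[str]:
    """Run the configured query against an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(cfg["setup"])
        return [row[0] for row in conn.execute(cfg["query"])]
    finally:
        conn.close()

# One wrapper per source type; supporting a new type means adding one entry.
LOADERS: Dict[str, Callable[[Config], List[str]]] = {
    "csv": load_csv,
    "xml": load_xml,
    "sql": load_sql,
}

def load_all(resources: List[Config]) -> Dict[str, List[str]]:
    return {r["name"]: LOADERS[r["type"]](r) for r in resources}

resources = [
    {"name": "genes_csv", "type": "csv", "column": "symbol",
     "data": "symbol,chr\nG1,1\nG2,2\n"},
    {"name": "genes_xml", "type": "xml", "tag": "symbol",
     "data": "<genes><symbol>G3</symbol></genes>"},
    {"name": "genes_sql", "type": "sql",
     "setup": "CREATE TABLE gene(symbol TEXT); INSERT INTO gene VALUES ('G4');",
     "query": "SELECT symbol FROM gene"},
]
loaded = load_all(resources)
```

The point of the design is that the list of resources is data: a new source is a new configuration entry rather than new integration code.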
HIGHLIGHTS
innovative ui: gene mesh and liveview; content accreditation
new service composition methods allow understanding, exploring and connecting human variome knowledge!
variation integration: service composition approach to gather genetics dataset from multiple external resources
extensible model: simplified description and addition of new external service composition actors
interoperability: unique service composition variome api; access to curated collection of genetic variants
RESULTS: WAVE, WEB ANALYSIS OF THE VARIOME
http://bioinformatics.ua.pt/wave
a holistic approach for integrating genomic variation information. Pedro Lopes and José Luís Oliveira. 10th spanish symposium on bioinformatics, malaga, spain, october 2010
an extensible platform for variome data integration. Pedro Lopes and José Luís Oliveira. 10th ieee international conference on information technology and applications in biomedicine, corfu, greece, november 2010
wave: web analysis of the variome. Pedro Lopes, Raymond Dalgleish and José Luís Oliveira. human mutation, march 2011
enhancing service composition with the semantic web and rapid application development (rad)
NEXT-GENERATION BIOMEDICAL APPLICATIONS
the semantic web paradigm: the perfect solution for the natural complexity of the life sciences
previous work: new service composition strategies for interoperability using scientific workflows; new service composition strategies for resource integration
life sciences: complex mesh of data and services; modern demands keep pushing computer science forward
rapid application development: easily-configurable service composition implementation environment
build the next generation of service composition applications faster
enhanced rapid application development: straightforward service composition setup process
design a new semantic web framework to streamline the creation of next generation biomedical applications?
advanced data integration engine: rich service composition description, flexible integration from heterogeneous resources
state-of-the-art interoperability: apis enable service composition for everything and everyone
semantic knowledge management: involve semantic web technologies at all service composition layers
knowledge federation: enable distributed access to knowledge
empower ecosystems: delivery network for custom cross-platform and cross-device applications
FRAMEWORK MODEL
single instance: data integration connectors (CSV, XML, SQL, SPARQL) feed the knowledge base (integration, data in); the api (REST, Java, LinkedData, SPARQL) exposes it (interoperability, data out)
vocabulary terms shown in the diagram: dc:title, rdfs:label, owl:imports, foaf:name
federation: several instances, each with its own connectors, knowledge base and api, joined by a knowledge federation layer
FRAMEWORK ARCHITECTURE
6. CLIENT APPLICATIONS
5. API
4. APPLICATION ENGINE
3. KNOWLEDGE BASE
2. INTEGRATION ENGINE
1. EXTERNAL SOURCES
diagram labels: Java; REST; pubby (LinkedData); Joseki (SPARQL); abstraction engine; CSV, SQL, XML, SPARQL
future-proof interoperability: rest services
sparql endpoint + linkeddata interfaces available by default
java + javascript libraries
streamlined application engine: “semantic web in a box”
straightforward backend deployment with tomcat
semantic knowledge management: mysql-based triplestore; jena-supported methods
advanced integration engine: semantic web translation; new extract-transform-load strategy
flexible service composition configuration: comprehensive connectors & selectors; load data from csv, xml, sql and sparql sources
modern client-side development: ready for any ui framework
rapid application development: quickly build a new service composition application ecosystem
HIGHLIGHTS
a framework to empower the creation of next-generation service composition-based semantic software!
semantic data integration platform: flexible acquisition and translation of data from heterogeneous resources
semantic web and linkeddata interoperability: future-proof interoperability with the most innovative application paradigm; semantic reasoning and inference
federation: rest, sparql and linkeddata apis
one query, multiple knowledge bases
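The "one query, multiple knowledge bases" idea can be illustrated with in-memory stand-ins for two instances. This is a sketch under stated assumptions: the class, the pattern-matching interface, and the `ex:` triples are hypothetical, and a real deployment would go through SPARQL endpoints rather than Python sets.

```python
from typing import Iterable, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]

class KnowledgeBase:
    """Tiny in-memory stand-in for one instance's triplestore."""

    def __init__(self, name: str, triples: Iterable[Triple]) -> None:
        self.name = name
        self.triples: Set[Triple] = set(triples)

    def match(self, s: Optional[str] = None, p: Optional[str] = None,
              o: Optional[str] = None) -> Set[Triple]:
        """Return triples matching the pattern; None acts as a wildcard."""
        return {t for t in self.triples
                if s in (None, t[0]) and p in (None, t[1]) and o in (None, t[2])}

def federated_match(bases: Iterable[KnowledgeBase], s: Optional[str] = None,
                    p: Optional[str] = None, o: Optional[str] = None) -> List[Triple]:
    """Send the same pattern to every knowledge base and merge the answers."""
    results: Set[Triple] = set()
    for base in bases:
        results |= base.match(s, p, o)
    return sorted(results)

kb_a = KnowledgeBase("instance-a", [("ex:gene1", "ex:isAssociatedTo", "ex:protein1")])
kb_b = KnowledgeBase("instance-b", [
    ("ex:gene1", "ex:isAssociatedTo", "ex:disease1"),
    ("ex:gene2", "ex:isAssociatedTo", "ex:protein2"),
])
answers = federated_match([kb_a, kb_b], s="ex:gene1", p="ex:isAssociatedTo")
```

The caller writes a single pattern and receives the union of what each instance knows, which is the behaviour the federation layer is meant to provide.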
RESULTS: COEUS
http://bioinformatics.ua.pt/coeus
towards knowledge federation in biomedical applications. Pedro Lopes and José Luís Oliveira. 7th international conference on semantic systems, graz, austria, october 2011
a semantic web application framework for health systems interoperability. Pedro Lopes and José Luís Oliveira. international workshop on managing interoperability and complexity in health systems, glasgow, scotland, november 2011
coeus: “semantic web in a box” for biomedical applications. Pedro Lopes and José Luís Oliveira. journal of biomedical semantics [second revision], october 2012
coeus: a semantic web application framework. Pedro Lopes and José Luís Oliveira. 4th international semantic web applications & tools for life sciences workshop, london, united kingdom, december 2012
coeus evaluation
LEGACY DISEASECARD
evaluation: perfect benchmark for a new coeus instance
primitive engineering: static integration/navigation protocol; constrained data model
rare diseases research portal: collection of pointers to disease pages
used for academia and clinical research
evolution: imperative to update scientific dataset
new features for users and developers
CREATING A NEW COEUS INSTANCE
setup new application model: re-use existing ontologies
improve omim basic model
configure service composition: define resource descriptions for connectors; specify data selectors
build triplestore: autonomous process
pull data from configured resources into semantic knowledge base
deliver knowledge: use apis to create advanced user interfaces; use apis to access data
# locus entity
:entity_Locus
    :isEntityOf :concept_Ensembl, :concept_EntrezGene, :concept_GeneCards, :concept_HGNC, :concept_MapView, :concept_UCSC ;
    :isIncludedIn :seed_Diseasecard4 ;
    dc:description "Collects Locus entity knowledge."^^xsd:string ;
    dc:title "Locus"^^xsd:string ;
    a :Entity, owl:NamedIndividual ;
    rdfs:label "entity_Locus"^^xsd:string .
# hgnc concept
:concept_HGNC
    :hasEntity :entity_Locus ;
    :hasResource :resource_HGNC ;
    :isExtendedBy :resource_ClinicalTrials, :resource_ENZYME, :resource_Ensembl, :resource_HGNC, :resource_KEGG, :resource_MedlinePlus, :resource_UniProt ;
    dc:description "Concept relating HGNC data."^^xsd:string ;
    dc:title "HGNC"^^xsd:string ;
    a :Concept, owl:NamedIndividual ;
    rdfs:label "concept_hgnc"^^xsd:string .
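The links in the model above can be followed programmatically. The sketch below copies a few triples from the two descriptions and walks from a concept up to its seed. Reading the model as a three-level hierarchy (seed, entity, concept) is an interpretation of the property names, and the helper functions are hypothetical.

```python
from typing import List, Tuple

# A subset of the statements shown in the model description.
TRIPLES: List[Tuple[str, str, str]] = [
    (":entity_Locus", ":isIncludedIn", ":seed_Diseasecard4"),
    (":entity_Locus", ":isEntityOf", ":concept_HGNC"),
    (":entity_Locus", ":isEntityOf", ":concept_Ensembl"),
    (":concept_HGNC", ":hasEntity", ":entity_Locus"),
    (":concept_HGNC", ":hasResource", ":resource_HGNC"),
]

def objects(subject: str, predicate: str) -> List[str]:
    """All objects of (subject, predicate), in sorted order."""
    return sorted(o for s, p, o in TRIPLES if s == subject and p == predicate)

def path_to_seed(concept: str) -> List[str]:
    """Follow :hasEntity and then :isIncludedIn from a concept to its seed."""
    entity = objects(concept, ":hasEntity")[0]
    seed = objects(entity, ":isIncludedIn")[0]
    return [seed, entity, concept]

hierarchy = path_to_seed(":concept_HGNC")
locus_concepts = objects(":entity_Locus", ":isEntityOf")
```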
configure service composition
# hgnc connector
:resource_HGNC
    :endpoint "http://www.genenames.org/cgi-bin/hgnc_downloads.cgi?title=HGNC+output+data&col=gd_hgnc_id&col=gd_app_sym&col=gd_app_name&col=gd_pub_chrom_map&status=Approved&status_opt=1&level=pri&where=gd_app_sym+like+%27#replace#%27&order_by=gd_app_sym_sort&limit=&format=text&submit=submit&.cgifields=&.cgifields=level&.cgifields=chr&.cgifields=status&.cgifields=hgnc_dbtag"^^xsd:string ;
    :extends :concept_HGNC ;
    :extension "rdfs:label"^^xsd:string ;
    :hasKey :csv_HGNC_id ;
    :isResourceOf :concept_HGNC ;
    :loadsFrom :csv_HGNC_id, :csv_HGNC_name ;
    :method "complete"^^xsd:string ;
    :order 3 ;
    dc:description "Resource connecting gene HGNC information."^^xsd:string ;
    dc:publisher "csv"^^xsd:string ;
    dc:title "HGNC"^^xsd:string ;
    a :Resource, owl:NamedIndividual ;
    rdfs:label "resource_hgnc"^^xsd:string .
# hgnc identifier selector
:csv_HGNC_id
    :isKeyOf :resource_HGNC ;
    :loadsFor :resource_HGNC ;
    :property "dc:source|dc:identifier"^^xsd:string ;
    :query "0"^^xsd:string ;
    dc:description "Information for HGNC CSV resource loading: loads HGNC id."^^xsd:string ;
    dc:title "HGNC_id"^^xsd:string ;
    a :CSV, owl:NamedIndividual ;
    rdfs:label "csv_hgnc_id"^^xsd:string .
# hgnc name selector
:csv_HGNC_name
    :loadsFor :resource_HGNC ;
    :property "rdfs:comment|dc:description"^^xsd:string ;
    :query "2"^^xsd:string ;
    dc:description "Information for HGNC CSV resource loading: loads HGNC name."^^xsd:string ;
    dc:title "HGNC_name"^^xsd:string ;
    a :CSV, owl:NamedIndividual ;
    rdfs:label "csv_hgnc_name"^^xsd:string .
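A plausible reading of the connector and selectors above can be sketched as follows. Several points are assumptions, since the slides do not spell them out: `#replace#` is treated as a per-item placeholder in the endpoint, `:query` as a zero-based column index, and the `|` in `:property` as a separator between predicates that all receive the same value. The response row and the example endpoint are made up.

```python
import csv
import io
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]

def fill_endpoint(template: str, value: str) -> str:
    """Substitute the #replace# marker (assumed to be a per-item placeholder)."""
    return template.replace("#replace#", value)

def row_to_triples(subject: str, row: List[str],
                   selectors: List[Dict[str, str]]) -> List[Triple]:
    """Apply each selector to one row and emit one triple per listed predicate."""
    triples: List[Triple] = []
    for selector in selectors:
        value = row[int(selector["query"])]  # :query read as a column index
        for predicate in selector["property"].split("|"):
            triples.append((subject, predicate, value))
    return triples

# Selector settings copied from the configuration shown above.
SELECTORS = [
    {"name": "csv_HGNC_id", "query": "0", "property": "dc:source|dc:identifier"},
    {"name": "csv_HGNC_name", "query": "2", "property": "rdfs:comment|dc:description"},
]

# Made-up tab-separated response; the values are placeholders, not HGNC data.
response = "ID:0001\tGENE1\texample gene name\t1p1\n"
row = next(csv.reader(io.StringIO(response), delimiter="\t"))
triples = row_to_triples("coeus:hgnc_GENE1", row, SELECTORS)
url = fill_endpoint("http://example.org/download?where=sym+like+%27#replace#%27", "GENE1")
```

Under this reading, adding a new column to the knowledge base only requires one more selector entry, which matches the "specify data selectors" step.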
build triplestore
build diagram (levels 0 to 3), entities and their resources: DISEASE (OMIM); GENE (HGNC, Entrez, Ensembl); PROTEIN (UniProt, PDB, InterPro, Prosite); ONTOLOGY (HPO, MeSH, UMLS); DRUG (DrugBank, PharmGKB); LITERATURE (Pubmed)
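The build step can be pictured as processing resources level by level. The sketch assumes that the `:order` value in a resource description gives its position in the build sequence, which the slides do not define. The resource names and their orders below are hypothetical.

```python
from itertools import groupby
from typing import Dict, List, Union

Resource = Dict[str, Union[str, int]]

def build_plan(resources: List[Resource]) -> List[List[str]]:
    """Group resource names by ascending order value; each group is one build level."""
    ordered = sorted(resources, key=lambda r: (r["order"], r["name"]))
    return [
        [str(r["name"]) for r in group]
        for _, group in groupby(ordered, key=lambda r: r["order"])
    ]

# Hypothetical resources; the real instance's orders are not listed in the slides.
resources: List[Resource] = [
    {"name": "resource_d", "order": 2},
    {"name": "resource_a", "order": 0},
    {"name": "resource_c", "order": 1},
    {"name": "resource_b", "order": 1},
]
plan = build_plan(resources)
```

Resources in the same level do not depend on each other in this reading, so each level could be loaded before moving on to the resources that extend it.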
deliver knowledge
# java
pt.ua.bioinformatics.API.getTriple("coeus:hgnc_BRCA2", "p", "o", "xml");
# rest
http://bioinformatics.ua.pt/coeus/api/triple/coeus:hgnc_BRCA2/p/o/csv
# sparql federation
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX diseasome: <http://www4.wiwiss.fu-berlin.de/diseasome/resource/diseasome/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX coeus: <http://bioinformatics.ua.pt/coeus/>
SELECT ?pdb ?mesh
WHERE {
  { SERVICE <http://www4.wiwiss.fu-berlin.de/diseasome/sparql> {
      <http://www4.wiwiss.fu-berlin.de/diseasome/resource/genes/BRCA2> rdfs:label ?label } }
  { SERVICE <http://bioinformatics.ua.pt/coeus/sparql> {
      _:gene dc:title ?label .
      _:gene coeus:isAssociatedTo ?uniprot } }
  { SERVICE <http://bioinformatics.ua.pt/coeus/sparql> {
      ?uniprot coeus:isAssociatedTo ?pdb .
      ?pdb coeus:hasConcept coeus:concept_PDB } }
  { SERVICE <http://bioinformatics.ua.pt/coeus/sparql> {
      ?uniprot coeus:isAssociatedTo ?mesh .
      ?mesh coeus:hasConcept coeus:concept_MeSH } }
}
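The rest call above follows a visible path pattern: subject, predicate, object, then output format. A minimal sketch of building such urls is shown below, without making any network request. The function name is hypothetical, and treating the literal `p` and `o` as wildcard placeholders is an assumption drawn from the examples.

```python
from urllib.parse import quote

BASE = "http://bioinformatics.ua.pt/coeus/api/triple"

def triple_url(subject: str, predicate: str = "p", obj: str = "o",
               fmt: str = "csv", base: str = BASE) -> str:
    """Build a triple-pattern url of the form base/subject/predicate/object/format."""
    # Keep ':' unescaped so prefixed names such as coeus:hgnc_BRCA2 stay readable.
    parts = [quote(part, safe=":") for part in (subject, predicate, obj, fmt)]
    return base + "/" + "/".join(parts)

url_csv = triple_url("coeus:hgnc_BRCA2")
url_xml = triple_url("coeus:hgnc_BRCA2", fmt="xml")
```

With the default arguments, `url_csv` reproduces the rest example from the slide, while `url_xml` requests the same pattern in the format used by the java call.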
RESULTS: THE NEW DISEASECARD
http://bioinformatics.ua.pt/dc4
improved availability: available to researchers through web application; available to developers through default apis
easy setup: simplified resource integration; straightforward client-side application creation
efficient development: rapid application development at its best
reduced implementation effort compared to similar systems
CONCLUSIONS
service composition: pioneering framework for enhanced semantic web-based service composition
next-generation strategies for integration and interoperability
interoperability: new strategies for workflow-based service composition
advanced methods to deliver knowledge
integration: innovative integrative approach to describe service composition
flexible integration engine to compose heterogeneous resources
FUTURE PERSPECTIVES
beyond service composition: software-as-a-service use is increasing
streamlined and lightweight interactions are everywhere
linkeddata and the semantic web: semantic web as the foundation for new software engineering strategies
linkeddata is a growing knowledge network
modern software platforms: growing relevance of efficient content delivery; one knowledge base, multiple cross-platform & cross-device clients
worldwide knowledge networks: more sophisticated knowledge expression technologies
richer, meaningful data are more connected
business value: research and enterprise are intertwined; coeus use goes beyond science
THANK YOU