myEquivalents, aka a new cross-reference service
1. myEquivalents, aka Cross Reference Service
Marco Brandizi, EBI, 14 Feb 2013
(Image Source: http://stackoverflow.com/questions/13340232/pythagoras-tree-to-windy-tree)
2. It's not rocket science...
(Image Source: http://www.chinapage.com/space/moon/orbiter.html)
3. … yet...
(Image Source: oh c'mon! 2001: A Space Odyssey)
4. Rationale
References: AE ENA
References: BioSD ENA
References: BioSD AE
6. So, it's about equivalence relations
7. Why a Centralised Service
Bundle 1:
  - BioSD Samples: SAMEA597705
  - AE Experiments: E-AFMX-11, http://www.ebi.ac.uk/arrayexpress/experiments/E-AFMX-11
  - AE Data: E-AFMX-11, http://www.ebi.ac.uk/arrayexpress/files/E-AFMX-11
  - ENA Sequences: SRR034107
Bundle 2:
  - http://dbpedia.org/resource/Barak_h_obama
  - http://en.wikipedia.org/wiki/Barack_Obama
  - http://www.freebase.com/view/en/barack_obama
Managing equivalence classes is more compact and more efficient
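To put a number on "more compact": a bundle of n entities needs n membership records, while storing the same information as explicit symmetric pairs needs n(n-1) of them. The sketch below is only an illustration; the class name, the helper and the "service:accession" string format are assumptions, not myEquivalents code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper (not part of myEquivalents): expands one bundle into
// the explicit directed pairs that a pair-based store would have to keep.
class BundleVsPairs
{
  static List<String[]> toDirectedPairs ( Set<String> bundle )
  {
    List<String[]> pairs = new ArrayList<> ();
    for ( String a: bundle )
      for ( String b: bundle )
        if ( !a.equals ( b ) ) pairs.add ( new String[] { a, b } );
    return pairs;
  }

  public static void main ( String[] args )
  {
    // The four entities of Bundle 1; the "service:accession" notation is assumed
    Set<String> bundle = new TreeSet<> ( Arrays.asList (
      "BioSD/Samples:SAMEA597705", "AE/Experiments:E-AFMX-11",
      "AE/Data:E-AFMX-11", "ENA/Sequences:SRR034107" ) );
    System.out.println ( "records as a bundle: " + bundle.size () );
    System.out.println ( "records as pairs:    " + toDirectedPairs ( bundle ).size () );
  }
}
```

For the four entities of Bundle 1 this is 4 records against 12, and the gap grows quadratically with the bundle size.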
8. Why a Centralised Service
Simplifies management
  - URI auto-creation
  - Links updated independently of their consumers and once only
Avoids redundancy
  - implicit symmetry and transitivity in the bundles
  - single-point storage and rendering vs one per repository
More efficient
  - A specialised service for this is potentially faster, e.g. sameas.org
More features can be added to the basic service
  - Multiple access formats and paradigms (e.g., XML, RDF, SPARQL)
  - MIRIAM integration
9. The Model
[Diagram; recoverable labels:]
Entity (Service + Accession), e.g.:
  - BioSD/Samples SAMEA597705
  - AE/Experiments E-AFMX-11
  - AE/Data E-AFMX-11
  - ENA/Sequences SRR034107
Entity Mapping
Bundle (i.e., partition class)
Service collection: same accessions, implicit mapping
Repositories (BioSD, ENA, AE): each provides services
Service Properties: Title, Description, URI Pattern
Repository Properties: Title, Description, URL, Managing Organization, Logo URL
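A minimal sketch of how the model above could look as plain Java objects, and how a service's URI pattern could turn an accession into a URI. All class and field names are hypothetical, only a subset of the listed properties is modelled, and the "$id" placeholder is an assumption; the slides do not say what the pattern syntax is.

```java
// Hypothetical classes mirroring the diagram; the real myEquivalents
// classes may be named and structured differently.
class ModelSketch
{
  static final class Repository
  {
    final String name, url;
    Repository ( String name, String url ) { this.name = name; this.url = url; }
  }

  static final class Service
  {
    final String name, title, uriPattern;
    final Repository repository; // the repository that provides this service
    Service ( String name, String title, String uriPattern, Repository repository )
    {
      this.name = name; this.title = title;
      this.uriPattern = uriPattern; this.repository = repository;
    }
  }

  static final class Entity
  {
    final Service service;
    final String accession;
    Entity ( Service service, String accession )
    {
      this.service = service; this.accession = accession;
    }

    // URI auto-creation: "$id" is an assumed placeholder, replaced literally
    String uri () { return service.uriPattern.replace ( "$id", accession ); }
  }

  public static void main ( String[] args )
  {
    Repository ae = new Repository ( "AE", "http://www.ebi.ac.uk/arrayexpress" );
    Service experiments = new Service ( "AE/Experiments", "ArrayExpress experiments",
      "http://www.ebi.ac.uk/arrayexpress/experiments/$id", ae );
    System.out.println ( new Entity ( experiments, "E-AFMX-11" ).uri () );
  }
}
```

With this layout a URL change in a repository would touch one pattern only, which is the "links updated once only" point made earlier.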
10. API Examples (Java, Mapping)
    public interface EntityMappingManager
    {
      public void storeMappings ( String ... entityIds );
      public void storeMappingBundle ( String ... entityIds );
      public int deleteMappings ( String ... entityIds );
      public int deleteEntities ( String ... entityIds );
      public EntityMappingSearchResult getMappings ( Boolean wantRawResult, String ... entityIds );
      public EntityMappingSearchResult getMappingsForTarget ( Boolean wantRawResult, String targetServiceName, String entityId );
      public String getMappingsAs ( String outputFormat, Boolean wantRawResult, String ... entityIds );
      public String getMappingsForTargetAs ( String outputFormat, Boolean wantRawResult, String targetServiceName, String entityId );
      public void close ();
    }
Multiple access means
  - Programmatic API
  - Line Commands
  - REST Web Service
Multiple data exchange formats
  - Java and Java REST (Jersey used, client available)
  - XML (The same that comes from REST, mapped via JAXB)
  - JSON (future, maybe)
  - RDF (future, more later)
Queries via service+accession or URI (in future)
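To illustrate the intended semantics of storeMappings and storeMappingBundle, here is a toy in-memory sketch. It is not the actual implementation: it assumes entity IDs are "service:accession" strings, assumes storeMappings reads its arguments two by two, and returns a plain Set instead of EntityMappingSearchResult.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy stand-in for an EntityMappingManager, for illustration only
class InMemoryMappingSketch
{
  // entity ID -> the bundle (shared Set instance) the entity belongs to
  private final Map<String, Set<String>> bundleOf = new HashMap<> ();

  // Assumed semantics: links (ids[0], ids[1]), (ids[2], ids[3]), ...
  void storeMappings ( String... entityIds )
  {
    if ( entityIds.length % 2 != 0 )
      throw new IllegalArgumentException ( "entity IDs must come in pairs" );
    for ( int i = 0; i < entityIds.length; i += 2 )
      storeMappingBundle ( entityIds [ i ], entityIds [ i + 1 ] );
  }

  // Puts all the entities in one bundle, merging the bundles they were in
  void storeMappingBundle ( String... entityIds )
  {
    Set<String> merged = new TreeSet<> ();
    for ( String id: entityIds )
    {
      merged.add ( id );
      Set<String> old = bundleOf.get ( id );
      if ( old != null ) merged.addAll ( old );
    }
    for ( String id: merged ) bundleOf.put ( id, merged );
  }

  // Entities equivalent to entityId (entityId itself excluded)
  Set<String> getMappings ( String entityId )
  {
    Set<String> result = new TreeSet<> ( bundleOf.getOrDefault ( entityId, new TreeSet<> () ) );
    result.remove ( entityId );
    return result;
  }

  public static void main ( String[] args )
  {
    InMemoryMappingSketch mgr = new InMemoryMappingSketch ();
    mgr.storeMappings (
      "BioSD/Samples:SAMEA597705", "AE/Experiments:E-AFMX-11",
      "AE/Experiments:E-AFMX-11", "ENA/Sequences:SRR034107" );
    // Transitivity comes for free: the ENA entity is reachable from BioSD
    System.out.println ( mgr.getMappings ( "BioSD/Samples:SAMEA597705" ) );
  }
}
```

Because a new link merges whole bundles, symmetry and transitivity never need to be stored explicitly.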
11. API Examples (Java)
12. API Examples (Web Service)
13. Component-based Architecture
Components and their topology configured/instantiated via Spring
Easy to build features like:
  - Caching
  - Logging
  - Layered computations (e.g., add services in the same collection)
  - Integration of 3rd-party systems (e.g., MIRIAM, more later)
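The layering idea can be sketched without Spring as plain decorators around one interface; in the architecture described above, a Spring configuration would do the wiring instead of the hand-written constructor calls. The MappingLookup interface and both wrappers are hypothetical names used for illustration only.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

class LayeredComponentsSketch
{
  // Hypothetical narrow view of a mapping component
  interface MappingLookup
  {
    Set<String> getMappings ( String entityId );
  }

  // Caching layer: asks the wrapped component only on a cache miss
  static final class CachingLookup implements MappingLookup
  {
    private final MappingLookup delegate;
    private final Map<String, Set<String>> cache = new HashMap<> ();

    CachingLookup ( MappingLookup delegate ) { this.delegate = delegate; }

    public Set<String> getMappings ( String entityId )
    {
      Set<String> result = cache.get ( entityId );
      if ( result == null )
      {
        result = delegate.getMappings ( entityId );
        cache.put ( entityId, result );
      }
      return result;
    }
  }

  // Logging layer: reports each call, then forwards it unchanged
  static final class LoggingLookup implements MappingLookup
  {
    private final MappingLookup delegate;

    LoggingLookup ( MappingLookup delegate ) { this.delegate = delegate; }

    public Set<String> getMappings ( String entityId )
    {
      System.err.println ( "getMappings ( " + entityId + " )" );
      return delegate.getMappings ( entityId );
    }
  }

  public static void main ( String[] args )
  {
    MappingLookup base = id -> Collections.singleton ( "AE/Experiments:E-AFMX-11" );
    // Topology: cache in front of logging in front of the base component
    MappingLookup stack = new CachingLookup ( new LoggingLookup ( base ) );
    stack.getMappings ( "BioSD/Samples:SAMEA597705" );
    stack.getMappings ( "BioSD/Samples:SAMEA597705" ); // served by the cache, not logged
  }
}
```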
14. Related Work
myEquivalents is inspired by this
  - Does pretty much what we do
  - With a very similar internal model
  - But for URIs only
  - Code not available
  - Only available as SaaS, no binary to deploy
15. Related Work
  - Pair model for URIs is a standard
  - Equivalence-based model missing
  - Dual identification mechanism missing
16. Future: RDF, SPARQL, Semantic Web
Dereferenceable URIs, with RDF output
  - Keeping support for the accession-based model too
SPARQL, with support for both:
  - ?b a mye:Bundle; mye:has-entity ?e1, ?e2, ?e3 (equivalence class model)
  - ?entity1 owl:sameAs ?entity2 (mapping pair model)
and for entity containers:
  - _:e1 mye:provided-by [ a mye:Service; dc:title 'BioSD' ]
  - adding reasoning over service types could come easily, e.g. sample-service is-a biomaterial-service
To be implemented with direct translation from Java objects to SPARQL (not just export), e.g., using ARQ in Jena
Support for inference directly in the object model
  - faster than a generic reasoner
Support for SPARQL/UPDATE?
  - Would allow for using an endpoint straight as back-end
Support for keyword-based search, as in sameas.org
  - Requires the addition of attributes (e.g., title, description), nothing available at the moment
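As a rough illustration of the mapping pair model, the sketch below renders one bundle of URIs as owl:sameAs statements in N-Triples syntax, using plain string building. It is not the planned implementation (the slides mention Jena/ARQ for that); the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical exporter: one owl:sameAs statement per unordered pair of URIs
class SameAsExportSketch
{
  static final String OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs";

  static List<String> toNTriples ( List<String> bundleUris )
  {
    List<String> lines = new ArrayList<> ();
    for ( int i = 0; i < bundleUris.size (); i++ )
      for ( int j = i + 1; j < bundleUris.size (); j++ )
        lines.add ( "<" + bundleUris.get ( i ) + "> <" + OWL_SAME_AS + "> <"
          + bundleUris.get ( j ) + "> ." );
    return lines;
  }

  public static void main ( String[] args )
  {
    List<String> bundle = Arrays.asList (
      "http://en.wikipedia.org/wiki/Barack_Obama",
      "http://www.freebase.com/view/en/barack_obama" );
    toNTriples ( bundle ).forEach ( System.out::println );
  }
}
```

Only one direction per pair is emitted here, on the assumption that a consumer applies the symmetry of owl:sameAs itself.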
17. Related Work
It manages entities that share accessions
  - e.g., PubMed and CiteXplore
So, not enough for us
  - But would be great to integrate!
18. Future: MIRIAM and identifiers.org support
[Screenshot; recoverable labels: Services & Entities, Service Collection]
19. Future: MIRIAM and identifiers.org support
[Screenshot; recoverable labels: Service Collection, Services, Entity]
20. Combining MIRIAM and myEquivalents
[Diagram; recoverable nodes and labels:]
  - Uniprot P62158
  - http://www.ncbi.nlm.nih.gov/protein/P62158
  - MIR:00100023, 4599080
  - http://www.ebi.ac.uk/citexplore/citationDetails.do?dataSource=MED&externalId=4599080
  - HubMed 4599080
  - Labels: Mappings stored in myEquivalents; Computed by MIRIAM; Resources imported from MIRIAM
21. Issues: Access Control (on-going)
We assume:
  - updates are managed by just a few people, within the same organisation and collaborating team
  - most of the data is publicly readable
    - except private entities (maybe)
Implies a very simple model, users can have the roles of
  - reader, can only read public stuff
    - the only role available to anonymous (i.e., un-authenticated) users
  - editor, can change everything (mappings, service descriptions etc)
  - admin, can administrate users and permissions
Though simple, it's a good base for managing provenance too
Authentication details
  - all requests contain user + hash(password) and travel via SSL/HTTPS and via POST
  - makes it unnecessary to have complex mechanisms based on shared secret (e.g., OAuth)
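The three-role model is small enough to fit in an enum. This is a sketch under assumptions: the names are hypothetical, and it assumes that editors and admins may read private entities and that admins may also edit, which the slide does not state explicitly.

```java
class AccessControlSketch
{
  enum Role
  {
    READER, EDITOR, ADMIN;

    // Readers see public entities only (assumption: the other roles see all)
    boolean canRead ( boolean entityIsPublic )
    {
      return entityIsPublic || this != READER;
    }

    // Mappings, service descriptions, etc. (assumption: admins can edit too)
    boolean canEdit () { return this == EDITOR || this == ADMIN; }

    boolean canManageUsers () { return this == ADMIN; }
  }

  // Anonymous (un-authenticated) requests carry no role and fall back to READER
  static Role effectiveRole ( Role authenticatedRole )
  {
    return authenticatedRole == null ? Role.READER : authenticatedRole;
  }

  public static void main ( String[] args )
  {
    Role anonymous = effectiveRole ( null );
    System.out.println ( "anonymous reads public:  " + anonymous.canRead ( true ) );
    System.out.println ( "anonymous reads private: " + anonymous.canRead ( false ) );
    System.out.println ( "editor edits:            " + Role.EDITOR.canEdit () );
  }
}
```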
22. Issues: Versioning (future?)
That's been ignored so far
  - because we're assuming one version ↔ one accession ↔ one URI
  - and leaving versioning fun to the repositories
Must be addressed later
Possible scenario:
  - Entities are identified by means of service + acc + version
  - New version relations are added (has-version, is-prior-version, has-next-version)
  - It is still one URI ↔ one entity at the level of a given version
    - URI pattern contains an additional placeholder for the version
  - It's up to the myEquivalents clients to either:
    - omit the version (i.e., the last version is always assumed, even upon version increase)
    - specify a given version (requires manual version update)
  - Possibly: keep history of all versions
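The "possible scenario" could look like the sketch below: a URI pattern with one more placeholder, where an omitted version resolves to the latest known one. Everything here is hypothetical, including the "$id" and "$ver" placeholders, the integer version numbers and the example.org pattern, since versioning is not designed yet.

```java
import java.util.Arrays;
import java.util.NavigableSet;
import java.util.TreeSet;

class VersionedUriSketch
{
  // version == null means "omit the version": the last known one is used
  static String resolve ( String uriPattern, String accession, Integer version,
    NavigableSet<Integer> knownVersions )
  {
    int ver = version != null ? version : knownVersions.last ();
    return uriPattern
      .replace ( "$id", accession )
      .replace ( "$ver", String.valueOf ( ver ) );
  }

  public static void main ( String[] args )
  {
    String pattern = "http://example.org/samples/$id/v$ver"; // made-up pattern
    NavigableSet<Integer> versions = new TreeSet<> ( Arrays.asList ( 1, 2, 3 ) );
    System.out.println ( resolve ( pattern, "SAMEA597705", null, versions ) );
    System.out.println ( resolve ( pattern, "SAMEA597705", 2, versions ) );
  }
}
```

A client that omits the version follows new releases automatically, while one that pins a version must update it by hand, which is the trade-off listed above.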
23. That's all! Thank You!
Have a look at the code and the wiki (on-going work!):
http://github.com/EBIBioSamples/myequivalents