myEquivalents, aka a new cross-reference service

Post on 11-Jul-2015


1. myEquivalents, aka Cross Reference Service

Marco Brandizi, EBI, 14 Feb 2013
(Image Source: http://stackoverflow.com/questions/13340232/pythagoras-tree-to-windy-tree)

2. It's not rocket science...

(Image Source: http://www.chinapage.com/space/moon/orbiter.html)

3. … yet...

(Image Source: oh c'mon! 2001: A Space Odyssey)

4. Rationale

References: AE ENA

References: BioSD ENA

References: BioSD AE

5. Rationale

References: AE ENA

References: BioSD AE

References: BioSD ENA

6. So, it's about equivalence relations

7. Why a Centralised Service

Bundle 1:
  BioSD Samples: SAMEA597705
  AE Experiments: E-AFMX-11 (http://www.ebi.ac.uk/arrayexpress/experiments/E-AFMX-11)
  AE Data: E-AFMX-11 (http://www.ebi.ac.uk/arrayexpress/files/E-AFMX-11)
  ENA Sequences: SRR034107

Bundle 2:
  http://dbpedia.org/resource/Barak_h_obama
  http://en.wikipedia.org/wiki/Barack_Obama
  http://www.freebase.com/view/en/barack_obama

Managing equivalence classes is more compact and more efficient
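The compactness claim can be checked with a little arithmetic: linking every pair in an equivalence class of n entities takes n(n-1)/2 undirected links, while a bundle needs only n membership records. A minimal Java sketch follows; the class name, method names and the "service:accession" labels are illustrative and not part of myEquivalents.

```java
import java.util.List;

// Illustrative only: compares the storage cost of explicit pairwise links
// with the cost of one bundle (equivalence class) holding the same entities.
class BundleVsPairs {

    // Number of undirected links needed to connect every pair of n entities.
    static long pairwiseLinks(int n) {
        return (long) n * (n - 1) / 2;
    }

    // A bundle stores one membership record per entity.
    static long bundleRecords(int n) {
        return n;
    }

    public static void main(String[] args) {
        // The four entities of Bundle 1 on the slide (label format is an assumption).
        List<String> bundle1 = List.of(
            "BioSD/Samples:SAMEA597705",
            "AE/Experiments:E-AFMX-11",
            "AE/Data:E-AFMX-11",
            "ENA/Sequences:SRR034107");
        int n = bundle1.size();
        System.out.println("pairwise links: " + pairwiseLinks(n)); // 6
        System.out.println("bundle records: " + bundleRecords(n)); // 4
    }
}
```

The gap widens quickly: the pair count grows quadratically with the size of the class, the bundle linearly.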

8. Why a Centralised Service

Simplifies management
  URI auto-creation
  Links updated independently of their consumers, and only once

Avoids redundancy
  Implicit symmetry and transitivity in the bundles
  Single-point storage and rendering vs one per repository

More efficient
  A specialised service for this is potentially faster, e.g. sameas.org

More features can be added to the basic service
  Multiple access formats and paradigms (e.g., XML, RDF, SPARQL)
  MIRIAM integration

9. The Model

(Diagram) Entities, each identified by Service + Accession, linked by an Entity Mapping into a Bundle (i.e., partition class):
  BioSD/Samples SAMEA597705
  AE/Experiments E-AFMX-11
  AE/Data E-AFMX-11
  ENA/Sequences SRR034107

Repositories (BioSD, ENA, AE): each provides services

Service collection: same accessions, implicit mapping

Service Properties: Title, Description, URI Pattern

Repository Properties: Title, Description, URL, Managing Organization, Logo URL
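The model above can be sketched as three small Java types, with URI auto-creation done by substituting the accession into the service's URI pattern. This is only a sketch under assumptions: the record names, the reduced set of properties and the "$id" placeholder are illustrative choices, not the actual myEquivalents classes.

```java
// Hypothetical model types; only a few of the listed properties are included.
record Repository(String name, String title, String url) {}

record Service(String name, Repository repository, String title, String uriPattern) {}

record Entity(Service service, String accession) {
    // URI auto-creation: put the accession where the pattern has "$id"
    // (the placeholder syntax is an assumption of this sketch).
    String uri() {
        return service.uriPattern().replace("$id", accession);
    }
}

class ModelSketch {
    public static void main(String[] args) {
        Repository ae = new Repository("AE", "ArrayExpress", "http://www.ebi.ac.uk/arrayexpress");
        Service experiments = new Service(
            "AE/Experiments", ae, "ArrayExpress Experiments",
            "http://www.ebi.ac.uk/arrayexpress/experiments/$id");
        Entity entity = new Entity(experiments, "E-AFMX-11");
        // Prints http://www.ebi.ac.uk/arrayexpress/experiments/E-AFMX-11
        System.out.println(entity.uri());
    }
}
```

Keeping the pattern on the service means a repository can change its URL scheme in one place, without touching any stored entity.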

10. API Examples (Java, Mapping)

public interface EntityMappingManager
{
  public void storeMappings ( String ... entityIds );
  public void storeMappingBundle ( String ... entityIds );
  public int deleteMappings ( String ... entityIds );
  public int deleteEntities ( String ... entityIds );
  public EntityMappingSearchResult getMappings ( Boolean wantRawResult, String ... entityIds );
  public EntityMappingSearchResult getMappingsForTarget ( Boolean wantRawResult, String targetServiceName, String entityId );
  public String getMappingsAs ( String outputFormat, Boolean wantRawResult, String ... entityIds );
  public String getMappingsForTargetAs ( String outputFormat, Boolean wantRawResult, String targetServiceName, String entityId );
  public void close ();
}

Multiple access means
  Programmatic API
  Line Commands
  REST Web Service

Multiple data exchange formats
  Java and Java REST (Jersey used, client available)
  XML (The same that comes from REST, mapped via JAXB)
  JSON (future, maybe)
  RDF (future, more later)

Queries via service+accession or URI (in future)
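To give a feel for how bundle storage might behave, here is a hypothetical in-memory stand-in. It is not the real EntityMappingManager: it borrows a few method names, returns plain sets instead of EntityMappingSearchResult, and assumes entity ids written as "service:accession". The merging semantics shown are likewise an assumption.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-in, NOT the real EntityMappingManager implementation.
class InMemoryMappings {
    // Each known entity id points to the (shared) bundle it belongs to.
    private final Map<String, Set<String>> bundleOf = new HashMap<>();

    // Declares all the given entities equivalent, merging any bundles
    // they already belong to (this is where transitivity comes from).
    void storeMappingBundle(String... entityIds) {
        Set<String> merged = new TreeSet<>();
        for (String id : entityIds) {
            merged.add(id);
            Set<String> old = bundleOf.get(id);
            if (old != null) merged.addAll(old);
        }
        for (String id : merged) bundleOf.put(id, merged);
    }

    // Returns a copy of the bundle of the given entity, itself included;
    // an unknown entity yields an empty set.
    Set<String> getMappings(String entityId) {
        Set<String> bundle = bundleOf.get(entityId);
        return bundle == null ? new TreeSet<>() : new TreeSet<>(bundle);
    }

    // Removes the entities from their bundles; returns how many were known.
    int deleteEntities(String... entityIds) {
        int removed = 0;
        for (String id : entityIds) {
            Set<String> bundle = bundleOf.remove(id);
            if (bundle != null) {
                bundle.remove(id);
                removed++;
            }
        }
        return removed;
    }

    public static void main(String[] args) {
        InMemoryMappings m = new InMemoryMappings();
        m.storeMappingBundle("biosd.samples:SAMEA597705", "ae.experiments:E-AFMX-11");
        m.storeMappingBundle("ae.experiments:E-AFMX-11", "ena.sequences:SRR034107");
        // Symmetry and transitivity are implicit: the ENA entry reaches the BioSD one.
        System.out.println(m.getMappings("ena.sequences:SRR034107"));
    }
}
```

The second call never mentions the BioSD sample, yet it ends up in the same bundle as the ENA sequence because both calls share the AE experiment.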

11. API Examples (Java)

12. API Examples (Web Service)

13. Component-based Architecture

Components and their topology configured/instantiated via Spring

Easy to build features like:
  Caching
  Logging
  Layered computations (e.g., add services in the same collection)
  Integration of 3rd-party systems (e.g., MIRIAM, more later)
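One way such layering could look is a chain of decorators around a common interface. In the sketch below the wiring is done by hand to stay self-contained, whereas the slide says the real components are configured via Spring; MappingLookup, CachingLookup and LoggingLookup are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical component contract: entity id -> ids of equivalent entities.
interface MappingLookup {
    Set<String> lookup(String entityId);
}

// Caching layer: remembers answers obtained from the wrapped component.
class CachingLookup implements MappingLookup {
    private final MappingLookup delegate;
    private final Map<String, Set<String>> cache = new HashMap<>();

    CachingLookup(MappingLookup delegate) {
        this.delegate = delegate;
    }

    @Override
    public Set<String> lookup(String entityId) {
        return cache.computeIfAbsent(entityId, delegate::lookup);
    }
}

// Logging layer: reports and counts the calls that reach the wrapped component.
class LoggingLookup implements MappingLookup {
    private final MappingLookup delegate;
    int calls = 0;

    LoggingLookup(MappingLookup delegate) {
        this.delegate = delegate;
    }

    @Override
    public Set<String> lookup(String entityId) {
        calls++;
        System.out.println("lookup(" + entityId + ")");
        return delegate.lookup(entityId);
    }
}

class ArchitectureSketch {
    public static void main(String[] args) {
        // Stand-in back-end that knows a single mapping.
        MappingLookup backend = id -> Set.of("ena.sequences:SRR034107");
        LoggingLookup logging = new LoggingLookup(backend);
        MappingLookup top = new CachingLookup(logging);

        top.lookup("biosd.samples:SAMEA597705");
        top.lookup("biosd.samples:SAMEA597705"); // served from the cache
        System.out.println("back-end calls: " + logging.calls); // 1
    }
}
```

Because every layer implements the same interface, reordering or adding layers is a configuration change rather than a code change, which is what makes container-based wiring attractive here.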

14. Related Work

myEquivalents was inspired by this
Does pretty much what we do
With a very similar internal model
But for URIs only
Code not available
Only available as SaaS, no binary to deploy

15. Related Work

Pair model for URIs is a standard
Equivalence-based model missing
Dual identification mechanism missing

16. Future: RDF, SPARQL, Semantic Web

Dereferenceable URIs, with RDF output
  Keeping support for the accession-based model too

SPARQL, with support for both:
  ?b a mye:Bundle; mye:has-entity ?e1, ?e2, ?e3 (equivalence class model)
  ?entity1 owl:sameAs ?entity2 (mapping pair model)
and for entity containers:
  _:e1 mye:provided-by [ _:s1 a mye:Service dc:title 'BioSD' ]
  adding reasoning over service types could come easily
  e.g. sample-service is-a biomaterial-service

To be implemented with direct translation from Java objects to SPARQL (not just export), e.g., using ARQ in Jena

Support for inference directly in the object model
  faster than a generic reasoner

Support for SPARQL/UPDATE?
  Would allow for using an endpoint straight as back-end

Support for keyword-based search, as in sameas.org
  Requires the addition of attributes (eg, title, description), nothing available at the
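The two RDF views named above can be compared with a small Java sketch that writes one bundle as Turtle text in both styles. It uses plain string building rather than Jena, omits the prefix declarations, and the bundle URI is a made-up example; the mye: terms are the ones shown on the slide.

```java
import java.util.List;

// Illustrative serialiser; not the planned Jena/ARQ-based implementation.
class RdfSketch {

    // Equivalence class model: one bundle node listing all of its entities.
    static String asBundle(String bundleUri, List<String> entityUris) {
        StringBuilder sb = new StringBuilder("<" + bundleUri + "> a mye:Bundle");
        for (String uri : entityUris) {
            sb.append(" ;\n  mye:has-entity <").append(uri).append(">");
        }
        return sb.append(" .\n").toString();
    }

    // Mapping pair model: one owl:sameAs statement per ordered pair.
    static String asSameAsPairs(List<String> entityUris) {
        StringBuilder sb = new StringBuilder();
        for (String a : entityUris) {
            for (String b : entityUris) {
                if (!a.equals(b)) {
                    sb.append("<").append(a).append("> owl:sameAs <").append(b).append("> .\n");
                }
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The URIs of Bundle 2 from the earlier slide.
        List<String> uris = List.of(
            "http://dbpedia.org/resource/Barak_h_obama",
            "http://en.wikipedia.org/wiki/Barack_Obama",
            "http://www.freebase.com/view/en/barack_obama");
        // "http://example.org/bundle/2" is a hypothetical bundle identifier.
        System.out.print(asBundle("http://example.org/bundle/2", uris));
        System.out.print(asSameAsPairs(uris));
    }
}
```

For three entities the bundle view needs four statements (one type plus three memberships) and the fully materialised pair view needs six; a reasoner could derive the pairs from fewer asserted owl:sameAs links, at query-time cost.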

17. Related Work

It is meant to manage entities that share accessions
  e.g., PubMed and CiteXplore

So, not enough for us
But would be great to integrate!

18. Future: MIRIAM and identifiers.org support

Services & Entities

Service Collection

19. Future: MIRIAM and identifiers.org support

Service Collection

Services

Entity

20. Combining MIRIAM and myEquivalents

Uniprot P62158

MIR:00100023 4599080

http://www.ebi.ac.uk/citexplore/citationDetails.do?dataSource=MED&externalId=4599080

HubMed 4599080

http://www.ncbi.nlm.nih.gov/protein/P62158

Mappings stored in myEquivalents

Computed by MIRIAM

Resources imported from MIRIAM

21. Issues: Access Control (on-going)

We assume:
  updates are managed by just a few people, within the same organisation and collaborating team
  most of the data is publicly readable
    except private entities (maybe)

Implies a very simple model, users can have the roles of
  reader, can only read public stuff
    the only thing available to anonymous (i.e., un-authenticated) users
  editor, can change everything (mappings, service descriptions etc)
  admin, can administer users and permissions

Though simple, it's a good base for managing provenance too

Authentication details
  all requests contain user + hash(password), travelling via SSL/HTTPS and via POST
  makes it unnecessary to have complex mechanisms based on shared secrets (eg, OAuth)
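The three-role model is small enough to sketch as a Java enum. The role names come from the slide; everything else is an assumption of this sketch, in particular that an admin also has editor rights and that editors and admins may read private entities.

```java
// Roles as named on the slide; the permission rules below are assumptions.
enum Role {
    READER, EDITOR, ADMIN;

    // Assumed: admins can do whatever editors can.
    boolean canEditMappings() {
        return this == EDITOR || this == ADMIN;
    }

    boolean canManageUsers() {
        return this == ADMIN;
    }
}

class AccessControl {
    // An anonymous (unauthenticated) request carries no role and is treated as a reader.
    static Role effectiveRole(Role authenticatedRole) {
        return authenticatedRole == null ? Role.READER : authenticatedRole;
    }

    // Public entities are readable by everyone; private ones (assumed) by non-readers only.
    static boolean canRead(Role role, boolean entityIsPublic) {
        return entityIsPublic || effectiveRole(role) != Role.READER;
    }

    public static void main(String[] args) {
        System.out.println(canRead(null, true));               // true: anonymous, public entity
        System.out.println(canRead(null, false));              // false: anonymous, private entity
        System.out.println(Role.EDITOR.canEditMappings());     // true
        System.out.println(Role.EDITOR.canManageUsers());      // false
    }
}
```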

22. Issues: Versioning (future?)

That's been ignored so far
  because we're assuming one version ↔ one accession ↔ one URI
  and leaving the versioning fun to the repositories

Must be addressed later

Possible scenario:
  Entities are identified by means of service + acc + version
  New version relations are added (has-version, is-prior-version, has-next-version)
  It is still one URI ↔ one entity at the level of a given version
    URI pattern contains an additional placeholder for the version
  It's up to the myEquivalents clients to either:
    omit the version (ie, the last version is always assumed, even upon version increase)
    specify a given version (requires manual version update)

Possibly: keep history of all versions
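The versioned-URI scenario could look roughly like the Java sketch below. The "$id" and "$ver" placeholders, the example.org pattern and the method name are all hypothetical; the only behaviour taken from the slide is that an omitted version resolves to the latest one.

```java
// Hypothetical resolver for a URI pattern with an extra version placeholder.
class VersionedUris {

    // A null version means "omit the version": the latest known one is used.
    static String resolve(String pattern, String accession, Integer version, int latestVersion) {
        int effective = (version == null) ? latestVersion : version;
        return pattern
            .replace("$id", accession)
            .replace("$ver", Integer.toString(effective));
    }

    public static void main(String[] args) {
        // Made-up pattern; real services would declare their own.
        String pattern = "http://example.org/samples/$id/v$ver";

        // Version omitted: follows the latest version, here 3.
        System.out.println(resolve(pattern, "SAMEA597705", null, 3));
        // Version pinned: stays on 2 until the client updates it by hand.
        System.out.println(resolve(pattern, "SAMEA597705", 2, 3));
    }
}
```

A client that omits the version keeps pointing at current data for free, while a client that pins one gets reproducibility at the price of manual updates.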

23. That's all! Thank You!

Have a look at the code and the wiki (on-going work!):
http://github.com/EBIBioSamples/myequivalents