Download - Research and Education Space: what are we going to be tangled up with?

Transcript
Page 1: Research and Education Space: what are we going to be tangled up with?

Research and Education Space: what are we going to be tangled up with?

Chiara Del Vescovo & Alex Tucker

Page 2: Research and Education Space: what are we going to be tangled up with?

Collect | Connect | Create• Aimed at teachers, researchers, pupils

• Collect: make all the relevant resources available and discoverable

• Connect: find meaningful interrelations between these resources

• Create: knowledge and tools to make use of these resources

Page 3: Research and Education Space: what are we going to be tangled up with?

• A LOD platform to collect, index, and organise metadata about publicly-held archives (and more) !

!

!

!

!

• Challenges ahead! (some solved by making use of LOD)

Acropolis

Core Platform: “Acropolis”

Project RES: Technical Approach

1The crawler fetches data via HTTP from publishedsources. Once retrieved, it is indexed by the full-textstore and passed to the aggregation engine for evaluation.

2

The results of the aggregation engine's evaluation processare stored in the aggregate store, which contains minimalbrowse information and information about the similarity ofentities.

3

The public face of the core platform is an extremely basicbrowsing interface (which presents the data in tabular formto aid application developers), and read-write RESTful APIs.

4Applications may use the APIs to locate information aboutaggregated entities, and also to store annotations and activitydata.

5Each component employs standard protocols and formats.For example, we can make use of any capable quad-storeas our aggregate store.

Linked data

crawlerAnansi Aggregation

engineSpindle

Full-text store

Aggregate store

Minimal browse interface &

APIs

Quilt

Activity storedigitised

resources and their

RDF metadata

Acropolis

User

crawls and indexes

asks for relevant

resources

Page 4: Research and Education Space: what are we going to be tangled up with?

• A LOD platform to collect, index, and organise metadata about publicly-held archives (and more) !

!

!

!

!

• Challenges ahead! (some solved by making use of LOD)

Acropolis

digitised resources and their

RDF metadata

Acropolis

User

crawls and indexes

asks for relevant

resources

exploits and

“informs”

Core Platform: “Acropolis”

Project RES: Technical Approach

1The crawler fetches data via HTTP from publishedsources. Once retrieved, it is indexed by the full-textstore and passed to the aggregation engine for evaluation.

2

The results of the aggregation engine's evaluation processare stored in the aggregate store, which contains minimalbrowse information and information about the similarity ofentities.

3

The public face of the core platform is an extremely basicbrowsing interface (which presents the data in tabular formto aid application developers), and read-write RESTful APIs.

4Applications may use the APIs to locate information aboutaggregated entities, and also to store annotations and activitydata.

5Each component employs standard protocols and formats.For example, we can make use of any capable quad-storeas our aggregate store.

Linked data

crawlerAnansi Aggregation

engineSpindle

Full-text store

Aggregate store

Minimal browse interface &

APIs

Quilt

Activity store

• A LOD platform to collect, index, and organise metadata about publicly-held archives (and more) !

!

!

!

!

• Challenges ahead! (some solved by making use of LOD)

Page 5: Research and Education Space: what are we going to be tangled up with?

1. Which metadata?• Currently, resources metadata mostly oriented

towards “physical proximity” i.e., indexes reflect similarity of author’s surname, broad subject, format, media, etc.

• Heterogeneous platforms and data models incompatibility, transformations needed

• Even when RDF is used, there’s a proliferation of terms, vocabularies, formats adopted little (if any) validation

Page 6: Research and Education Space: what are we going to be tangled up with?

2. Linking

• Systems that do not use RDF do not allow collection holders to express their knowledge as they wish underspecified knowledge

• Even when RDF is used, information often provided as literals rather than links to URIs ad hoc solutions unavailable in a machine-readable format

Page 7: Research and Education Space: what are we going to be tangled up with?

3. Usability• Search quality: efficiency, precision, recall

• Reliability

• Lack of toolsdevelopers have little contact with collection holders

• Licensing issuesresources licensing (not always explicit)metadata licensingusers need to be aware of what that mean(note that in educations things are slightly easier - blanket licensing etc.)

Page 8: Research and Education Space: what are we going to be tangled up with?

How does RES help?1. “Which metadata” issue:

• data model oriented towards the content • technical support in the generation of RDF metadata • controlled vocabulary

(Acropolis will focus on a selection of predicates and their equivalent terms) 2. “Linking”

• recommend existing relevant vocabularies or datasets to link to • strong recommendation to link to resources whenever possible

3. “Usability” • Resources are indexed in Acropolis to be efficiently retrieved • Enabling users to make use of stable URIs, and to coreference to equivalent resources • Connected Studio • Requirement: both individual resources and their metadata need to be explicitly

licensed, and tools developed on top of Acropolis should take into account this info

• BBC - Jisc - BUFVC: Authority, Motivation

Page 9: Research and Education Space: what are we going to be tangled up with?

How to get involved?

• for collection holders: get in touch!

• for developers: look here!http://acropolis.org.uk

• for both:http://bbcarchdev.github.io/inside-acropolis/

Page 10: Research and Education Space: what are we going to be tangled up with?

What are we missing?

• … we welcome any experience / suggestion!

• perhaps collections? (please let us know!)

[email protected]

[email protected]