Challenges on modeling annotations in the Europeana Sounds project

Challenges on modeling annotations in the Europeana Sounds project | Hugo Manguinhas, Sergiu Gordea, Antoine Isaac, Alessio Piccioli, Giulio Andreini, Francesca Di Donato, Remy Gardien, Maarten Brinkerink | iAnnotate 2016

Transcript of Challenges on modeling annotations in the Europeana Sounds project

Page 1: Challenges on modeling annotations in the Europeana Sounds project

Challenges on modeling annotations in the Europeana Sounds project | Hugo Manguinhas, Sergiu Gordea, Antoine Isaac, Alessio Piccioli, Giulio Andreini, Francesca Di Donato, Remy Gardien, Maarten Brinkerink | iAnnotate 2016

Antoine Isaac
In relation to the previous comment on sl 8, what strikes me on this slide is that it doesn't show the object being annotated. People in the audience may be confused.
Page 2: Challenges on modeling annotations in the Europeana Sounds project

What is Europeana?

CC BY-SA

We aggregate metadata:

• From all EU countries

• 3,500 galleries, libraries, archives and museums

• More than 52M objects

• In about 50 languages

Europeana aggregation infrastructure


The Platform for Europe’s Digital Cultural Heritage

Page 3: Challenges on modeling annotations in the Europeana Sounds project

Why are annotations useful?


For Users, a means to…
• Contribute their knowledge
• Discuss and share their knowledge with others

For Cultural Institutions, a new opportunity to increase the quality of their metadata:
• Improve consistency
• Contribute to a better semantic description, with internal cross-linking and links to the web of data


Page 4: Challenges on modeling annotations in the Europeana Sounds project

The Europeana Sounds project


Europeana Sounds aims to increase the amount of audio content available via Europeana

• also improving geographical and thematic coverage

Beyond aggregation, it improves discovery and use of audio content by enriching metadata through innovative methods


Page 5: Challenges on modeling annotations in the Europeana Sounds project

Annotation Scenarios in Europeana Sounds


A user annotates a Cultural Heritage Object, in particular…
• Information describing the object (i.e. metadata)
• Contextual information (i.e. metadata about Agents, Places, Subjects, …)
• Media resources representing the object

By the following actions:
• Tag with terms from controlled vocabularies
• Complete or correct information
• Favour or moderate annotations made by other users
• Comment and discuss with other users
• Relate objects together


Page 6: Challenges on modeling annotations in the Europeana Sounds project

Crowdsourcing Infrastructure



[Diagram: the crowdsourcing infrastructure, connecting Annotation Providers (TheSession.org + TunePal) and Annotation Clients (HistoryPin.org, Pundit, WITH)]

Page 7: Challenges on modeling annotations in the Europeana Sounds project

Exchanging annotations across platforms


We adopted the W3C Web Annotation Data Model
• Offers a simple model for exchanging annotations across platforms... but flexible enough to support complex scenarios

We are developing a REST API based on the W3C Web Annotation Protocol
• Which developers & Europeana will use for retrieval, creation and search of annotations (see the sketch below)
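As an illustration only (not the documented Europeana API), a client following the W3C Web Annotation Protocol conventions could create an annotation roughly as in the Python sketch below; the container URL and the exact request details are assumptions made for the sketch.

import requests

# Hypothetical annotation container endpoint (an assumption for this sketch)
ANNOTATION_CONTAINER = "https://example.org/annotation/"

annotation = {
    "@context": "http://www.w3.org/ns/anno.jsonld",
    "type": "Annotation",
    "motivation": "tagging",
    "body": "http://dbpedia.org/resource/Brass_instrument",
    "target": "http://data.europeana.eu/item/09102/_UEDIN_214",
}

# The Web Annotation Protocol expects the annotation to be POSTed as JSON-LD
# to an annotation container; the server responds with the stored annotation,
# including its newly minted id.
response = requests.post(
    ANNOTATION_CONTAINER,
    json=annotation,
    headers={"Content-Type": 'application/ld+json; profile="http://www.w3.org/ns/anno.jsonld"'},
)
response.raise_for_status()
print("Created annotation:", response.json().get("id"))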


Page 8: Challenges on modeling annotations in the Europeana Sounds project

Tagging with semantic resources


User scenario: an end-user wishes to tag a Europeana object using a term/resource from a controlled vocabulary


Item as displayed in the Europeana Collections Portal

Antoine Isaac
Actually, re-reading after seeing Maarten's comment on sl 12, I think sl 8, 9 and 10 may be too generic. One doesn't see what kind of objects are being annotated, or what semantic tags are (which vocabulary? Even an example would help, not necessarily all the vocabularies Sounds may be interested in).
Page 9: Challenges on modeling annotations in the Europeana Sounds project

Tagging with semantic resources: The Pundit use case



[Diagram: an oa:Annotation (http://data.europeana.eu/annotation/...) with oa:motivatedBy oa:tagging; its oa:hasBody is the skos:Concept http://dbpedia.org/resource/Brass_instrument, retrieved via the DBpedia API from the available vocabularies/datasets, and its oa:hasTarget is the edm:ProvidedCHO http://data.europeana.eu/item/09102/_UEDIN_214]
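A minimal sketch of how the annotation in this diagram could be serialised in the Web Annotation JSON-LD syntax, written here as a Python dict so it can be reused with the request sketch on slide 7; the typed-body shape is an assumption based on the diagram, not a normative Europeana serialisation.

import json

semantic_tag = {
    "@context": "http://www.w3.org/ns/anno.jsonld",
    "id": "http://data.europeana.eu/annotation/...",  # left elided, as in the diagram
    "type": "Annotation",
    "motivation": "tagging",
    # The body is the semantic resource itself: a skos:Concept from DBpedia
    "body": {
        "id": "http://dbpedia.org/resource/Brass_instrument",
        "type": "skos:Concept",
    },
    # The target is the Cultural Heritage Object being tagged
    "target": "http://data.europeana.eu/item/09102/_UEDIN_214",
}

print(json.dumps(semantic_tag, indent=2))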

Page 10: Challenges on modeling annotations in the Europeana Sounds project

Tagging with semantic resources: Open Questions


… some aspects may significantly influence user experience:

• Should different kinds of semantic resources be displayed in the same way? ...or should they be differentiated by type (i.e. a Place vs. an Agent) or scope (e.g. Rock as a sound genre vs. a physical thing)?

• Which label(s) should be displayed? Should the one that best fits the display settings (i.e. language preferences) be used, and what if no label exists for that language? (see the sketch after this list)

• Should the user annotate with any term from a vocabulary, or only a subset?

• What to do with annotations when a vocabulary is updated by its maintainer?
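One possible answer to the label question, shown purely as an illustration and not as a project decision: pick the label in the user's preferred language and fall back to a configurable default, then to any available language.

def pick_label(pref_labels, preferred_lang, fallback_lang="en"):
    """Pick a display label from a dict mapping language code -> label.

    Falls back to `fallback_lang`, then to any available label,
    and finally to None if the resource has no labels at all.
    """
    for lang in (preferred_lang, fallback_lang):
        if lang in pref_labels:
            return pref_labels[lang]
    return next(iter(pref_labels.values()), None)

# Example with made-up sample data: no Dutch label, so the English one is used
labels = {"en": "Brass instrument", "fr": "Instrument de la famille des cuivres"}
print(pick_label(labels, preferred_lang="nl"))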


Page 11: Challenges on modeling annotations in the Europeana Sounds project

Tagging with semantic resources: Challenges

Client applications must:
• have all the data necessary for feeding the display, and
• have it in a form that they can process uniformly

This means that resources must be:
• Dereferenced
• Translated into a uniform data format

To tackle these challenges we chose the Europeana Data Model:
• Already in use at Europeana
• Reuses existing standards (e.g. SKOS, Dublin Core, WGS84 Geo Positioning)
• Gives support for all contextual resources
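As an illustration of the dereferencing step, the sketch below fetches a semantic resource and reduces it to a small uniform structure (URI plus multilingual prefLabels, roughly in the spirit of SKOS/EDM). The use of the rdflib library and DBpedia content negotiation are assumptions for the example; this is not the project's actual enrichment pipeline.

from rdflib import Graph, URIRef
from rdflib.namespace import RDFS, SKOS

def dereference_concept(uri):
    """Dereference a semantic resource and map it to a uniform dict."""
    g = Graph()
    g.parse(uri)  # relies on content negotiation returning RDF
    subject = URIRef(uri)
    labels = {}
    # Prefer skos:prefLabel, fall back to rdfs:label
    for predicate in (SKOS.prefLabel, RDFS.label):
        for literal in g.objects(subject, predicate):
            lang = getattr(literal, "language", None) or "und"
            labels.setdefault(lang, str(literal))
        if labels:
            break
    return {"id": uri, "prefLabel": labels}

print(dereference_concept("http://dbpedia.org/resource/Brass_instrument"))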


Page 12: Challenges on modeling annotations in the Europeana Sounds project

Tagging with geographical information: The HistoryPin.org use case



The user can specify rough coordinates and a radius

A Europeana Object displayed in HistoryPin.org

Page 13: Challenges on modeling annotations in the Europeana Sounds project

Tagging with geographical information: The HistoryPin.org use case



{ "@context": "http://www.w3.org/ns/anno.jsonld", "id": "http://data.europeana.eu/annotations/historypin/136290", ... [provenance info here], ... "motivation": "tagging", "body": { "@context": <Context for EDM>, "type": "edm:Place", "wgs84_pos:lat": "48.85341", "wgs84_pos:long": "2.3488" }, "target": "http://data.europeana.eu/item/09102/_UEDIN_214"}

Similar to semantic tagging, but using a “virtual” resource

Coordinates expressed using WGS84 Geo Positioning

Page 14: Challenges on modeling annotations in the Europeana Sounds project

Annotating metadata: The Pundit use case


We consider metadata annotations as…
• any annotation that refers to, or asserts a statement about, the information describing an object in order to complete or correct it

Ideally, and like other annotations, they should be
• agnostic to the way they are presented to the user in the interface
• machine readable

So that metadata annotations can
• survive changes to the interface design;
• be easily shared outside the interface in which they were originally created;
• allow other software applications to take further advantage of them


Hugo Manguinhas
I am lost here... I understand that it would be better to have the HP case, but the HP case is not an actual semantic tagging use case... so, what shall I do here?
Maarten Brinkerink
1. I noticed that too, think we can get rid of that right?
2. I think the specific application in the project is a very clear illustration of what the Crowdsourcing Infrastructure/Annotations API allows us to do at the client level.
3. This is not exclusive to HP in my view, it's the general idea of the Crowdsourcing Infrastructure/Annotations API. So IMHO we can still make that point.
And I agree it's a clearer example. So I say go for it ;)
Antoine Isaac
I don't understand. (1) There's still one slide sub-titled 'historypin'. (2) Is there value in doing more about Pundit, while we already have another Pundit presentation in the session? (3) Talking about HP is the opportunity to make the point for a REST API (to which several clients connect), which was one of the ideas that Remy called for... This said, if we're talking about sl 8 and sl 9, it is a clear example, clearer than what the HP example was.
Maarten Brinkerink
Cool!
Hugo Manguinhas
I have replaced the HistoryPin example with the musical instrument crowdsourcing with Pundit
Hugo Manguinhas
My idea was to show the metadata annotation use case... in Pundit/WITH campaign only semantic tagging was used... I could replace the HistoryPin one with the Pundit/WITH use case if you can send me screenshots that I can use.
Antoine Isaac
If Giulio is going to present the use case in his presentation then the slides about the pundit cases in Hugo's presentation can stay quite generic. But I think I see your point. It would be about focusing 'Pundit-related questions' just on the music instrument crowdsourcing? If yes then I'd agree with doing this.
Maarten Brinkerink
Can't we make this more specific (like Historypin)? So focus on the Crowdsourcing Campaign with Pundit/WITH on musical instruments?
Giulio Andreini
Sounds OK for me
Antoine Isaac
sl 13-14 probably can be shortened if the Pundit presentation is going to be given in more detail the same day (perhaps even the same session?)
Giulio Andreini
"Pundit.it" is not correct:please use simply "The Pundit use case".
Antoine Isaac
Actually there could be a way to simplify the points below 'Ideally, and like other annotations, they should be' and 'So that metadata annotations can'. These are generic requirements that should have been stated already somewhere, shouldn't they?
Page 15: Challenges on modeling annotations in the Europeana Sounds project

Annotating metadata: A Proposal



[Diagram of the proposal: an oa:Annotation (http://data.europeana.eu/annotation/...) with oa:motivatedBy oa:describing? (a specific motivation may be needed). Its oa:hasTarget is an oa:SpecificResource (#metadata1) with oa:hasSource the edm:ProvidedCHO http://data.europeana.eu/item/09102/_UEDIN_214 and oa:hasSelector a pundit:MetadataSelector (#statement1) carrying rdf:predicate dcterms:isPartOf; the oa:hasBody carries the correct URI as rdf:value in a Graph. Similar to an rdf:Statement, but following WA guidelines.]
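A rough sketch of how this proposal might look when serialised as JSON-LD (written as a Python dict); how the corrected value, the selector and the motivation are expressed is exactly what the slide marks as open, so every property below is illustrative only, and the pundit:MetadataSelector type is taken from the proposal, not from the Web Annotation vocabulary.

import json

metadata_correction = {
    "@context": "http://www.w3.org/ns/anno.jsonld",
    "id": "http://data.europeana.eu/annotation/...",  # elided, as in the diagram
    "type": "Annotation",
    "motivation": "describing",  # "a specific motivation may be needed"
    "target": {
        "type": "SpecificResource",
        "source": "http://data.europeana.eu/item/09102/_UEDIN_214",
        # Hypothetical selector pointing at one metadata statement of the object
        "selector": {
            "type": "pundit:MetadataSelector",
            "rdf:predicate": "dcterms:isPartOf",
        },
    },
    # The body carries the proposed correct value for that statement
    "body": {
        "rdf:value": "http://example.org/the-correct-uri",  # placeholder value
    },
}

print(json.dumps(metadata_correction, indent=2))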

Page 16: Challenges on modeling annotations in the Europeana Sounds project

Favouring and moderating annotations


As manual per-annotation moderation does not scale well, we wish to encourage a crowd-moderation policy among the end-users:
• Three-strikes-out: if three users report an annotation as being in violation of the terms of use, it will be hidden.

How to differentiate moderation (violations of terms of use) from up- and down-voting ('this is a very good annotation, +1')?
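A minimal sketch of the three-strikes-out rule as a client-side visibility check; the threshold and the stricter display policy mentioned in the comments below are illustrative assumptions, not a specified API behaviour.

def is_visible(report_count, hide_threshold=3):
    """Hide an annotation once it has been reported `hide_threshold` times."""
    return report_count < hide_threshold

# A client could also choose a stricter display policy and grey out
# annotations that already have one or two reports.
for reports in range(5):
    print(reports, "reports ->", "visible" if is_visible(reports) else "hidden")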


Hugo Manguinhas
Since the presentation has grown a lot, I will vote to keep it simple and leave it for a future presentation.
Maarten Brinkerink
I feel points 2 and 3 at least need to be included/in line with this slide.
Maarten Brinkerink
This is what we wrote down:
1. User Generated Annotations are always stored separately from the original metadata records in the Annotations API. There they are related to the metadata object, as harvested by Europeana. This means enrichments resulting from crowdsourcing will never automatically alter the original metadata record as provided by the Data Provider.
2. End-users of the various crowdsourcing platforms connected to the crowdsourcing infrastructure can only create new annotations, or comment on annotations from other end-users. These new annotations are then stored in the Annotations API, in accordance with the first principle. Comments on annotations by other end-users are related to the original annotations in the Annotations API. End-users can select to make their annotations public, or keep them private. Editing or deleting annotations by other end-users is not possible as an end-user.
3. Instead of supporting end-users with the editing or deleting of annotations by other end-users, we support the evaluation (e.g. 'flagging' and 'liking') of annotations from other end-users. This aims to encourage crowd-moderation among the end-users. This will support users with the possibility to flag potentially offensive, libellous or spam enrichments. On the other hand it also aims at allowing users to express their explicit support for an already existing enrichment. The exact labels for and types of evaluation will result from user testing. Part of this user research will also be the question whether end-users want to add an evaluation to their own tags, for instance to express a level of certainty about their own annotation. Similarly to the comments mentioned above, these evaluations will be related to the original annotations in the Annotations API.
4. The target of semantic enrichments is restricted to resources from trusted repositories, in order to counter the spamming of links.
5. Utilization of the annotations that can be retrieved from the Annotations API is left up to the policy of the respe
Hugo Manguinhas
@Maarten, could you check this and add what you feel is missing and makes sense for this slide.
Maarten Brinkerink
Is this in line with the policy we put in the deliverable? I feel that was a bit more elaborate...
Remy Gardien
To explain our view of moderation (focusing on scalability) and to pose the question we are and were also struggling with: how to differentiate moderation from up- and down-voting (liking etc.).
Sergiu Gordea
Because of the three-strikes-out rule, the client could choose not to display annotations with 1-2 reports; the 3+ ones are hidden anyway.
Page 17: Challenges on modeling annotations in the Europeana Sounds project

Conclusion

• Requirements are becoming clearer as we work on more concrete use-cases and validate them with real users

• Expressing cross-platform annotations in a uniform way is a big challenge:
  • The W3C Web Annotation Data Model gives a good interoperable base
  • But not all scenarios are yet covered
  • Need for best practices for specific applications / domains

• Still a lot of work ahead... but we are making progress



Hugo Manguinhas
Is it good like this?
Remy Gardien
Maybe add here that real use-cases are what move us forward the most, as they dictate our requirements.
Page 18: Challenges on modeling annotations in the Europeana Sounds project

http://www.europeanasounds.eu/