Challenges on modeling annotations in the Europeana Sounds project

Challenges on modeling annotations in the Europeana Sounds project | Hugo Manguinhas, Sergiu Gordea, Antoine Isaac, Alessio Piccioli, Giulio Andreini, Francesca Di Donato, Remy Gardien, Maarten Brinkerink | iAnnotate 2016

Transcript of Challenges on modeling annotations in the Europeana Sounds project

Page 1: Challenges on modeling annotations in the Europeana Sounds project

Challenges on modeling annotations in the Europeana Sounds project | Hugo Manguinhas, Sergiu Gordea, Antoine Isaac, Alessio Piccioli, Giulio Andreini, Francesca Di Donato, Remy Gardien, Maarten Brinkerink | iAnnotate 2016

Antoine Isaac
In relation to the previous comment on sl 8, what strikes me on this slide is that it doesn't show the object being annotated. People in the audience may be confused.
Page 2: Challenges on modeling annotations in the Europeana Sounds project

What is Europeana?

CC BY-SA

We aggregate metadata:

• From all EU countries

• 3,500 galleries, libraries, archives and museums

• More than 52M objects

• In about 50 languages

Europeana aggregation infrastructure


The Platform for Europe’s Digital Cultural Heritage

Page 3: Challenges on modeling annotations in the Europeana Sounds project

Why are annotations useful?


For Users, a means to…
• Contribute their knowledge
• Discuss and share their knowledge with others

For Cultural Institutions, a new opportunity to increase the quality of their metadata:
• Improve consistency
• Contribute to a better semantic description, with internal cross-linking and links to the web of data


Page 4: Challenges on modeling annotations in the Europeana Sounds project

The Europeana Sounds project


Europeana Sounds aims to increase the amount of audio content available via Europeana

• also improving geographical and thematic coverage

Beyond aggregation, it improves discovery and use of audio content by enriching metadata through innovative methods


Page 5: Challenges on modeling annotations in the Europeana Sounds project

Annotation Scenarios in Europeana Sounds


A user annotates a Cultural Heritage Object, in particular…
• Information describing the object (i.e. metadata)
• Contextual information (i.e. metadata about Agents, Places, Subjects, …)
• Media resources representing the object

By the following actions:
• Tag with terms from controlled vocabularies
• Complete or correct information
• Favour or moderate annotations made by other users
• Comment and discuss with other users
• Relate objects together


Page 6: Challenges on modeling annotations in the Europeana Sounds project

Crowdsourcing Infrastructure



[Diagram: the crowdsourcing infrastructure, connecting Annotation Providers (TheSession.org + TunePal) and Annotation Clients (HistoryPin.org, Pundit, WITH)]

Page 7: Challenges on modeling annotations in the Europeana Sounds project

Exchanging annotations across platforms


We adopted the W3C Web Annotation Data Model
• Offers a simple model for exchanging annotations across platforms... but flexible enough to support complex scenarios

We are developing a REST API based on the W3C Web Annotation Protocol
• Which developers & Europeana will use for retrieval, creation and search of annotations (see the sketch below)
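As an illustration only (not the documented Europeana API), a client following the W3C Web Annotation Protocol conventions could create an annotation roughly as in the Python sketch below; the container URL and the exact request details are assumptions made for the sketch.

import requests

# Hypothetical annotation container endpoint (an assumption for this sketch)
ANNOTATION_CONTAINER = "https://example.org/annotation/"

annotation = {
    "@context": "http://www.w3.org/ns/anno.jsonld",
    "type": "Annotation",
    "motivation": "tagging",
    "body": "http://dbpedia.org/resource/Brass_instrument",
    "target": "http://data.europeana.eu/item/09102/_UEDIN_214",
}

# The Web Annotation Protocol expects the annotation to be POSTed as JSON-LD
# to an annotation container; the server responds with the stored annotation,
# including its newly minted id.
response = requests.post(
    ANNOTATION_CONTAINER,
    json=annotation,
    headers={"Content-Type": 'application/ld+json; profile="http://www.w3.org/ns/anno.jsonld"'},
)
response.raise_for_status()
print("Created annotation:", response.json().get("id"))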


Page 8: Challenges on modeling annotations in the Europeana Sounds project

Tagging with semantic resources


User scenario: an end-user wishes to tag a Europeana object using a term/resource from a controlled vocabulary


Item as displayed in the Europeana Collections Portal

Antoine Isaac
Actually, re-reading after seeing Maarten's comment on sl 12, I think sl 8, 9 and 10 may be too generic. One doesn't see what kind of objects are being annotated, or what semantic tags are (which vocabulary? Even an example would help, not necessarily all the vocabularies Sounds may be interested in).
Page 9: Challenges on modeling annotations in the Europeana Sounds project

Tagging with semantic resources: The Pundit use case



[Diagram: an oa:Annotation (http://data.europeana.eu/annotation/...) with oa:motivatedBy oa:tagging; its oa:hasBody is the skos:Concept http://dbpedia.org/resource/Brass_instrument, retrieved via the DBpedia API from the available vocabularies/datasets, and its oa:hasTarget is the edm:ProvidedCHO http://data.europeana.eu/item/09102/_UEDIN_214]
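A minimal sketch of how the annotation in this diagram could be serialised in the Web Annotation JSON-LD syntax, written here as a Python dict so it can be reused with the request sketch on slide 7; the typed-body shape is an assumption based on the diagram, not a normative Europeana serialisation.

import json

semantic_tag = {
    "@context": "http://www.w3.org/ns/anno.jsonld",
    "id": "http://data.europeana.eu/annotation/...",  # left elided, as in the diagram
    "type": "Annotation",
    "motivation": "tagging",
    # The body is the semantic resource itself: a skos:Concept from DBpedia
    "body": {
        "id": "http://dbpedia.org/resource/Brass_instrument",
        "type": "skos:Concept",
    },
    # The target is the Cultural Heritage Object being tagged
    "target": "http://data.europeana.eu/item/09102/_UEDIN_214",
}

print(json.dumps(semantic_tag, indent=2))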

Page 10: Challenges on modeling annotations in the Europeana Sounds project

Tagging with semantic resources: Open Questions


… some aspects may significantly influence user experience:

• Should different kinds of semantic resources be displayed in the same way? ...or should they be differentiated by type (i.e. a Place vs. an Agent) or scope (e.g. Rock as a sound genre vs. a physical thing)?

• Which label(s) should be displayed? Should the one that best fits the display settings (i.e. language preferences) be used, and what if no label exists for that language? (see the sketch after this list)

• Should the user annotate with any term from a vocabulary, or only a subset?

• What to do with annotations when a vocabulary is updated by its maintainer?
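One possible answer to the label question, shown purely as an illustration and not as a project decision: pick the label in the user's preferred language and fall back to a configurable default, then to any available language.

def pick_label(pref_labels, preferred_lang, fallback_lang="en"):
    """Pick a display label from a dict mapping language code -> label.

    Falls back to `fallback_lang`, then to any available label,
    and finally to None if the resource has no labels at all.
    """
    for lang in (preferred_lang, fallback_lang):
        if lang in pref_labels:
            return pref_labels[lang]
    return next(iter(pref_labels.values()), None)

# Example with made-up sample data: no Dutch label, so the English one is used
labels = {"en": "Brass instrument", "fr": "Instrument de la famille des cuivres"}
print(pick_label(labels, preferred_lang="nl"))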


Page 11: Challenges on modeling annotations in the Europeana Sounds project

Tagging with semantic resources: Challenges

Client applications must:
• have all the data necessary for feeding the display, and
• have it in a form that they can process uniformly

This means that resources must be:
• Dereferenced
• Translated into a uniform data format

To tackle these challenges we chose the Europeana Data Model:
• Already in use at Europeana
• Reuses existing standards (e.g. SKOS, Dublin Core, WGS84 Geo Positioning)
• Gives support for all contextual resources
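As an illustration of the dereferencing step, the sketch below fetches a semantic resource and reduces it to a small uniform structure (URI plus multilingual prefLabels, roughly in the spirit of SKOS/EDM). The use of the rdflib library and DBpedia content negotiation are assumptions for the example; this is not the project's actual enrichment pipeline.

from rdflib import Graph, URIRef
from rdflib.namespace import RDFS, SKOS

def dereference_concept(uri):
    """Dereference a semantic resource and map it to a uniform dict."""
    g = Graph()
    g.parse(uri)  # relies on content negotiation returning RDF
    subject = URIRef(uri)
    labels = {}
    # Prefer skos:prefLabel, fall back to rdfs:label
    for predicate in (SKOS.prefLabel, RDFS.label):
        for literal in g.objects(subject, predicate):
            lang = getattr(literal, "language", None) or "und"
            labels.setdefault(lang, str(literal))
        if labels:
            break
    return {"id": uri, "prefLabel": labels}

print(dereference_concept("http://dbpedia.org/resource/Brass_instrument"))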


Page 12: Challenges on modeling annotations in the Europeana Sounds project

Tagging with geographical information: The HistoryPin.org use case



The user can specify rough coordinates and a radius

A Europeana Object displayed in HistoryPin.org

Page 13: Challenges on modeling annotations in the Europeana Sounds project

Tagging with geographical information: The HistoryPin.org use case



{ "@context": "http://www.w3.org/ns/anno.jsonld", "id": "http://data.europeana.eu/annotations/historypin/136290", ... [provenance info here], ... "motivation": "tagging", "body": { "@context": <Context for EDM>, "type": "edm:Place", "wgs84_pos:lat": "48.85341", "wgs84_pos:long": "2.3488" }, "target": "http://data.europeana.eu/item/09102/_UEDIN_214"}

Similar to semantic tagging, but using a “virtual” resource

Coordinates expressed using WGS84 Geo Positioning

Page 14: Challenges on modeling annotations in the Europeana Sounds project

Annotating metadata: The Pundit use case


We consider metadata annotations as…
• any annotation that refers to, or asserts a statement about, the information describing an object in order to complete or correct it

Ideally, and like other annotations, they should be
• agnostic to the way they are presented to the user in the interface
• machine readable

So that metadata annotations can
• survive changes to the interface design;
• be easily shared outside the interface in which they were originally created;
• allow other software applications to take further advantage of them


Hugo Manguinhas
I am lost here... I understand that it would be better to have the HP case, but the HP case is not an actual semantic tagging use case... so, what shall I do here?
Maarten Brinkerink
1. I noticed that too, think we can get rid of that right?
2. I think the specific application in the project is a very clear illustration of what the Crowdsourcing Infrastructure/Annotations API allows us to do at the client level.
3. This is not exclusive to HP in my view, it's the general idea of the Crowdsourcing Infrastructure/Annotations API. So IMHO we can still make that point.
And I agree it's a clearer example. So I say go for it ;)
Antoine Isaac
I don't understand. (1) There's still one slide sub-titled 'historypin'. (2) Is there value in doing more about Pundit, while we already have another Pundit presentation in the session? (3) Talking about HP is the opportunity to make the point for a REST API (to which several clients connect), which was one of the ideas that Remy called for... This said, if we're talking about sl 8 and sl 9, it is a clear example, clearer than what the HP example was.
Maarten Brinkerink
Cool!
Hugo Manguinhas
I have replaced the HistoryPin example with the musical instrument crowdsourcing with Pundit
Hugo Manguinhas
My idea was to show the metadata annotation use case... in Pundit/WITH campaign only semantic tagging was used... I could replace the HistoryPin one with the Pundit/WITH use case if you can send me screenshots that I can use.
Antoine Isaac
If Giulio is going to present the use case in his presentation then the slides about the pundit cases in Hugo's presentation can stay quite generic. But I think I see your point. It would be about focusing 'Pundit-related questions' just on the music instrument crowdsourcing? If yes then I'd agree with doing this.
Maarten Brinkerink
Can't we make this more specific (like Historypin)? So focus on the Crowdsourcing Campaign with Pundit/WITH on musical instruments?
Giulio Andreini
Sounds OK for me
Antoine Isaac
sl 13-14 probably can be shortened if the Pundit presentation is going to be given in more detail the same day (perhaps even the same session?)
Giulio Andreini
"Pundit.it" is not correct:please use simply "The Pundit use case".
Antoine Isaac
Actually there could be a way to simplify the points below 'Ideally, and like other annotations, they should be' and 'So that metadata annotations can'. These are generic requirements that should have been stated already somewhere, shouldn't they?
Page 15: Challenges on modeling annotations in the Europeana Sounds project

Annotating metadata: A Proposal



[Diagram of the proposal: an oa:Annotation (http://data.europeana.eu/annotation/...) with oa:motivatedBy oa:describing? (a specific motivation may be needed). Its oa:hasTarget is an oa:SpecificResource (#metadata1) with oa:hasSource the edm:ProvidedCHO http://data.europeana.eu/item/09102/_UEDIN_214 and oa:hasSelector a pundit:MetadataSelector (#statement1) carrying rdf:predicate dcterms:isPartOf; the oa:hasBody carries the correct URI as rdf:value in a Graph. Similar to an rdf:Statement, but following WA guidelines.]
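A rough sketch of how this proposal might look when serialised as JSON-LD (written as a Python dict); how the corrected value, the selector and the motivation are expressed is exactly what the slide marks as open, so every property below is illustrative only, and the pundit:MetadataSelector type is taken from the proposal, not from the Web Annotation vocabulary.

import json

metadata_correction = {
    "@context": "http://www.w3.org/ns/anno.jsonld",
    "id": "http://data.europeana.eu/annotation/...",  # elided, as in the diagram
    "type": "Annotation",
    "motivation": "describing",  # "a specific motivation may be needed"
    "target": {
        "type": "SpecificResource",
        "source": "http://data.europeana.eu/item/09102/_UEDIN_214",
        # Hypothetical selector pointing at one metadata statement of the object
        "selector": {
            "type": "pundit:MetadataSelector",
            "rdf:predicate": "dcterms:isPartOf",
        },
    },
    # The body carries the proposed correct value for that statement
    "body": {
        "rdf:value": "http://example.org/the-correct-uri",  # placeholder value
    },
}

print(json.dumps(metadata_correction, indent=2))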

Page 16: Challenges on modeling annotations in the Europeana Sounds project

Favouring and moderating annotations


As manual per-annotation moderation does not scale well, we wish to encourage a crowd-moderation policy among the end-users:
• Three-strikes-out: if three users report an annotation as being in violation of the terms of use, it will be hidden.

How to differentiate moderation (violations of terms of use) from up- and down-voting ('this is a very good annotation, +1')?
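A minimal sketch of the three-strikes-out rule as a client-side visibility check; the threshold and the stricter display policy mentioned in the comments below are illustrative assumptions, not a specified API behaviour.

def is_visible(report_count, hide_threshold=3):
    """Hide an annotation once it has been reported `hide_threshold` times."""
    return report_count < hide_threshold

# A client could also choose a stricter display policy and grey out
# annotations that already have one or two reports.
for reports in range(5):
    print(reports, "reports ->", "visible" if is_visible(reports) else "hidden")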


Hugo Manguinhas
Since the presentation has grown a lot, I will vote to keep it simple and leave it for a future presentation.
Maarten Brinkerink
I feel points 2 and 3 at least need to be included/in line with this slide.
Maarten Brinkerink
This is what we wrote down:
1. User Generated Annotations are always stored separately from the original metadata records in the Annotations API. There they are related to the metadata object, as harvested by Europeana. This means enrichments resulting from crowdsourcing will never automatically alter the original metadata record as provided by the Data Provider.
2. End-users of the various crowdsourcing platforms connected to the crowdsourcing infrastructure can only create new annotations, or comment on annotations from other end-users. These new annotations are then stored in the Annotations API, in accordance with the first principle. Comments on annotations by other end-users are related to the original annotations in the Annotations API. End-users can select to make their annotations public, or keep them private. Editing or deleting annotations by other end-users is not possible as an end-user.
3. Instead of supporting end-users with the editing or deleting of annotations by other end-users, we support the evaluation (e.g. 'flagging' and 'liking') of annotations from other end-users. This aims to encourage crowd-moderation among the end-users. This will support users with the possibility to flag potentially offensive, libellous or spam enrichments. On the other hand it also aims at allowing users to express their explicit support for an already existing enrichment. The exact labels for and types of evaluation will result from user testing. Part of this user research will also be the question whether end-users want to add an evaluation to their own tags, for instance to express a level of certainty about their own annotation. Similarly to the comments mentioned above, these evaluations will be related to the original annotations in the Annotations API.
4. The target of semantic enrichments is restricted to resources from trusted repositories, in order to counter the spamming of links.
5. Utilization of the annotations that can be retrieved from the Annotations API is left up to the policy of the respe
Hugo Manguinhas
@Maarten, could you check this and add what you feel is missing and makes sense for this slide.
Maarten Brinkerink
Is this in line with the policy we put in the deliverable? I feel that was a bit more elaborate...
Remy Gardien
To explain our view of moderation (focusing on scalability) and to pose the question we are and were also struggling with: how to differentiate moderation from up- and down-voting (liking etc.).
Sergiu Gordea
Because of the three-strikes-out rule, the client could choose not to display annotations with 1-2 reports; the 3+ ones are hidden anyway.
Page 17: Challenges on modeling annotations in the Europeana Sounds project

Conclusion

• Requirements are becoming clearer as we work on more concrete use-cases and validate them with real users

• Expressing cross-platform annotations in a uniform way is a big challenge:
  • The W3C Web Annotation Data Model gives a good interoperable base
  • But not all scenarios are yet covered
  • Need for best practices for specific applications / domains

• Still a lot of work ahead... but we are making progress



Hugo Manguinhas
Is it good like this?
Remy Gardien
Maybe add here that real use-cases are what move us forward the most, as they dictate our requirements.
Page 18: Challenges on modeling annotations in the Europeana Sounds project

http://www.europeanasounds.eu/