Copyright 2009 Digital Enterprise Research Institute. All rights reserved.
Digital Enterprise Research Institute www.deri.ie
Social People-Tagging vs. Social Bookmark-Tagging
Peyman Nasirifard, Sheila Kinsella, Krystian Samp, Stefan Decker
Bookmark-tagging and People-tagging
Example tags: todo, research, nlp, technician, friendly, music
Motivation
Understand better how people tag each other
A starting point for tag recommendation in frameworks based on people-tagging:
  Access control mechanisms
  Information filtering mechanisms
We are especially interested in subjectivity of tags
Main questions
How do tags differ for resources of different categories? (person, event, country and city)
How do tags for Wikipedia pages about persons differ from tags for friends?
How do tags differ with the age and gender of the taggee?
Data collection
1. Bookmark tags
  Wikipedia articles: Person, Event, Country, City
Data collection
2. People tags
  http://blog.* network of blog sites (.ca, .co.uk, .de, .fr)
  Google Translate used to convert non-English tags to English
Dataset
Source      Category  # Items  # Tags  # Unique tags
Wikipedia   Person      4,031  75,548         14,346
Wikipedia   Event       1,427   8,924          2,582
Wikipedia   Country       638  13,002          3,200
Wikipedia   City        1,137   4,703          1,907
Blog sites  Friend      2,927  17,126         10,913
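The counts in the dataset table can be turned into two simple ratios per category: tags per item, and the share of tags that are unique. The sketch below is only an illustration; it assumes the "# Unique" column counts distinct tags, and the function name `summarise` is made up for this example.

```python
# Counts copied from the dataset table:
# (source, category, items, tags, unique tags).
# Assumption: the "# Unique" column is the number of distinct tags.
DATASET = [
    ("Wikipedia", "Person", 4031, 75548, 14346),
    ("Wikipedia", "Event", 1427, 8924, 2582),
    ("Wikipedia", "Country", 638, 13002, 3200),
    ("Wikipedia", "City", 1137, 4703, 1907),
    ("Blog sites", "Friend", 2927, 17126, 10913),
]


def summarise(rows):
    """Return {category: (tags per item, unique tags / all tags)}."""
    return {
        category: (tags / items, unique / tags)
        for _source, category, items, tags, unique in rows
    }


if __name__ == "__main__":
    for category, (per_item, unique_share) in summarise(DATASET).items():
        print(f"{category:8s} {per_item:6.2f} tags/item  {unique_share:6.1%} unique")
```

Under that reading of the table, roughly 64% of the Friend tags are unique, against roughly 19% of the Person tags.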
Top tags – Wikipedia articles
Person      Event      Country    City
wikipedia   history    wikipedia  travel
people      war        history    wikipedia
philosophy  wikipedia  travel     italy
history     ww2        geography  germany
wiki        politics   africa     history
music       wiki       culture    london
politics    military   wiki       uk
art         battle     reference  wiki
books       wwii       europe     places
literature  iraq       country    england
Top tags – blog sites
.de .fr .ca & .co.uk
music junkie art funny
nice politics music
live music life
funny kind kk friend
dear adorable funky
intelligent love friendly
pretty nice lovely
sexy drawing cool
love friendship sexy
honest trustworthy love
Distribution of tags
Subjectivity of tags
Top 100 tags for each category
25 annotators each categorised 100 tags:
  Objective, e.g. “london”
  Subjective, e.g. “jealous”
  Uncategorised, e.g. “abcxyz”
Average inter-annotator agreement: 86%
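The slides do not say how the inter-annotator agreement figure was computed. One common reading is average pairwise percent agreement, sketched below under that assumption. The function name and the toy labels are invented for illustration and are not the study's data.

```python
from itertools import combinations


def average_pairwise_agreement(labels_by_annotator):
    """Mean, over all annotator pairs, of the share of items on which
    the two annotators chose the same label."""
    pairs = list(combinations(labels_by_annotator, 2))
    if not pairs:
        raise ValueError("need at least two annotators")
    total = 0.0
    for a, b in pairs:
        if not a or len(a) != len(b):
            raise ValueError("annotators must label the same non-empty item list")
        total += sum(x == y for x, y in zip(a, b)) / len(a)
    return total / len(pairs)


# Invented toy data: 3 annotators, 4 tags each.
# O = objective, S = subjective, U = uncategorised.
toy_labels = [
    ["O", "S", "U", "S"],
    ["O", "S", "O", "S"],
    ["O", "S", "U", "O"],
]
print(round(average_pairwise_agreement(toy_labels), 3))
```

Chance-corrected measures such as Fleiss' kappa would be an alternative, but nothing in the slides indicates which measure was used.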
[Figure: share of subjective, objective and uncategorised tags for Friend, Person, Country, City and Event]
Randomly selected tags
So far we looked at the top tags, but what about long-tail tags?
We also asked annotators to categorise 100 randomly chosen tags from each group:
  Much higher rate of uncategorised tags (~3x)
  Lower inter-annotator agreement (76%)
  Less clear in meaning than the top tags, so probably less useful for applications like information filtering
Linguistic categories
Automatic classification (WordNet)
  Noun/verb/adjective/adverb/uncategorised
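The classification step can be pictured as a lexicon lookup. The sketch below does not call WordNet: a tiny hand-made dictionary with invented sense counts stands in for it, so that the example runs without extra dependencies. The rule for words with several parts of speech (pick the one with the most senses) is also an assumption, since the slides do not describe how such cases were resolved.

```python
from collections import Counter

# Hypothetical stand-in for WordNet: word -> {part of speech: sense count}.
# The words and counts are invented for illustration only.
TOY_LEXICON = {
    "music": {"noun": 5},
    "friendly": {"adjective": 4, "noun": 1},
    "love": {"noun": 6, "verb": 4},
    "honestly": {"adverb": 2},
}


def classify(tag, lexicon=TOY_LEXICON):
    """Return the part of speech with the most senses, or 'uncategorised'
    when the tag is not in the lexicon."""
    senses = lexicon.get(tag.strip().lower())
    if not senses:
        return "uncategorised"
    return max(senses, key=senses.get)


def category_counts(tags, lexicon=TOY_LEXICON):
    """Count how many tags fall into each linguistic category."""
    return Counter(classify(tag, lexicon) for tag in tags)


print(category_counts(["music", "Love", "friendly", "abcxyz"]))
```

With a real lexicon, the lookup in `classify` would be replaced by a query against that resource; the rest of the counting logic stays the same.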
[Figure: share of tags classified as adjective, adverb, verb, noun or uncategorised]
Age and gender of taggees
Generated sets of tags corresponding to age brackets and genders
  Removed tags that refer to a specific gender
Asked 10 participants if they could predict age and gender
Results:
  Differences between genders were not perceptible
  Differences between younger and older taggees were perceptible (and tags for younger taggees were more subjective)
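The set-generation step described above could look roughly like the following. Everything specific here is hypothetical: the age cut-off, the list of gender-specific tags, the record layout and the sample records are placeholders, because the slides do not give the actual brackets or filter list.

```python
from collections import defaultdict

# Hypothetical filter list and age cut-off; the slides do not specify them.
GENDERED_TAGS = {"girl", "boy", "mother", "father", "girlfriend", "boyfriend"}
YOUNGER_BELOW = 25


def build_tag_sets(records):
    """Group tags by age bracket and by gender of the taggee,
    dropping tags that name a specific gender.

    `records` is an iterable of (tag, taggee_age, taggee_gender) tuples.
    """
    by_age = defaultdict(set)
    by_gender = defaultdict(set)
    for tag, age, gender in records:
        tag = tag.strip().lower()
        if tag in GENDERED_TAGS:
            continue  # would give the gender away to participants
        bracket = "younger" if age < YOUNGER_BELOW else "older"
        by_age[bracket].add(tag)
        by_gender[gender].add(tag)
    return dict(by_age), dict(by_gender)


# Invented sample records, not taken from the study.
sample = [
    ("funny", 19, "f"),
    ("Girlfriend", 19, "f"),
    ("honest", 40, "m"),
    ("music", 22, "m"),
]
age_sets, gender_sets = build_tag_sets(sample)
print(age_sets)
print(gender_sets)
```

The resulting sets are what participants would be shown when asked to guess the age bracket or gender of the taggees.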
Conclusions
Subjectivity: Articles of different categories are tagged similarly, but friends are assigned subjective tags more frequently
Consequence: frameworks built on person-tags will need to handle more potentially unreliable tags
  Controlled vocabularies?
Future work: Twitter Lists as person annotations for information filtering