Social Event Detection at MediaEval 2012: Challenges, Dataset and Evaluation
Social Event Detection (SED): Challenges, Dataset and Evaluation

Raphaël Troncy <[email protected]>
Vasileios Mezaris <[email protected]>
Symeon Papadopoulos <[email protected]>
Emmanouil Schinas <[email protected]>
Ioannis Kompatsiaris <[email protected]>
What are Events?
Events are observable occurrences grouping people, places and time: experiences documented by media.

04/10/2012 - Social Event Detection (SED) Task - MediaEval 2012, Pisa, Italy
SED: bigger, longer, harder

In 2011:
- 2 challenges
- 73k photos (2.43 GB)
- No training dataset
- 18 teams interested, 7 teams submitted runs
- Considered easy: F-measure = 85% (challenge 1), 69% (challenge 2)

In 2012:
- 3 challenges (1 carried over from SED 2011)
- 167k photos (5.5 GB), CC licence checked
- Training dataset = SED 2011
- 21 teams interested, from 15 countries; 5 teams submitted runs
- Much harder!
Three challenges (type and venue):
1. Find all technical events that took place in Germany in the test collection.
2. Find all soccer events taking place in Hamburg (Germany) and Madrid (Spain) in the collection.
3. Find all demonstration and protest events of the Indignados movement occurring in public places in Madrid in the collection.

For each event, we provided relevant and non-relevant example photos.
Task = detect events and provide all illustrating photos.
Dataset Construction
- Collected 167,332 Flickr photos (Jan 2009 - Dec 2011) from 4,422 unique Flickr users, all under CC licence
- All geo-tagged in 5 cities: Barcelona (72,255), Cologne (15,850), Hannover (2,823), Hamburg (16,958), Madrid (59,043), plus 0.22% (403) from EventMedia
- Altered metadata: geo-tags removed for a random 80% of the photos; 33,466 photos still geo-tagged
- Only metadata was distributed, but the actual media (5.5 GB) were available to participants on request
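The geo-tag removal step described above could be sketched as follows; the field names ("latitude", "longitude") and the fixed seed are illustrative assumptions, not the actual dataset schema or procedure:

```python
import random

def strip_geotags(photos, keep_fraction=0.2, seed=42):
    """Remove lat/lon metadata from a random 80% of photos,
    keeping geo-tags on the remaining 20%.

    Field names are hypothetical; the real Flickr metadata
    schema may differ. Returns new dicts, leaving the input
    untouched."""
    rng = random.Random(seed)
    kept = set(rng.sample(range(len(photos)),
                          int(len(photos) * keep_fraction)))
    out = []
    for i, photo in enumerate(photos):
        photo = dict(photo)  # shallow copy so the original survives
        if i not in kept:
            photo.pop("latitude", None)
            photo.pop("longitude", None)
        out.append(photo)
    return out
```

Fixing the random seed would let organizers reproduce exactly which 20% of the collection stays geo-tagged.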
Ground Truth and Evaluation Measures

CrEve annotation tool: http://www.clusttour.gr/creve/
- For each of the 6 collections, review all photos and associate them to events (which have to be created)
- Search by text, geo-coordinates, date and user
- Review annotations made by others
- Use EventMedia and machine tags (upcoming:event=xxx)

Evaluation measures:
- F-score: harmonic mean of Precision and Recall
- Normalized Mutual Information (NMI): jointly considers the goodness of the photos retrieved and their correct assignment to different events
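The two measures can be sketched as below. Note an assumption: NMI has several normalization variants, and this sketch uses the geometric mean of the cluster entropies, which may differ from the exact variant used in the official scoring:

```python
import math
from collections import Counter

def f_score(tp, fp, fn):
    """Harmonic mean of precision and recall."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def nmi(labels_true, labels_pred):
    """Normalized Mutual Information between two clusterings,
    here normalized by the geometric mean of the two entropies
    (the task's exact normalization may differ)."""
    n = len(labels_true)
    counts_true = Counter(labels_true)
    counts_pred = Counter(labels_pred)
    joint = Counter(zip(labels_true, labels_pred))
    # Mutual information from the joint and marginal counts
    mi = sum((nij / n) * math.log((n * nij) / (counts_true[t] * counts_pred[p]))
             for (t, p), nij in joint.items())
    h_true = -sum((c / n) * math.log(c / n) for c in counts_true.values())
    h_pred = -sum((c / n) * math.log(c / n) for c in counts_pred.values())
    if h_true == 0 or h_pred == 0:
        return 0.0
    return mi / math.sqrt(h_true * h_pred)
```

Two identical event assignments (up to relabeling) score NMI = 1; statistically independent assignments score 0.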
What ideally should be found:
- Challenge 1: 19 events, 2,234 photos (avg = 117). Baseline precision (random): 0.01%
- Challenge 2: 79 events, 1,684 photos (avg = 21). Baseline precision (random): 0.01%
- Challenge 3: 52 events, 3,992 photos (avg = 77). Baseline precision (random): 0.02%
Who Has Participated?
- 21 teams registered (18 in 2011)
- 5 teams crossed the finish line (7 in 2011, with 2 overlaps)
- One participant missing at the workshop!
Quick Summary of Approaches

2011: all but 1 participant used background knowledge
- Last.fm (all), Fbleague (EURECOM), PlayerHistory (QMUL), DBpedia, Freebase, Geonames, WordNet

2012: all but 2 participants used a generic approach
- IR approach: queries matched against clusters (metadata, temporal, spatial): MISIMIS
- Classification approach: topic detection with LDA, city classification with TF-IDF, event detection using peaks in the timeline for the query topics: AUTH-ISSEL
- Learning model using the training data and SVM: CERTH-ITI
- Background knowledge: QMUL, DISI

2012: NO approach is fully automatic
- Manual selection of some parameters (e.g. topics)
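To make the TF-IDF city-classification idea concrete, here is a minimal sketch; it is not the participants' actual code, and the tokenized tag lists and cosine-similarity matching are assumptions about how photo metadata could be compared against per-city term profiles:

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Build TF-IDF vectors (as sparse dicts) for a list of
    token lists, e.g. the tags/titles of photos per city."""
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(doc))  # document frequency of each term
    idf = {t: math.log(n / df[t]) for t in df}
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: (c / len(doc)) * idf[t] for t, c in tf.items()})
    return vectors, idf

def cosine(a, b):
    """Cosine similarity between two sparse TF-IDF dicts."""
    dot = sum(v * b[t] for t, v in a.items() if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0
```

A photo would then be assigned to the city whose term profile gives the highest cosine similarity.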
Results - Challenge 1 (Technical Events)

Run          Precision   Recall   F-score   NMI
AUTHISSEL_4      76.29    94.90     84.58   0.7238
CERTH_1          43.11    11.91     18.66   0.1877
DISI_1           86.23    59.13     70.15   0.6011
MISIMS_2          2.52     1.88      2.15   0.0236
QMUL_4            3.86    12.85      5.93   0.0475

(Bar chart of F-score per run omitted.)
Results - Challenge 2 (Soccer Events)

Run          Precision   Recall   F-score   NMI
AUTHISSEL_4      88.18    93.49     90.76   0.8499
CERTH_1          85.57    66.19     74.64   0.6745
DISI_1
MISIMS_2         34.49    17.25     22.99   0.1993
QMUL_4           79.04    67.12     72.59   0.6493

(Bar chart of F-score per run omitted.)
Results - Challenge 3 (Indignados Events)

Run          Precision   Recall   F-score   NMI
AUTHISSEL_4      88.91    90.78     89.83   0.7380
CERTH_1          86.24    54.61     66.87   0.4654
DISI_1           86.15    47.17     60.96   0.4465
MISIMS_2         48.30    46.87     47.58   0.3088
QMUL_4           22.88    33.48     27.19   0.1988

(Bar chart of F-score per run omitted.)
Conclusion

Lessons learned:
- Clear winner for all tasks: a generic approach, but with manual selection of the topics
- Background knowledge is still useful if used well

Looking at next year's SED:
- Shlomo Geva (Queensland University of Technology) + Philipp Cimiano (University of Bielefeld)
- Dataset: bigger, more diverse
- Media: photos and videos? (at least 10% videos?)
- Metadata: include some social network relationships, participation at events
- Evaluation measures: event granularity? Time/CPU?