Event Clusters Detection on Flickr Images using a Suffix-Tree Structure
1
Event Cluster Detection on Flickr Images using a Suffix-Tree Structure
Massimiliano Ruocco and Heri Ramampiaro
Dept. of Computer and Information Science, Norwegian University of Science and Technology
Massimiliano Ruocco – Event Cluster Detection on Flickr Images using a Suffix-Tree Structure – IEEE ISM2010
3
Outline
1. Introduction
   1. Problem Statement
   2. Related Works
   3. Contributions
2. Proposed approach
   1. Problem definition
   2. Preliminary
   3. Algorithm Overview
3. Evaluation
4. Conclusions
7
Problem Statement
Event Detection
- Event detection originates from the TDT (Topic Detection and Tracking) project (1)
- Objective: aggregate stories over time into single event topics
- Event: "something happening in a certain place at a certain time" [Yang, Pierce, Carbonell 1999]
(1) http://projects.ldc.upenn.edu/TDT/
11
Problem Statement
Event Detection
- Most previous work focuses on time-tagged document streams and can be classified as:
  - Retrospective detection: discover unidentified events in a collection of news [Yang et al. 1998]
  - Online detection: detect events in real time from a stream of news [Brants et al. 2003]
15
Problem Statement
Web Photo-Sharing Apps – New Needs
Huge Amount of Pictures
  Time          User   Location       Tags
  26 Oct 2010   RMax   26:12, 23:14   Roma, Sky, Bridge
  …
New Needs
- Knowledge Extraction
- Browse
- Retrieve
18
Problem Statement
Challenges
- Event detection on tagged pictures from photo-sharing apps
  - Web-scale environment
  - Use of contextual information
  - Noisy annotations
22
Related Works
- Event Clustering (visual/temporal information) [Loui, Savakis 2002]
  - Albuming of user photo collections
  - Not scalable to large datasets
  - Limited to user photo collections
  - No location information
- Event/Place Semantic Identification (temporal information) [Rattenbury et al. 2007]
  - Extraction of event and place semantics for tags assigned to Flickr photos
  - Scale-Structure Identification (SSI) method to analyze the tag usage distribution
  - SSI is limited for large datasets
  - Location information is not considered
- Event Tag Detection (spatial/temporal information) [Chen, Roy 2009]
  - Detects event tags from Flickr photos
  - Like [Rattenbury et al. 2007], uses the SSI method to analyze the tag usage distribution
  - SSI is applied over the temporal and spatial distributions simultaneously
27
Problem Definition
Hypothesis
- Something happening in a certain place at a certain time [Yang, Pierce, Carbonell 1999]
- Something happening in a certain place at a certain time with a certain tag
Event cluster $e_k \Rightarrow \{\, t_i = t_j,\ dt_i = dt_j,\ g_i = g_j \ \ \forall I_i, I_j \in e_k \,\}$ (not the opposite!)
30
Problem Definition
Hypothesis – Landmark clusters
[Figure: Location vs. time plot; images tagged "colosseo" at location g, with a time interval dt marked; region labelled "Landmark Clusters"]
Event cluster $e_k \Rightarrow \{\, t_i = t_j,\ dt_i = dt_j,\ g_i = g_j \ \ \forall I_i, I_j \in e_k \,\}$ (not the opposite!)
36
Problem Definition
Hypothesis – Event clusters
[Figure: Location vs. time plot; images tagged "applepies" at location g; regions labelled "Landmark Clusters" and "Event Clusters"]
Event cluster $e_k \Rightarrow \{\, t_i = t_j,\ dt_i = dt_j,\ g_i = g_j \ \ \forall I_i, I_j \in e_k \,\}$ (here the opposite is true!)
38
Problem Definition
New Formulation
[Figure: two Location vs. time plots for the tag "applepies" at location g, one restricted to the time interval dt; the two image sets coincide: $S_{dgt} = S_{gt}$]
Event cluster $e_k$: $\exists\,(dt, g, t)$ such that $S_{dgt} = S_{gt}$, with $S_{dgt} = e_k$
47
Preliminary
Suffix-Tree Clustering [Zamir 1998]
- Suffix-tree based
- Mainly used in text (web) document clustering
- Three-step process:
  1. Document cleaning
  2. Base cluster identification
  3. Base cluster merging
- Incremental clustering
- Cluster label inferred from the tree structure
- Phrase-based model
- Snippet-tolerant
- Overlapping clusters
51
Preliminary
Suffix-Tree
- Given a string S, its suffix tree is a compact trie containing all the suffixes of S
  - Rooted directed tree
  - Each internal node other than the root has at least two children
  - Each edge leaving a particular node is labelled with a non-empty substring of S
- Suffix-tree construction runs in linear time, O(n) [Ukkonen 1995]
Example: "papua" has the suffixes 'papua', 'apua', 'pua', 'ua', 'a'
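To make the suffix definition above concrete, here is a minimal Python sketch. It builds an uncompressed suffix trie with nested dictionaries, which takes quadratic time and space; it is not the compact suffix tree or the linear-time construction of [Ukkonen 1995]. The function names and the "$" end marker are illustrative assumptions, not part of the slides.

```python
def suffixes(s: str) -> list[str]:
    """Return every suffix of s, longest first."""
    return [s[i:] for i in range(len(s))]


def build_suffix_trie(s: str) -> dict:
    """Insert each suffix of s into a nested-dict trie (uncompressed, O(n^2))."""
    root: dict = {}
    for suffix in suffixes(s):
        node = root
        for ch in suffix:
            node = node.setdefault(ch, {})
        node["$"] = {}  # hypothetical end-of-suffix marker
    return root


def contains(trie: dict, pattern: str) -> bool:
    """A pattern occurs in s iff it is a prefix of some suffix of s."""
    node = trie
    for ch in pattern:
        if ch not in node:
            return False
        node = node[ch]
    return True


trie = build_suffix_trie("papua")
print(suffixes("papua"))
print(contains(trie, "pu"), contains(trie, "up"))
```

A compact suffix tree would additionally collapse every chain of single-child nodes into one edge labelled with a substring, which is what gives the "at least two children" property listed above.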
53
Algorithm Overview
[Figure: processing pipeline]
Input: images Ii = (T, g, dt)
1. Data cleaning
2. Data extension
3. Suffix-tree construction
4. Event cluster extraction
5. Event cluster merge
Output: labelled event clusters, e.g. "… Primary, Party, Election, Campaign" and "… Concert, Music, John …"
60
Algorithm Overview
Data Cleaning and Extension
- Cleaning: Ii = (T, g, dt) → Ii’ = (T’, g, dt)
  - Stopword removal (with extended vocabulary) + stemming
- Extension: Ii’ = (T’, g, dt) → Ii’’ = (T’’, g, dt)
  - Spatial and temporal information are encoded in the annotation set T
  - T’’ = {t’’1, …, t’’l}, t’’i = [s1(dt) + s2(g) + ti]
  - s1 and s2 are encoding functions from date and location to string
  - s1 and s2 define the granularity in space (geographical grid) and time
Example, with s1(26/10/2010) = 26Oct2010 and s2(43.777864, 11.249029) = 43.77:11.24:
  T’  = {acmm2010, florence, multimedia}
  T’’ = {26Oct2010 43.77:11.24 acmm2010, 26Oct2010 43.77:11.24 florence, 26Oct2010 43.77:11.24 multimedia}
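The extension step can be sketched in Python as follows. This is only an illustration under stated assumptions: the slides do not say how s1 and s2 are implemented, so the day format, the flooring of coordinates to the grid cell, the helper `snap`, and the `precision` parameter are guesses chosen to reproduce the example above.

```python
from datetime import date
from decimal import Decimal, ROUND_FLOOR

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def s1(d: date) -> str:
    """Encode a date at day granularity, e.g. 26Oct2010 (assumed format)."""
    return f"{d.day:02d}{MONTHS[d.month - 1]}{d.year}"


def snap(value: float, precision: str) -> str:
    """Floor a coordinate to its grid cell and print it at the grid's precision."""
    step = Decimal(precision)
    cell = (Decimal(str(value)) / step).to_integral_value(rounding=ROUND_FLOOR)
    decimals = -step.as_tuple().exponent
    return format(cell * step, f".{decimals}f")


def s2(lat: float, lon: float, precision: str = "0.01") -> str:
    """Encode a location as 'lat:lon' on a grid of the given precision."""
    return f"{snap(lat, precision)}:{snap(lon, precision)}"


def extend_tags(tags, d, lat, lon, precision="0.01"):
    """Build T'' by prefixing every cleaned tag with the encoded date and location."""
    prefix = f"{s1(d)} {s2(lat, lon, precision)}"
    return [f"{prefix} {t}" for t in tags]


for extended in extend_tags(["acmm2010", "florence", "multimedia"],
                            date(2010, 10, 26), 43.777864, 11.249029):
    print(extended)
```

Changing `precision` (for example to "0.005") or swapping s1 for a week-level encoding changes the spatial and temporal granularity without touching the rest of the pipeline.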
68
Algorithm Overview
ST Construction and Event Extraction
- Image Ii’’: document snippet, with Ii’’ = (T’’, g, dt), T’’ = {t’’1, …, t’’l}, t’’i = [s1(dt) + s2(g) + ti]
- Extract candidate event clusters Ψl: Ψl([s1(dt) + s2(g) + ti])
- Extract Ψ’l([s2(g) + ti])
- Compare Ψl and Ψ’l
- IF (Ψl = Ψ’l): Ψl([s1(dt) + s2(g) + ti]) is an event cluster
- Label inferred from the structure
[Figure: suffix tree with the nodes Ψl and Ψ’l marked]
Event cluster $e_k$: $\exists\,(dt, g, t)$ such that $S_{dgt} = S_{gt}$, with $S_{dgt} = e_k$
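The comparison of Ψl and Ψ’l can be illustrated with a short Python sketch. To keep it small, it replaces the suffix tree with two plain dictionaries, one keyed by (dt, g, t) and one by (g, t); this is a swapped-in simplification, not the structure used in the talk. The input tuple layout and the function name are assumptions.

```python
from collections import defaultdict


def extract_event_clusters(images):
    """images: iterable of (image_id, tags, g, dt), with g and dt already encoded as strings.

    Returns {(dt, g, t): image ids} for every triple whose image set equals the
    set of all images carrying tag t at location g, regardless of date.
    """
    s_dgt = defaultdict(set)  # images per (date, location, tag)
    s_gt = defaultdict(set)   # images per (location, tag), over all dates
    for image_id, tags, g, dt in images:
        for t in tags:
            s_dgt[(dt, g, t)].add(image_id)
            s_gt[(g, t)].add(image_id)
    return {key: ids for key, ids in s_dgt.items()
            if ids == s_gt[(key[1], key[2])]}


images = [
    (1, ["colosseo"], "41.89:12.49", "26Oct2010"),
    (2, ["colosseo"], "41.89:12.49", "27Oct2010"),
    (3, ["applepies"], "43.77:11.24", "26Oct2010"),
    (4, ["applepies"], "43.77:11.24", "26Oct2010"),
]
print(extract_event_clusters(images))
```

In this toy input the "colosseo" tag appears at the same location on two different days, so neither day's set equals the all-time set and it is rejected, while "applepies" appears only on one day and is kept.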
72
Algorithm Overview
Extraction and Merge
- Extracted event clusters: {e1, …, en}
- Merge semantically similar clusters using the overlap measure:
  $\theta(e_i, e_j) = \dfrac{|e_i \cap e_j|}{\min(|e_i|, |e_j|)}$
[Figure: suffix tree with the nodes Ψl and Ψ’l marked]
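A minimal Python sketch of the overlap measure and one possible merge loop follows. The measure is read here as intersection size over the size of the smaller cluster; the greedy single-pass merge and the `threshold` value of 0.5 are assumptions for illustration, since the slides give neither the merge procedure nor a threshold.

```python
def theta(a: set, b: set) -> float:
    """Overlap of two clusters: |a & b| / min(|a|, |b|), or 0.0 if either is empty."""
    smaller = min(len(a), len(b))
    return len(a & b) / smaller if smaller else 0.0


def merge_clusters(clusters, threshold=0.5):
    """Greedily fold each cluster into the first merged cluster it overlaps enough with."""
    merged = []
    for cluster in clusters:
        for existing in merged:
            if theta(existing, cluster) >= threshold:
                existing |= cluster  # union in place
                break
        else:
            merged.append(set(cluster))  # no sufficiently similar cluster found
    return merged


print(theta({1, 2, 3}, {2, 3, 4, 5}))
print(merge_clusters([{1, 2, 3}, {2, 3}, {7, 8}]))
```

Dividing by the smaller cluster means a small cluster fully contained in a large one scores 1.0, so nested clusters are always merged.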
74
Evaluation - Dataset
- Dataset collected from Flickr
  - Only geo-tagged pictures
  - 12 June 2008 – 11 June 2010 (729 days)
  - San Francisco area
- #Images ~ 350K, #Tags ~ 3M
80
Evaluation - Measure
- List of ranked clusters: {e1, e2, …}
- Ranking according to cluster size: |ei|
- Drawback: no ground truth is available, so recall cannot be measured
- Top-K Precision: $\dfrac{|R_K|}{K}$, where $R_K$ is the set of relevant clusters among the first K returned
- Top-20 (K = 20) is used
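As a small worked example, Top-K precision can be computed as below. The input representation (a list of booleans in rank order, True when the cluster at that rank was judged to be a real event) is an assumption made for this sketch.

```python
def top_k_precision(ranked_relevance, k):
    """Fraction of the first k ranked clusters that were judged relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    return sum(ranked_relevance[:k]) / k


# 15 relevant clusters among the first 20 returned
judgements = [True] * 15 + [False] * 5
print(top_k_precision(judgements, 20))
```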
81
Evaluation
- Experiments with different granularities in time and space
- Time:
  Granularity   1 day       1 week
  Example       2008Oct12   2008:43
- Space:
  Latitude precision   Longitude precision   Square size (meters)
  0.01                 0.01                  1000 m × 1000 m
  0.005                0.005                 500 m × 500 m
  0.002                0.002                 200 m × 200 m
  0.001                0.001                 100 m × 100 m
82
Evaluation - Results
Each cell gives #Ev. (Prec.) for the first #Clusters ranked clusters.

            100 m                 200 m                 500 m                 1000 m
#Clusters   1 Day      1 Week     1 Day      1 Week     1 Day      1 Week     1 Day      1 Week
1           1 (100%)   1 (100%)   1 (100%)   1 (100%)   1 (100%)   1 (100%)   1 (100%)   1 (100%)
2           2 (100%)   2 (100%)   2 (100%)   2 (100%)   2 (100%)   2 (100%)   2 (100%)   1 (50%)
3           3 (100%)   3 (100%)   3 (100%)   3 (100%)   3 (100%)   3 (100%)   3 (100%)   2 (67%)
…
20          15 (75%)   14 (70%)   15 (75%)   14 (70%)   14 (70%)   13 (65%)   13 (65%)   14 (70%)
83
Evaluation - Results
[Figure: Top-20 precision]
Conclusion
Massimiliano Ruocco – Event Cluster Detection on Flickr Images using a Suffix-Tree Structure – IEEE ISM2010
85
Conclusion
Massimiliano Ruocco – Event Cluster Detection on Flickr Images using a Suffix-Tree Structure – IEEE ISM2010
- Novel algorithm for event cluster extraction: - from large amount of Flickr images - Multi-user photo collection - Incremental clustering algorithm
86
Conclusion
Massimiliano Ruocco – Event Cluster Detection on Flickr Images using a Suffix-Tree Structure – IEEE ISM2010
- Novel algorithm for event cluster extraction: - from large amount of Flickr images - Multi-user photo collection - Incremental clustering algorithm
- Extension of STC previously used only to cluster text documents
87
Conclusion
Massimiliano Ruocco – Event Cluster Detection on Flickr Images using a Suffix-Tree Structure – IEEE ISM2010
- Novel algorithm for event cluster extraction: - from large amount of Flickr images - Multi-user photo collection - Incremental clustering algorithm
- Extension of STC previously used only to cluster text documents - Based on a Suffix-Tree (construction O(n))
88
Conclusion
Massimiliano Ruocco – Event Cluster Detection on Flickr Images using a Suffix-Tree Structure – IEEE ISM2010
- Novel algorithm for event cluster extraction: - from large amount of Flickr images - Multi-user photo collection - Incremental clustering algorithm
- Extension of STC previously used only to cluster text documents - Based on a Suffix-Tree (construction O(n)) - Automatic annotation of clusters
89
Conclusion
Massimiliano Ruocco – Event Cluster Detection on Flickr Images using a Suffix-Tree Structure – IEEE ISM2010
- Novel algorithm for event cluster extraction: - from large amount of Flickr images - Multi-user photo collection - Incremental clustering algorithm
- Extension of STC previously used only to cluster text documents - Based on a Suffix-Tree (construction O(n)) - Automatic annotation of clusters - Noise reduction in the tag using extended vocabulary for stopword
removal
90
Conclusion
Massimiliano Ruocco – Event Cluster Detection on Flickr Images using a Suffix-Tree Structure – IEEE ISM2010
- Novel algorithm for event cluster extraction: - from large amount of Flickr images - Multi-user photo collection - Incremental clustering algorithm
- Extension of STC previously used only to cluster text documents - Based on a Suffix-Tree (construction O(n)) - Automatic annotation of clusters - Noise reduction in the tag using extended vocabulary for stopword
removal - Spatial and Time information considered
91
Conclusion
- Novel algorithm for event cluster extraction:
  - From a large number of Flickr images
  - Multi-user photo collections
  - Incremental clustering algorithm
- Extension of STC, previously used only to cluster text documents
- Based on a suffix tree (construction in O(n))
- Automatic annotation of clusters
- Noise reduction in the tags using an extended vocabulary for stopword removal
- Spatial and temporal information considered
- Analysis of different granularities of time and space
93
Thanks (谢谢) for the attention!
QUESTIONS?
http://www.idi.ntnu.no/~ruocco/