Story Compression: Aggregating News Feeds
Joseph W. Barker
Advisor: James W. Davis
Ohio State University
What is Story Compression?
• News broadcasts from multiple sources tend to cover the same stories
• Stories have content overlap
– General content covered by multiple sources
– Specific content covered by one source
• Information gathering
– Wastes time if all broadcasts are viewed (general content → redundancy)
– Misses information if only one broadcast is viewed (specific content)
• Answer: Story Compression
– Detect general vs. specific content and create a single story from all broadcasts with no redundancy
Overview
• Divide story into content segments (i.e., each a single idea)
– Video shot (continuous scene) detection
• Compare segments
– Speech/text contains most of the informational content
– Word similarity → segment similarity
• Detect specific vs. general segments
Word Similarity
• Focus on concepts rather than specific word matching
• Graph-based hierarchy of word-concept relationships
– E.g., WordNet
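• Malik et al. 2007
– $\mathrm{sim}(w_1, w_2) = \dfrac{2 \cdot \mathrm{dist}(\mathrm{root}, \mathrm{LCS}(w_1, w_2))}{\mathrm{dist}(\mathrm{root}, w_1) + \mathrm{dist}(\mathrm{root}, w_2)}$
• Li et al. 2003
– $\mathrm{sim}(w_1, w_2) = e^{-\alpha\,\mathrm{dist}(w_1, w_2)} \tanh\big(\beta\,\mathrm{dist}(\mathrm{root}, \mathrm{LCS}(w_1, w_2))\big)$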
[Figure: example word-concept hierarchy with nodes Object, Mammal, Feline, Canine, Cat, Poodle]
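The two word-similarity formulas can be tried on a toy hierarchy. The sketch below is illustrative only: the parent links are an assumed arrangement of the node names from the figure (the slide text does not give the edges), dist(root, x) is taken as the depth of x, dist(w1, w2) as the path length through the LCS, and the default values of `alpha` and `beta` are placeholders rather than values taken from the slides.

```python
import math

# Hypothetical child -> parent links; the edge structure is an assumption.
PARENT = {
    "mammal": "object",
    "feline": "mammal",
    "canine": "mammal",
    "cat": "feline",
    "poodle": "canine",
}

def path_to_root(word):
    """Return [word, parent, ..., root]."""
    path = [word]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path

def depth(word):
    """dist(root, word): number of edges from the root down to word."""
    return len(path_to_root(word)) - 1

def lcs(w1, w2):
    """Lowest common subsumer: the deepest ancestor shared by w1 and w2."""
    ancestors = set(path_to_root(w1))
    for node in path_to_root(w2):  # walks upward, so the first hit is deepest
        if node in ancestors:
            return node
    raise ValueError("no common ancestor")

def sim_depth_ratio(w1, w2):
    """2 * dist(root, LCS) / (dist(root, w1) + dist(root, w2))."""
    total = depth(w1) + depth(w2)
    if total == 0:
        return 1.0  # both words are the root
    return 2.0 * depth(lcs(w1, w2)) / total

def sim_exp_tanh(w1, w2, alpha=0.2, beta=0.45):
    """exp(-alpha * dist(w1, w2)) * tanh(beta * dist(root, LCS))."""
    common = lcs(w1, w2)
    path_len = (depth(w1) - depth(common)) + (depth(w2) - depth(common))
    return math.exp(-alpha * path_len) * math.tanh(beta * depth(common))

print(sim_depth_ratio("cat", "feline"), sim_depth_ratio("cat", "poodle"))
print(sim_exp_tanh("cat", "feline"), sim_exp_tanh("cat", "poodle"))
```

Under these assumptions both measures score "cat" closer to "feline" than to "poodle", because the shared concept sits deeper in the hierarchy.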
Segment Similarity
• Sentence similarity?
– Segments range from sub-sentence to multiple sentences
– Also, sentence boundaries (when multiple) are poorly defined
– Sentence similarity emphasizes grammar/word order; won’t work
• If ordering is problematic, use unordered groups instead
• Solution: Graph collapsing
– Group of nodes collapsed to a single node by summing edge weights
– Inspired by spectral clustering and the notion of a random walk on graphs
– Random walk between groups is equivalent to a random walk between collapsed nodes
[Figure labels: Segment Similarity, Word Similarity]
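Graph collapsing as described here (summing edge weights between groups of nodes) can be sketched in a few lines. This is a minimal illustration, not the author's implementation: the function name `collapse_graph`, the list-of-lists representation, and the toy weights are all assumptions, and any normalization applied in the original work is not stated in the slides.

```python
def collapse_graph(weights, groups):
    """Collapse groups of nodes into single nodes by summing edge weights.

    weights: square list of lists; weights[i][j] is the similarity of
             words i and j.
    groups:  list of lists of word indices, one list per segment.
    Returns a len(groups) x len(groups) matrix of summed weights.
    """
    k = len(groups)
    collapsed = [[0.0] * k for _ in range(k)]
    for a, group_a in enumerate(groups):
        for b, group_b in enumerate(groups):
            collapsed[a][b] = sum(
                weights[i][j] for i in group_a for j in group_b
            )
    return collapsed

# Toy word-similarity matrix for four words (made-up values).
word_sim = [
    [1.0, 0.5, 0.0, 0.0],
    [0.5, 1.0, 0.0, 0.25],
    [0.0, 0.0, 1.0, 0.5],
    [0.0, 0.25, 0.5, 1.0],
]
# Two segments: words {0, 1} and words {2, 3}.
segment_sim = collapse_graph(word_sim, [[0, 1], [2, 3]])
print(segment_sim)
```

Because the sum runs over unordered pairs of words, word order inside a segment has no effect on the result, which matches the motivation for using unordered groups.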
Most Unique Segments
• Manual segmentation employed
• Specific content
• Uniqueness → overall dissimilarity
• Perfect dissimilarity → similarity matrix rows/columns zero except for the diagonal
• Thus, sum of row/column should approach zero for the most dissimilar segments
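The row-sum criterion can be illustrated as follows. This is a sketch under one assumption that the slides leave open: the diagonal (self-similarity) entry is excluded from each row sum, so that a perfectly dissimilar segment scores exactly zero. The function names and the toy matrix are hypothetical.

```python
def uniqueness_scores(sim):
    """Row sums of a segment-similarity matrix, excluding the diagonal.

    Lower scores mean the segment is less similar to everything else.
    """
    n = len(sim)
    return [sum(sim[i][j] for j in range(n) if j != i) for i in range(n)]

def most_unique(sim, top=1):
    """Indices of the `top` segments with the lowest off-diagonal row sum."""
    scores = uniqueness_scores(sim)
    return sorted(range(len(sim)), key=lambda i: scores[i])[:top]

# Toy segment-similarity matrix (made-up values).
segment_sim = [
    [1.0, 0.5, 0.25],
    [0.5, 1.0, 0.0],
    [0.25, 0.0, 1.0],
]
print(uniqueness_scores(segment_sim))
print(most_unique(segment_sim))
```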
Most Related Segments
• General content
• Related → group is self-similar
• Perfect self-similarity → similarity matrix elements for the group are all one
• Thus, sum of elements should approach $n^2$ ($n$ = number in group)
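The group-sum criterion can be sketched the same way. The code below assumes similarities scaled to [0, 1] and divides the sum by n² so that groups of different sizes are comparable; that normalization and the names `group_relatedness` and `rank_groups` are illustrative choices, not something the slides specify.

```python
from itertools import combinations

def group_relatedness(sim, group):
    """Sum of sim[i][j] over the group, divided by n**2.

    Equals 1.0 when every pair in the group (including each member with
    itself) has similarity one.
    """
    n = len(group)
    total = sum(sim[i][j] for i in group for j in group)
    return total / (n * n)

def rank_groups(sim, size=2):
    """All groups of `size` segments, most self-similar first."""
    groups = combinations(range(len(sim)), size)
    return sorted(groups, key=lambda g: group_relatedness(sim, g), reverse=True)

# Toy matrix: segments 0 and 1 are identical, segment 2 is unrelated.
segment_sim = [
    [1.0, 1.0, 0.0],
    [1.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
print(rank_groups(segment_sim)[0])
```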
[Figure: Segment Pair Similarity (higher is better). x-axis: segment pairs (sorted); y-axis: similarity]
[Figure: Segment Uniqueness (lower is better). x-axis: segments (sorted); y-axis: uniqueness]
[Figure panels: Perfect dissimilarity; Somewhat dissimilar; Perfect similarity; Somewhat similar]
Automatic Segment Detection
• How to decide boundaries between segments?
– No sentence boundaries, so text is not a strong indicator
• Shot detection: Detect visual change from one scene to another
• Common techniques:
– Temporal extent
  • Consecutive: compare sequential pairs of frames
  • Key frame: compare to “key” frame of previous segment
– Distance measures
  • Pixel-based: Sum of Absolute Differences (SAD), Sum of Squared Differences (SSD), Normalized Cross-Correlation (NCC)
  • Color-based (histograms): χ², Bhattacharyya
  • Texture-based: Scale Invariant Feature Transform (SIFT)
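The pixel-based and histogram-based measures, and the two temporal extents, can be sketched on frames represented as flat lists of gray levels (0 to 255). Everything here is a simplified illustration: the 16-bin histogram, the thresholds, the choice of the first frame of a segment as its key frame, and all function names are assumptions, and SIFT matching is omitted.

```python
import math

def sad(a, b):
    """Sum of Absolute Differences between two frames."""
    return sum(abs(x - y) for x, y in zip(a, b))

def ssd(a, b):
    """Sum of Squared Differences between two frames."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def ncc(a, b):
    """Normalized cross-correlation; 1.0 means a perfect linear match."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    return sum(x * y for x, y in zip(da, db)) / denom if denom else 0.0

def histogram(pixels, bins=16, max_value=256):
    """Normalized intensity histogram (bin count is an assumed setting)."""
    counts = [0] * bins
    for p in pixels:
        counts[p * bins // max_value] += 1
    return [c / len(pixels) for c in counts]

def chi2(h1, h2):
    """Chi-squared distance between two normalized histograms."""
    return sum((p - q) ** 2 / (p + q) for p, q in zip(h1, h2) if p + q > 0)

def bhattacharyya_coefficient(h1, h2):
    """Histogram overlap; 1.0 for identical normalized histograms."""
    return sum(math.sqrt(p * q) for p, q in zip(h1, h2))

def consecutive_boundaries(frames, distance, threshold):
    """Compare each frame with the one before it."""
    return [t for t in range(1, len(frames))
            if distance(frames[t - 1], frames[t]) > threshold]

def key_frame_boundaries(frames, distance, threshold):
    """Compare each frame with the key (here: first) frame of its segment."""
    boundaries, key = [], frames[0]
    for t in range(1, len(frames)):
        if distance(key, frames[t]) > threshold:
            boundaries.append(t)
            key = frames[t]  # first frame of the new segment becomes the key
    return boundaries

# Four tiny "frames": two dark, then two bright (made-up data).
frames = [[10, 10, 10, 10], [12, 10, 10, 10],
          [200, 200, 200, 200], [201, 200, 200, 200]]
print(consecutive_boundaries(frames, sad, threshold=100))
print(key_frame_boundaries(frames, sad, threshold=100))
```

Note that SAD, SSD and χ² are distances (larger means more different), while NCC and the Bhattacharyya coefficient are similarities, so the comparison against the threshold would need to be reversed for the latter two.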
Towards Improving Segment Detection
• Common methods give mediocre performance
• May be due to only examining a single temporal extent
• Possible solution: Use graph collapsing to examine all temporal extents simultaneously
• Sum of blocks on the diagonal approaches $n^2$ if the members are in one segment
• Sum of block anti-diagonal approaches zero if the corner is a segment boundary
• Current problem: Scale of valleys (boundaries) varies quadratically with segment size, so simple peak finding is not good enough
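One possible reading of the two block sums is sketched below. It assumes a frame-by-frame similarity matrix with values in [0, 1], takes a "diagonal block" to be the square sub-matrix covering frames start to end, and takes the second quantity to be the off-diagonal block whose corner touches the diagonal at a candidate boundary t. The slides do not define these precisely, so the function names, the `width` parameter, and this interpretation are all assumptions.

```python
def diagonal_block_sum(sim, start, end):
    """Sum of sim[i][j] for start <= i, j < end.

    Approaches (end - start) ** 2 when all frames belong to one segment.
    """
    return sum(sim[i][j] for i in range(start, end) for j in range(start, end))

def cross_block_sum(sim, t, width):
    """Sum of similarities between frames [t - width, t) and [t, t + width).

    Approaches zero when t is a segment boundary, because frames before t
    are then dissimilar to frames from t onward.
    """
    lo, hi = max(0, t - width), min(len(sim), t + width)
    return sum(sim[i][j] for i in range(lo, t) for j in range(t, hi))

# Toy similarity matrix: two segments of three frames each (made-up data).
n = 6
frame_sim = [[1.0 if (i < 3) == (j < 3) else 0.0 for j in range(n)]
             for i in range(n)]

print(diagonal_block_sum(frame_sim, 0, 3))    # frames 0..2, one segment
print(cross_block_sum(frame_sim, 3, width=3)) # candidate boundary at frame 3
print(cross_block_sum(frame_sim, 2, width=2)) # candidate inside a segment
```

Both sums grow with the number of entries in the block, which is consistent with the quadratic scaling problem noted above.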
[Figure: Shot Detection: Key Frame (First). x-axis: normalized threshold (1 = perfect match); y-axis: F score. Curves: SAD, SSD, NCC, SIFT-MR, BATTA-H16, CHI2-H16]
[Figure: Shot Detection: Consecutive. x-axis: normalized threshold (1 = perfect match); y-axis: F score. Curves: SAD, SSD, NCC, SIFT-MR, BATTA-H16, CHI2-H16]
Method     F      TP     FP     FN
SAD        0.747  0.596  0.081  0.322
SSD        0.746  0.595  0.044  0.362
NCC        0.770  0.626  0.009  0.365
BATTA-H16  0.779  0.638  0.125  0.237
CHI2-H16   0.210  0.117  0.005  0.878
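The slides do not define the F column, but its values are consistent with the standard F1 score, F = 2·TP / (2·TP + FP + FN), applied to the TP, FP and FN columns. The check below recomputes it from the table; the function name `f_score` is illustrative.

```python
def f_score(tp, fp, fn):
    """F1 score from true-positive, false-positive and false-negative rates."""
    return 2 * tp / (2 * tp + fp + fn)

# (reported F, TP, FP, FN) copied from the table above.
table = {
    "SAD": (0.747, 0.596, 0.081, 0.322),
    "SSD": (0.746, 0.595, 0.044, 0.362),
    "NCC": (0.770, 0.626, 0.009, 0.365),
    "BATTA-H16": (0.779, 0.638, 0.125, 0.237),
    "CHI2-H16": (0.210, 0.117, 0.005, 0.878),
}

for method, (reported, tp, fp, fn) in table.items():
    computed = f_score(tp, fp, fn)
    print(f"{method:10s} reported={reported:.3f} computed={computed:.4f}")
```

Every recomputed value agrees with the reported F to within 0.001, the precision to be expected from inputs rounded to three decimals.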
[Figure: anti-diagonal sum (y-axis) vs. frame (x-axis)]
Conclusion and Future Work
• Graph collapsing can be used to derive group similarity from the similarity of group members
• Additionally, it can be used to evaluate uniqueness of objects and relatedness of groups
– Tested with text, working on video
• Future work
– Finalize graph collapsing video segmentation
– Expand word similarity to include multiple languages
– Investigate sub-image feature extraction/matching
– Examine other sources (e.g., YouTube)
Example segment pairs
#1) ABC: “…declaring a public health emergency….” / NBC: “…declaring a public health emergency….”
#2) NBC: “…after the virus killed….” / CBS: “…sadly had claimed 18 lives….”
#3) ABC: “…declaring a public health emergency….” / NBC: “…to repeat, declared a public health emergency….”
#4) ABC: “…they’ve set up a special tent….” / CBS: “…a tent has been setup….”
Example single segments
#1) ABC: “In Boston today, the mayor sounded the alarm”
#2) ABC: “…moved onto the upper respiratory, which is a lot of coughing…”
#3) ABC: “…stay home when you are sick…”
#4) ABC: “…I’ve never been hit by a Mack truck…”
#5) CBS: “…is on the panel that decides what goes in the vaccine…”
#6) CBS: “…after confirmed cases of flu reach 700…”
[Figures: Consecutive Shot Detection Across All Stories; Shot Detection on story FLU; Video similarity; Sum of diagonal blocks. Axis labels: Frame Block Start, Block End. Source labels: ABC, CBS, NBC]