Multimodal Alignment of Scholarly Documents and Their Presentations Bamdad Bahrani JCDL 2013 Submission Feb 2013

Page 1

Multimodal Alignment of Scholarly Documents and Their Presentations

Bamdad Bahrani

JCDL 2013 Submission

Feb 2013

Page 2

Motivation

- How many papers do you read every week?
- How many do you read deeply?
- How many do you just skim?
- Title, abstract and conclusion: enough?
- A summary of the paper: the most important issues

Introduction Analysis Method Experiment & Result Conclusion

Page 3

Motivation

- Slide presentation as a summary
  - It includes important content from the paper
  - It is made by the same author
- But
  - Not detailed enough
  - Misses some technical parts of the paper

Page 4

Introduction

- The paper and its slide presentation
- Alignment map

Page 5

Previous Works

- Hayama et al., 2005
  - Japanese technical papers and presentation sheets
  - Using HMM
- Kan, 2007
  - SlideSeer
  - Crawling of paper-presentation pairs, aligning them, and a GUI
- Beamer and Girju, 2009
  - Detailed analysis of different similarity measures

Only textual content

Page 6

Slide Analysis

Slide Types

| Slide Type | Share |
|---|---|
| Nil | 17% |
| Outline | 5% |
| Image | 12% |
| Drawing | 9% |
| Table | 1% |
| Other | 56% |

Page 7

Error Analysis

| Slide Type | Incorrectly aligned in baseline | Common reason |
|---|---|---|
| Nil | 64% | Doesn’t know where to align; aligns to best fit |
| Outline | 36% | Contains the names of several sections; aligns to the longest one |
| Image | 81% | Very little text available |
| Drawing | 53% | Noisy data: lots of shapes and text boxes |
| Table | 50% | Little text, noisy data |

Around 70% show “Evaluation and Result”.

Page 8

Alignment Modalities

- Text Similarity
  - Between each slide and each section
  - The core aligner unit
  - The baseline
  - A cosine similarity measure over TF-IDF vectors
- Linear Ordering
  - The ordering between slides and sections is monotonic
- Visual appearance of slides
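The text similarity component can be illustrated with a minimal sketch: TF-IDF vectors for every slide and every section, compared by cosine similarity. The tokenizer, the smoothed IDF formula, and all function names here are assumptions for illustration; the slides do not specify these details.

```python
import math
from collections import Counter

def tokenize(text):
    # Assumption: lowercase whitespace tokenization; the slides name no tokenizer.
    return text.lower().split()

def tfidf_vectors(docs):
    """Build one sparse TF-IDF vector (a dict term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    doc_freq = Counter(term for toks in tokenized for term in set(toks))
    # Assumption: smoothed IDF, one common variant among several.
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1 for t, c in doc_freq.items()}
    return [{t: tf * idf[t] for t, tf in Counter(toks).items()} for toks in tokenized]

def cosine(u, v):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def similarity_matrix(slides, sections):
    """sim[i][j] is the cosine similarity between slide i and section j."""
    vecs = tfidf_vectors(slides + sections)
    slide_vecs, section_vecs = vecs[:len(slides)], vecs[len(slides):]
    return [[cosine(s, sec) for sec in section_vecs] for s in slide_vecs]

slides = ["svm image classifier features", "related work hmm alignment"]
sections = ["previous work used hmm alignment of papers",
            "we train a linear svm classifier on image features"]
sim = similarity_matrix(slides, sections)
```

Taking the highest-scoring section in each row of `sim` gives a text-only alignment of each slide.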

Page 9

Text Extraction Unit

- Presentation: MS PowerPoint -> VB compiler -> Slides
  1. Slide Title text
  2. Slide Body text
  3. Slide Number
- Paper: PDF -> PDFx Parser (via Python) -> XML
  1. Section Title
  2. Section Body

Page 10

Slide Image Classifier Unit

Slides -> Take Snapshot -> Image -> Image Classifier -> one of:

1. Text
2. Outline
3. Drawing
4. Results

Page 11

Image Class Instructions

1. Text
   - Text similarity alignment weight: increase 2/3
2. Outline
   - Text similarity alignment weight: decrease 1/3
   - Linear ordering alignment weight: decrease 1/3
3. Drawing
   - Uniform probability for all weights
4. Result
   - Exceptional rule: align directly to the “Experiment and Result” section
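One possible reading of these per-class rules is sketched below. The slides do not say whether "increase 2/3" means scaling, adding, or setting the weight, so this sketch assumes multiplicative scaling from a neutral weight of 1.0. The function names, the weighted-sum fusion, and the substring match used to find the results section are all hypothetical.

```python
def class_multipliers(slide_class):
    """Return (text_weight, ordering_weight) for a slide class.

    Assumption: "increase 2/3" scales a weight by (1 + 2/3) and
    "decrease 1/3" scales it by (1 - 1/3), starting from 1.0.
    """
    if slide_class == "Text":
        return 1 + 2 / 3, 1.0
    if slide_class == "Outline":
        return 1 - 1 / 3, 1 - 1 / 3
    return 1.0, 1.0  # "Drawing": uniform weights

def align_slide(slide_class, text_scores, order_scores, section_titles):
    """Pick a section index for one slide from per-section scores."""
    if slide_class == "Result":
        # Exceptional rule: go straight to the results section, if one exists.
        for idx, title in enumerate(section_titles):
            if "result" in title.lower():
                return idx
        # No such section found: fall through to uniform weights.
    w_text, w_order = class_multipliers(slide_class)
    # Assumption: fusion is a simple weighted sum of the two scores.
    fused = [w_text * t + w_order * o for t, o in zip(text_scores, order_scores)]
    return max(range(len(fused)), key=fused.__getitem__)

titles = ["Introduction", "Method", "Experiment and Result"]
text_scores = [0.1, 0.5, 0.2]
order_scores = [0.6, 0.1, 0.1]
as_drawing = align_slide("Drawing", text_scores, order_scores, titles)
as_text = align_slide("Text", text_scores, order_scores, titles)
as_result = align_slide("Result", text_scores, order_scores, titles)
```

With the same scores, the slide lands on a different section depending on its class, which is the effect the class-specific weights are meant to have.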

Page 12

Image Classifier experiment and result

- 750 manually annotated slides
- Linear SVM
  - Feature extraction: Histogram of Oriented Gradients
  - Blurring filters
  - Normalization
- 10-fold cross-validation

| Image Class | Text | Outline | Drawing | Result | Average |
|---|---|---|---|---|---|
| Correctly Classified | 86% | 95% | 83% | 84% | 87.2% |
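The 10-fold protocol on 750 slides can be sketched as follows: the indices are shuffled and split into ten folds of 75, and each fold serves once as the test set. The shuffling, the seed, and the function name are assumptions; the slides do not describe how the folds were drawn, and the classifier itself is omitted here.

```python
import random

def k_fold_indices(n_items, k=10, seed=0):
    """Yield (train_indices, test_indices) for each of k folds."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)  # assumption: random fold assignment
    folds = [indices[i::k] for i in range(k)]
    for i, test_fold in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test_fold

# 750 annotated slides, 10 folds: 675 for training and 75 for testing per round.
splits = list(k_fold_indices(750, k=10))
```

Per-class accuracy would then be computed on each test fold and averaged over the ten rounds.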

Page 13

Experiments

- Experiment 1:
  - Baseline
  - Paragraph-to-slide alignment
  - Only textual data
- Experiment 2:
  - Section-to-slide alignment
  - Only textual data

Page 14

Experiments

- Experiment 3:
  - The effect of Linear Ordering alignment was added.
  - Textual data and ordering information
- Experiment 4:
  - The effect of Image Classification was added.
  - Textual data, ordering information and visual content
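The linear ordering constraint added in Experiment 3 (slides and sections advance monotonically) can be enforced in several ways. The sketch below uses dynamic programming to pick, for each slide, a section index that never decreases while maximizing total similarity. It illustrates the constraint only and is not necessarily the method used in the paper; the function name and the example scores are made up.

```python
def monotonic_alignment(sim):
    """sim[i][j] is the similarity of slide i to section j.

    Returns one section index per slide such that indices never decrease
    and the summed similarity is maximal.
    """
    n_slides, n_sections = len(sim), len(sim[0])
    neg_inf = float("-inf")
    best = [[neg_inf] * n_sections for _ in range(n_slides)]
    back = [[0] * n_sections for _ in range(n_slides)]
    best[0] = list(sim[0])
    for i in range(1, n_slides):
        run_best, run_arg = neg_inf, 0
        for j in range(n_sections):
            # Track the best score of the previous slide over sections 0..j.
            if best[i - 1][j] > run_best:
                run_best, run_arg = best[i - 1][j], j
            best[i][j] = run_best + sim[i][j]
            back[i][j] = run_arg
    # Trace back from the best final section.
    j = max(range(n_sections), key=lambda c: best[-1][c])
    path = [j]
    for i in range(n_slides - 1, 0, -1):
        j = back[i][j]
        path.append(j)
    return path[::-1]

# Illustrative scores: slide 2 prefers section 1 on text alone,
# which would break the ordering after slide 1 was aligned to section 2.
sim = [
    [0.9, 0.1, 0.0],
    [0.1, 0.2, 0.7],
    [0.2, 0.6, 0.3],
    [0.0, 0.1, 0.8],
]
alignment = monotonic_alignment(sim)
greedy = [max(range(3), key=row.__getitem__) for row in sim]
```

Here `greedy` (the per-slide best match) jumps backwards, while `alignment` stays in document order.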

Page 15

Results

[Bar chart: accuracy (%) per system]

| System | Accuracy (%) |
|---|---|
| SlideSeer | 41.2 |
| Beamer 1 | 50 |
| Exp 1 (Baseline) | 52.1 |
| Exp 2 (Section) | 60.7 |
| Exp 3 (Ordering) | 66.8 |
| Beamer 2 | 75 |
| Exp 4 (Image Class) | 77.3 |

Improvement from Exp 1 to Exp 4: 25%
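The 25% figure on this slide appears to be the gain from the baseline to the full system. A quick check of the step-by-step gains follows, assuming the seven accuracy values map to the labels in the order they are listed (SlideSeer, Beamer 1, Exp 1, Exp 2, Exp 3, Beamer 2, Exp 4):

```python
# Accuracy (%) for the four experiments, under the label order assumed above.
accuracy = {"Exp 1": 52.1, "Exp 2": 60.7, "Exp 3": 66.8, "Exp 4": 77.3}

names = list(accuracy)
# Gain in percentage points contributed by each added component.
gains = {
    f"{a} -> {b}": round(accuracy[b] - accuracy[a], 1)
    for a, b in zip(names, names[1:])
}
total_gain = round(accuracy["Exp 4"] - accuracy["Exp 1"], 1)

for step, gain in gains.items():
    print(f"{step}: +{gain} points")
print(f"Exp 1 -> Exp 4: +{total_gain} points")
```

Under that assumption, section-level alignment, linear ordering, and image classification each add several points, for roughly 25 points in total.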

Page 16

Conclusion

- Many slides with images and drawings
- Textual data is not enough
- Taking advantage of graphical features of slides

Page 17

Future Tasks

- Bigger dataset
- More efficient text similarity measures
- Differentiate between Title and Body text weights
- Support more input file formats
- A GUI to view aligned documents

Page 18

Thank you…!

Page 19

System Architecture

[Diagram components]

- Input: Presentation
- Input: Document
- Text Extraction
- Textual Similarity
- Linear Ordering
- Slide Image Classifier (classes: nil, 1. Text, 2. Index, 3. Drawing, 4. Results)
- Multimodal Fusion
- Output: Alignment