Mediaglobe - Medienanalyseverfahren

Page 1: Mediaglobe - Medienanalyseverfahren
Page 2: Mediaglobe - Medienanalyseverfahren

Dr. Harald Sack
Hasso-Plattner-Institut for IT-Systems Engineering

University of Potsdam

Media Analysis Methods (Medienanalyseverfahren)

Page 3: Mediaglobe - Medienanalyseverfahren

• HPI was founded in October 1998 as a public-private partnership

• HPI research and teaching focus on IT Systems Engineering

• 10 professors & 100 research staff

• 450 Bachelor / Master students

• HPI is winner of CHE-Ranking 2010

http://hpi.uni-potsdam.de/

Hasso-Plattner-Institut für Softwaresystemtechnik, Universität Potsdam

Page 4: Mediaglobe - Medienanalyseverfahren

• Research topics

• Semantic Web Technologies

• Ontological Engineering

• Information Retrieval

• Multimedia Analysis & Retrieval

• Social Networking

• Data/Information Visualization

• Research projects:

Hasso-Plattner-Institut für Softwaresystemtechnik
Semantic Technologies & Multimedia Retrieval Research Group

Page 5: Mediaglobe - Medienanalyseverfahren

Conceptual workflow

Semantic Search Engine

Media Analysis
‣ Structural Video Analysis
‣ Intelligent Character Recognition
‣ Face Detection & Clustering
‣ Audio Mining
‣ Visual Concept Detection

Semantic Analysis
‣ Named Entity Recognition
‣ Context Analysis
‣ Semantic Annotation

Graphical User Interface
‣ Faceted Search
‣ Explorative Search
‣ Fine-grained User Annotation

Distribution / Production
‣ Media Asset Management

Digitization | Metadata | Rights

Page 6: Mediaglobe - Medienanalyseverfahren

Media Analysis Methods

‣ Structural Video Analysis
‣ Intelligent Character Recognition
‣ Face Detection & Clustering
‣ Audio Mining
‣ Visual Concept Detection

Page 7: Mediaglobe - Medienanalyseverfahren


Page 8: Mediaglobe - Medienanalyseverfahren


Page 9: Mediaglobe - Medienanalyseverfahren

Structural Video Analysis

‣ Decomposition of the AV data into media fragments, taking coherence of content into account

[Figure: segmentation hierarchy with the levels video, scenes, shots, subshots, frames, and key frames]

Page 10: Mediaglobe - Medienanalyseverfahren

Shot Boundary Detection

‣ Automatic identification of
  ‣ Hard cuts
  ‣ Defects, e.g. drop outs, white outs, etc.
  ‣ Soft cuts, e.g. fade-in/out, dissolve, wipe, cross-fade, etc.

‣ Analytical shot boundary detection
  ‣ Based on luminance, chrominance, and edge distribution in combination with further characteristic image properties
  ‣ Adaptive threshold computation

‣ Shot boundary detection with machine learning methods
  ‣ Support Vector Machines
  ‣ Random Forest classifier

[Figure: frame sequence along a time axis]

Page 11: Mediaglobe - Medienanalyseverfahren

Shot Boundary Detection

‣ Identification of hard cuts based on
  ‣ Luminance, chrominance, and their derivatives
  ‣ Edge distribution, edge density

[Figure: consecutive frames 573 to 578]

Page 12: Mediaglobe - Medienanalyseverfahren

Shot Boundary Detection

Hard cut at frame i: if

$D_a(i, i-1) > th_a(i)$ and $D_a(i+1, i) < th_a(i)$

is true for all subregions a.

Adaptive threshold:

$th_a(i) = \alpha \cdot \left( \sum_{k=i-W}^{i+W-1} D_a(k, k-1) - D_a(i, i-1) \right) + \beta$

Window size = 4 (W = 2)

Decompose each frame into 4 subregions (a = 1, ..., 4)

$D_a(i, i-1)$ ... histogram difference (L2 norm) between frames i and i-1 of subregion a

$th_a(i)$ ... adaptive threshold for frame i of subregion a

[Figure: frames i-3 to i+2, each split into subregions 1 to 4]
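The hard cut test on this slide can be sketched in a few lines of Python. This is a minimal sketch, assuming the threshold reads th_a(i) = α · (Σ_{k=i−W}^{i+W−1} D_a(k, k−1) − D_a(i, i−1)) + β; the function names, the default values of `alpha` and `beta`, and the input layout (one list of frame differences per subregion) are illustrative assumptions, not taken from the slides.

```python
from math import sqrt
from typing import Sequence


def histogram_l2(h1: Sequence[float], h2: Sequence[float]) -> float:
    """L2 norm of the bin-wise difference between two histograms."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(h1, h2)))


def adaptive_threshold(d: Sequence[float], i: int, w: int,
                       alpha: float, beta: float) -> float:
    """th(i) = alpha * (sum_{k=i-w}^{i+w-1} d[k] - d[i]) + beta.

    d[k] holds D(k, k-1), the histogram difference between frames k and k-1.
    """
    window_sum = sum(d[k] for k in range(i - w, i + w))
    return alpha * (window_sum - d[i]) + beta


def is_hard_cut(region_diffs: Sequence[Sequence[float]], i: int, w: int = 2,
                alpha: float = 1.0, beta: float = 0.0) -> bool:
    """True if both conditions hold at frame i in every subregion.

    alpha and beta are placeholder defaults (assumption), to be tuned on data.
    """
    for d in region_diffs:
        # The window i-w .. i+w-1 and the lookahead i+1 must fit into d.
        if i - w < 0 or i + max(w - 1, 1) >= len(d):
            return False
        th = adaptive_threshold(d, i, w, alpha, beta)
        if not (d[i] > th and d[i + 1] < th):
            return False
    return True


# Four subregions with an isolated difference peak at frame 3.
diffs = [[0.0, 1.0, 1.0, 10.0, 1.0, 1.0] for _ in range(4)]
print(is_hard_cut(diffs, 3))  # peak exceeds the local threshold in all regions
```

Subtracting the current difference from the window sum keeps a single large peak from raising its own threshold, which is presumably why the term appears in the formula.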

Page 13: Mediaglobe - Medienanalyseverfahren

Shot Boundary Detection

‣ Identification and differentiation of defects
  ‣ Drop out: histogram/chrominance difference analysis
  ‣ Flashlight / white out: histogram/chrominance difference analysis

[Figure: example frames i, i+1 and i+8 to i+13]

Page 14: Mediaglobe - Medienanalyseverfahren


Page 15: Mediaglobe - Medienanalyseverfahren

Shot Boundary Detection

‣ Identification of soft cuts, e.g. fade-in/fade-out

‣ Image features used:
  ‣ Luminance histograms
  ‣ Entropy progression
  ‣ Motion vectors
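To illustrate why the entropy progression is a useful feature for fades, the following sketch computes the Shannon entropy of a luminance histogram. A frame of one uniform brightness has entropy 0, so the entropy falls toward zero during a fade-out. The `looks_like_fade_out` rule and its `final_max` parameter are hypothetical illustrations, not the method used in the slides.

```python
from math import log2
from typing import Sequence


def luminance_entropy(pixels: Sequence[int], bins: int = 256) -> float:
    """Shannon entropy (in bits) of the luminance histogram of one frame."""
    n = len(pixels)
    if n == 0:
        return 0.0
    hist = [0] * bins
    for p in pixels:
        hist[p] += 1
    return -sum((c / n) * log2(c / n) for c in hist if c > 0)


def looks_like_fade_out(entropies: Sequence[float], final_max: float = 0.1) -> bool:
    """Hypothetical heuristic: entropy never rises and ends close to zero."""
    if len(entropies) < 2:
        return False
    non_increasing = all(b <= a for a, b in zip(entropies, entropies[1:]))
    return non_increasing and entropies[-1] <= final_max < entropies[0]


black_frame = [0] * 16          # uniform frame, no information
two_level_frame = [0, 255] * 8  # two equally likely luminance values
print(luminance_entropy(black_frame), luminance_entropy(two_level_frame))
```

In practice such a rule would be combined with the luminance histograms and motion vectors listed on the slide, since entropy alone cannot tell a fade from a slow pan into a dark region.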

Page 16: Mediaglobe - Medienanalyseverfahren


Page 17: Mediaglobe - Medienanalyseverfahren
Page 18: Mediaglobe - Medienanalyseverfahren


Page 19: Mediaglobe - Medienanalyseverfahren

Fig. 1. Workflow of the proposed text detection method. (b) is the vertical edge map of (a). (c) is the vertical dilation map of (b). (d) is the binary map of (c). (e) is the result map of the subsequent connected component analysis. (f) shows the binary map after the adaptive projection profile refinement. (g) is the final detection result.

for text detection of natural scene images. The operator computes for each pixel the width of the most likely stroke containing the pixel. The output of the operator is a stroke-feature map, which has the same size as the input image, while each pixel represents the corresponding stroke width value of the input image.

3. TEXT DETECTION IN VIDEO IMAGES

Text detection is the first task of video OCR. Our approach determines whether a single frame of a video file contains text lines, for which a tight bounding box is returned. In order to manage detected text lines efficiently, we have defined a class "text line object" with the following properties: bounding box location (the top-left corner position) and bounding box size. After the first round of text detection, the refinement and the verification procedures ensure the validity of the detection results in order to reduce false alarms.

3.1. Text detector

Before performing the text detection process, a Gaussian smoothing filter is applied to the images that have an entropy value larger than a predefined threshold Tentr. For our purpose, Tentr = 5.25 has proven to work best.

We have developed an edge-based text detector, subsequently referred to as the edge text detector. The advantage of our detector is its computational efficiency compared to machine learning based approaches, because no computationally expensive training period is required. However, for visually different video sequences a parameter adaptation has to be performed. The best suited parameter combination of our method was learned from the test runs on the given test data.

Fig. 2. Workflow of the proposed adaptive text line refinement procedure

The processing workflow for a single frame is depicted in Fig. 1 (a-e). First, a vertical edge map is produced using the Sobel filter [8] (cf. Fig. 1 (b)). Then, the morphological dilation operation is adopted to link the vertical character edges together (cf. Fig. 1 (c)). Let MinW denote the detected minimal text line width. A rectangular kernel of size 1 × MinW is defined for the vertical dilation operator. Subsequently, a binary mask is generated by using Otsu's thresholding method [9]. Ultimately, we create a binary map after Connected Component

Intelligent Character Recognition

‣ Compared with traditional print OCR, video OCR is a demanding task
  ‣ Heterogeneous/low contrast
  ‣ Poor lighting conditions
  ‣ Distorted and occluded text
  ‣ Compression artifacts
  ‣ etc.
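The first three steps of the edge text detector in the excerpt above (Sobel vertical edge map, dilation, Otsu binarization) can be sketched in pure Python. This is only an illustration under stated assumptions: images are lists of rows with integer gray values 0..255, the 1 × MinW kernel is read as extending horizontally so that neighbouring character edges merge, and all function names are hypothetical. Gaussian pre-smoothing, connected component analysis, and the refinement stages are left out.

```python
from typing import List

Image = List[List[int]]  # grayscale rows, values 0..255


def sobel_vertical_edges(img: Image) -> Image:
    """Absolute horizontal Sobel gradient (responds to vertical edges), clipped to 255."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (img[y - 1][x + 1] + 2 * img[y][x + 1] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y][x - 1] - img[y + 1][x - 1])
            out[y][x] = min(255, abs(gx))
    return out


def dilate_rows(img: Image, min_w: int) -> Image:
    """Grayscale dilation: maximum over a horizontal window of about min_w pixels."""
    w = len(img[0])
    half = min_w // 2
    return [[max(row[max(0, x - half):min(w, x + half + 1)]) for x in range(w)]
            for row in img]


def otsu_threshold(img: Image) -> int:
    """Threshold t that maximizes the between-class variance (Otsu's method)."""
    hist = [0] * 256
    for row in img:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    sum_all = sum(i * c for i, c in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0, sum0 = 0, 0
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        var = w0 * w1 * (sum0 / w0 - (sum_all - sum0) / w1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t


def text_candidate_mask(img: Image, min_w: int) -> Image:
    """Edge map, dilation, and Otsu binarization; 1 marks candidate text pixels."""
    dilated = dilate_rows(sobel_vertical_edges(img), min_w)
    t = otsu_threshold(dilated)
    return [[1 if v > t else 0 for v in row] for row in dilated]


# Toy frame: dark left part, bright right part, i.e. one vertical edge.
frame = [[0] * 4 + [200] * 5 for _ in range(5)]
mask = text_candidate_mask(frame, min_w=3)
print(mask[2])
```

A production version would more likely use an image library for these operations; the pure Python form is chosen here only to keep the sketch self-contained.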

Page 20: Mediaglobe - Medienanalyseverfahren

Intelligent Character Recognition

‣ Preprocessing
  ‣ Character Identification
  ‣ Text Preprocessing
  ‣ Text Filtering
  ‣ Adaptation of script geometry (deskew)
  ‣ Image Quality Enhancement

‣ Optical Character Recognition (OCR)
  ‣ Standard OCR software (OCRopus)

‣ Postprocessing
  ‣ Lexical analysis
  ‣ Statistical / context-based filtering

[Figure: example text box "Rostock" passing through Text Filtering, Image Quality Enhancement, and OCR]

Page 21: Mediaglobe - Medienanalyseverfahren

Intelligent Character Recognition

‣ Character Identification
  ‣ Robust filter methods for efficient extraction of text candidates

‣ 25 fps results in 90,000 frames per 60 minutes
  ‣ Too expensive for complete filtering & OCR of all frames

Analytical Character Identification

• Edge Based Detection
  • DCT / Fourier Transformation
  • Sobel / Canny Edge Filter
  • Histogram of Oriented Gradients
  • Constant Gradient Variance

• Texture Based Detection
  • Local Binary Patterns
  • Spatial Variance

• Region Based Detection
  • Connected Component Analysis
  • Stroke Width Analysis

[Figure: frame and frame with candidate textboxes]
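The frame count on this slide follows from 25 frames per second times 3,600 seconds. The short sketch below reproduces that arithmetic and shows how a fixed sampling step would reduce the workload; sampling one frame per second is purely an illustrative assumption, not a strategy stated in the slides.

```python
def frame_count(fps: int, minutes: int) -> int:
    """Number of frames in a video of the given length."""
    return fps * 60 * minutes


def sampled_frame_indices(total_frames: int, step: int) -> range:
    """Indices of every step-th frame, starting at frame 0."""
    return range(0, total_frames, step)


total = frame_count(fps=25, minutes=60)
per_second = sampled_frame_indices(total, step=25)  # hypothetical: one frame per second
print(total, len(per_second))
```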

Page 22: Mediaglobe - Medienanalyseverfahren


Page 23: Mediaglobe - Medienanalyseverfahren

Intelligent Character Recognition

‣ Analytical Textbox Filtering
  ‣ Horizontal & Vertical Projection Profile
  ‣ Stroke Width Analysis Based Verification

[Figure: frame with candidate textboxes and frame with verified textboxes]
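A projection profile counts the foreground pixels per row (horizontal profile) or per column (vertical profile) of a binary mask. The sketch below shows how such profiles can tighten a candidate textbox. The `tighten_box` helper and its `min_count` parameter are hypothetical and only illustrate the idea; the slides do not specify the actual filtering rule.

```python
from typing import List, Optional, Tuple

Mask = List[List[int]]  # binary rows, 1 = foreground (text candidate)


def horizontal_profile(mask: Mask) -> List[int]:
    """Foreground pixel count of each row."""
    return [sum(row) for row in mask]


def vertical_profile(mask: Mask) -> List[int]:
    """Foreground pixel count of each column."""
    return [sum(col) for col in zip(*mask)]


def tighten_box(mask: Mask, min_count: int = 1) -> Optional[Tuple[int, int, int, int]]:
    """Return (top, bottom, left, right) of rows/columns with enough foreground."""
    rows = [y for y, c in enumerate(horizontal_profile(mask)) if c >= min_count]
    cols = [x for x, c in enumerate(vertical_profile(mask)) if c >= min_count]
    if not rows or not cols:
        return None  # nothing in the box looks like text
    return rows[0], rows[-1], cols[0], cols[-1]


candidate = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
print(horizontal_profile(candidate), vertical_profile(candidate), tighten_box(candidate))
```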

Page 24: Mediaglobe - Medienanalyseverfahren

Intelligent Character Recognition

‣ Analytical Edge Based Character Identification

Page 25: Mediaglobe - Medienanalyseverfahren


Page 26: Mediaglobe - Medienanalyseverfahren


Page 27: Mediaglobe - Medienanalyseverfahren


Page 28: Mediaglobe - Medienanalyseverfahren


Page 29: Mediaglobe - Medienanalyseverfahren

Intelligent Character Recognition

‣ Character Binarization & Normalization

[Figure: original video frames, followed by textbox quality enhancement, followed by textbox normalization and binarization]

Page 30: Mediaglobe - Medienanalyseverfahren

Intelligent Character Recognition

‣ Standard Optical Character Recognition
  ‣ OCRopus 0.4.4 (Open Source, Apache License v2.0)
  ‣ Tesseract 3.01 (Open Source, Apache License v2.0)

[Figure: quality-enhanced, normalized textboxes]

Raw OCR results:
Ueutsche Bank
Weubrandenburg

Page 31: Mediaglobe - Medienanalyseverfahren

Intelligent Character Recognition

‣ OCR Post Processing
  ‣ OCR-adapted spell correction (hunspell 1.3.2, Open Source, GNU LGPL)
  ‣ Context-based spell correction (see context-based Named Entity Recognition, work package AP 4.1.5)

Raw OCR results:
Ueutsche Bank
Weubrandenburg

OCR results after spell correction (OCR-adapted hunspell):
Deutsche Bank
Neubrandenburg
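The effect of lexicon-based spell correction on the raw OCR results can be illustrated with Python's standard library. Note that this sketch uses `difflib.get_close_matches` as a stand-in and does not call hunspell; the tiny lexicon, the `cutoff` value, and the function name `correct_ocr` are assumptions made for the example.

```python
from difflib import get_close_matches
from typing import Sequence


def correct_ocr(text: str, lexicon: Sequence[str], cutoff: float = 0.8) -> str:
    """Replace each token by its closest lexicon entry, if one is similar enough."""
    corrected = []
    for token in text.split():
        matches = get_close_matches(token, lexicon, n=1, cutoff=cutoff)
        corrected.append(matches[0] if matches else token)
    return " ".join(corrected)


# Hypothetical mini lexicon; a real system would use a full dictionary.
lexicon = ["Deutsche", "Bank", "Neubrandenburg", "Rostock"]
print(correct_ocr("Ueutsche Bank", lexicon))
print(correct_ocr("Weubrandenburg", lexicon))
```

A generic string similarity treats all character substitutions alike, whereas an OCR-adapted corrector can weight typical recognition confusions (such as D/U or N/W in the examples) more heavily.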

Page 32: Mediaglobe - Medienanalyseverfahren
Page 33: Mediaglobe - Medienanalyseverfahren


Page 34: Mediaglobe - Medienanalyseverfahren


Page 35: Mediaglobe - Medienanalyseverfahren


Page 36: Mediaglobe - Medienanalyseverfahren

Visual Concept Detection

‣ Adaptation of the "bag of words" approach from text retrieval
  ‣ Dictionary / codeword vocabulary
  ‣ Sentences are represented as vectors over the dictionary

‣ Discretization of a single frame using the codewords
‣ Represent the frame as a histogram of the 4000 codeword frequencies

‣ Concept assignment by a machine learning method (here: Support Vector Machines)
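The discretization step can be sketched as follows: each local descriptor extracted from a frame is assigned to its nearest codeword, and the frame is represented by the resulting count histogram. The sketch assumes plain numeric descriptor vectors and squared Euclidean distance, and it uses a toy codebook of 3 entries instead of the 4000 codewords named on the slide. Descriptor extraction, codebook learning, and the SVM classification are not shown.

```python
from typing import List, Sequence

Vector = Sequence[float]


def nearest_codeword(descriptor: Vector, codebook: Sequence[Vector]) -> int:
    """Index of the codeword with the smallest squared Euclidean distance."""
    return min(
        range(len(codebook)),
        key=lambda j: sum((a - b) ** 2 for a, b in zip(descriptor, codebook[j])),
    )


def bow_histogram(descriptors: Sequence[Vector], codebook: Sequence[Vector]) -> List[int]:
    """Codeword frequency histogram of one frame."""
    hist = [0] * len(codebook)
    for d in descriptors:
        hist[nearest_codeword(d, codebook)] += 1
    return hist


# Toy example: 3 codewords and 4 local descriptors (all values made up).
codebook = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
descriptors = [(0.1, 0.0), (0.9, 1.2), (4.0, 6.0), (5.1, 5.0)]
print(bow_histogram(descriptors, codebook))
```

The histogram has a fixed length regardless of how many descriptors a frame yields, which is what makes it usable as an input vector for a classifier such as an SVM.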

Page 37: Mediaglobe - Medienanalyseverfahren
Page 38: Mediaglobe - Medienanalyseverfahren


Page 39: Mediaglobe - Medienanalyseverfahren


Page 40: Mediaglobe - Medienanalyseverfahren

Contact:
Dr. Harald Sack
Hasso-Plattner-Institut für Softwaresystemtechnik
Universität Potsdam
Prof.-Dr.-Helmert-Str. 2-3
D-14482 Potsdam

Homepage: http://www.hpi.uni-potsdam.de/meinel/team/sack.html

Blog: http://moresemantic.blogspot.com/

E-Mail: [email protected]

Twitter: @lysander07 / @biblionomicon / @yovisto