Understanding The Semantics of Media Chapter 8 Camilo A. Celis.


Page 1: Understanding The Semantics of Media Chapter 8 Camilo A. Celis.

Understanding The Semantics of Media

Chapter 8

Camilo A. Celis

Page 2: Understanding The Semantics of Media Chapter 8 Camilo A. Celis.

Questions

1. What kinds of applications does SVD have? How is it used in this paper?

2. What does MPESAR stand for? What does this system do?

3. How does MPESAR generally work?

Page 3: Understanding The Semantics of Media Chapter 8 Camilo A. Celis.

Contents

Understanding the problem

Analysis Tools

Segmenting Video

Semantic Retrieval

Page 4: Understanding The Semantics of Media Chapter 8 Camilo A. Celis.

Contents

Understanding the problem: Different Approaches, Segmentation Literature, Semantic Retrieval Literature

Analysis Tools

Segmenting Video

Semantic Retrieval

Page 5: Understanding The Semantics of Media Chapter 8 Camilo A. Celis.

Understanding the problem

Semantics (n.): the study of meaning; the study of linguistic development by classifying and examining changes in meaning and form.

Rapid growth of media: personal media, social media... Low price. Social pressure.

We do not understand media.

Page 6: Understanding The Semantics of Media Chapter 8 Camilo A. Celis.
Page 7: Understanding The Semantics of Media Chapter 8 Camilo A. Celis.

Different Approaches

Increasing number of methods to retrieve information from media

[Aner and Kender] Finds a background in a video shot, and then clusters shots into physical scenes by noting shots with a common background.

QBIC (IBM) Allows users to search for images based on the colors and visual content of an example image. Known as query by example.

Where is the semantics of the media?

"The most important information is in the WORDS!"

Page 8: Understanding The Semantics of Media Chapter 8 Camilo A. Celis.

Segmentation Literature

Extension of others' work. Latent semantic indexing (LSI): summarizes the semantic content of a document and measures similarities.

Visualization and segmentation algorithm based on wavelet analysis of text documents (time and frequency).

Scale-space ideas applied to the segmentation problem. Multi-dimensional signals.

Page 9: Understanding The Semantics of Media Chapter 8 Camilo A. Celis.

Semantic Retrieval Literature

Multimedia retrieval systems. (audio and video)

Mixture of probability experts for semantic-audio retrieval (MPESAR) is a sophisticated model connecting words and media.

- Considers the acoustic and semantic similarity of sounds, allowing users to retrieve sounds without searching on an exact word.

"MPESAR algorithm is appropriate for mapping one type of media to another."

Page 10: Understanding The Semantics of Media Chapter 8 Camilo A. Celis.
Page 11: Understanding The Semantics of Media Chapter 8 Camilo A. Celis.

Contents

Understanding the problem

Analysis Tools

Segmenting Video

Semantic Retrieval

Page 12: Understanding The Semantics of Media Chapter 8 Camilo A. Celis.

Contents

Understanding the problem

Analysis Tools: SVD Principles, Color Space, Word Space

Segmenting Video

Semantic Retrieval

Page 13: Understanding The Semantics of Media Chapter 8 Camilo A. Celis.

Analysis Tools

Common tools and mathematics used to analyze multimedia signals.

Two types of transformations, which reduce raw text and video signals into meaningful spaces.

Preprocessing the data

* Mapping from a one-dimensional signal (speech) into a multidimensional signal (video).

Page 14: Understanding The Semantics of Media Chapter 8 Camilo A. Celis.

Analysis Tools: SVD

SVD (singular value decomposition) principles: factorization of real or complex matrices.

Noise reduction.

Semantic and video data are expressed as vector-valued functions of time.

Collect data from an entire video and put the data into a matrix X. (Columns of X represent the signal at different times.)

Using SVD, rewrite the matrix X as the product of three matrices, X = U S V^T.
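
As a rough illustration of the factorization and its use for noise reduction, here is a minimal sketch using NumPy. The matrix sizes, the synthetic signal, and the helper name low_rank are assumptions made for the example, not values from the chapter.

```python
import numpy as np

# Hypothetical data: 6 features observed at 40 time steps (columns = time).
rng = np.random.default_rng(0)
clean = np.outer(rng.normal(size=6), np.sin(np.linspace(0, 6, 40)))  # rank-1 signal
X = clean + 0.05 * rng.normal(size=clean.shape)                       # signal plus noise

# Thin SVD: X = U @ diag(s) @ Vt, with singular values in descending order.
U, s, Vt = np.linalg.svd(X, full_matrices=False)

def low_rank(U, s, Vt, k):
    """Rebuild X from its k largest singular values; the rest are treated as noise."""
    return U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# Keep only the dominant component as a denoised version of X.
X_denoised = low_rank(U, s, Vt, k=1)

# Reduced representation: a k-dimensional point for every time step (column).
trajectory = np.diag(s[:1]) @ Vt[:1, :]
```

Keeping all singular values reproduces X exactly; keeping only the largest few gives the reduced, smoother representation that the later slides treat as a signal over time.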

Page 15: Understanding The Semantics of Media Chapter 8 Camilo A. Celis.

Analysis Tools: Color Space

Color changes are useful metrics for finding the boundary between shots.

Collect a histogram of the colors of each frame (512 histogram bins).

Convert the three RGB intensities (0-255) to a single histogram bin by taking the log base 2 of each intensity value.

Pack the three colors into a 9-bit number, using floor() to convert to an integer.
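
The binning steps above can be sketched as follows. The slides do not give the exact packing, so the 3-bits-per-channel layout (red in the high bits) and the clamping of a zero intensity to 1 are assumptions; the function names are hypothetical.

```python
def level(intensity):
    """Return floor(log2(intensity)) for 1..255, computed exactly via bit_length.

    Assumption: an intensity of 0 is clamped to 1, because log2(0) is undefined.
    """
    return max(int(intensity), 1).bit_length() - 1

def color_bin(r, g, b):
    """Pack the three 3-bit levels (each 0..7) into one 9-bit bin number (0..511)."""
    return 64 * level(r) + 8 * level(g) + level(b)

def frame_histogram(pixels):
    """Count how many pixels of a frame fall into each of the 512 bins.

    pixels is an iterable of (r, g, b) tuples with values in 0..255.
    """
    hist = [0] * 512
    for r, g, b in pixels:
        hist[color_bin(r, g, b)] += 1
    return hist
```

Each frame then becomes a 512-element vector, and stacking those vectors as columns gives a matrix in the form the SVD slide describes.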

Page 16: Understanding The Semantics of Media Chapter 8 Camilo A. Celis.

Analysis Tools: Word Space

Latent semantic indexing (LSI) uses an SVD in direct analogy to the color analysis.

Analyze the audio data by collecting a histogram of the words in a transcript of the video. Only one document to study.

Consider sentences of the document, which define a semantic space.

Issues? Synonymy and polysemy.

SVD captures both relationships.
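
A minimal sketch of the word-space idea, assuming NumPy and a made-up four-sentence transcript: build one word histogram per sentence, then use the SVD to place each sentence in a low-dimensional space. The sentences, the choice of k = 2, and the helper cosine are illustrative assumptions.

```python
import numpy as np

# Hypothetical transcript sentences (not taken from the video in the slides).
sentences = [
    "the car drove down the road",
    "the automobile drove down the street",
    "the bank raised interest rates",
    "the river bank flooded the road",
]

vocab = sorted({word for sent in sentences for word in sent.split()})
index = {word: i for i, word in enumerate(vocab)}

# Word histogram matrix: one row per word, one column per sentence.
X = np.zeros((len(vocab), len(sentences)))
for j, sent in enumerate(sentences):
    for word in sent.split():
        X[index[word], j] += 1

# Project every sentence onto the k strongest singular directions.
U, s_vals, Vt = np.linalg.svd(X, full_matrices=False)
k = 2
sentence_coords = (np.diag(s_vals[:k]) @ Vt[:k, :]).T  # one k-dim point per sentence

def cosine(a, b):
    """Cosine similarity between two sentence vectors in the reduced space."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
```

Comparing sentences with cosine in the reduced space, rather than by exact word overlap, is what lets co-occurring words such as "car" and "automobile" count as related.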

Page 17: Understanding The Semantics of Media Chapter 8 Camilo A. Celis.

Contents

Understanding the problem

Analysis Tools

Segmenting Video

Semantic Retrieval

Page 18: Understanding The Semantics of Media Chapter 8 Camilo A. Celis.

Contents

Understanding the problem

Analysis Tools

Segmenting Video: Temporal Properties, Video Segmentation Overview, Scale Space, Combined Image and Audio Data, Hierarchical Segmentation, Results

Semantic Retrieval

Page 19: Understanding The Semantics of Media Chapter 8 Camilo A. Celis.

Segmenting Video

Indexing by combining two major sources of data: images and words.

Describe the semantic path of a video's transcript as a signal, from the initial sentence to the conclusion.

Instead of trying to find similarities (segments), treat the audio-visual content as a signal and look for large changes in this signal.

Page 20: Understanding The Semantics of Media Chapter 8 Camilo A. Celis.

Scale Space

Used to find boundaries in a signal.

Analyze a signal with many different kernels that vary in the size of the temporal neighborhood included in the analysis at each point in time.

Look for changes in the signal over time. (Do so by calculating the derivative of the signal with respect to time.)
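
To make the idea concrete, here is a minimal sketch for a scalar signal: smooth with Gaussian kernels of several widths, differentiate, and report where the change is largest at each scale. The Gaussian kernel, the edge handling, and all function names are assumptions for illustration; the slides describe vector-valued signals and do not specify these details.

```python
import math

def gaussian_kernel(sigma):
    """Normalized Gaussian weights covering roughly +/- 3 sigma."""
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-0.5 * (i / sigma) ** 2) for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def smooth(signal, sigma):
    """Low-pass filter the signal; edges are handled by repeating the end samples."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    n = len(signal)
    out = []
    for t in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = min(max(t + k - radius, 0), n - 1)
            acc += w * signal[idx]
        out.append(acc)
    return out

def derivative(signal):
    """First difference, used as a discrete derivative with respect to time."""
    return [signal[t + 1] - signal[t] for t in range(len(signal) - 1)]

def scale_space_boundaries(signal, sigmas):
    """For each scale, return the index where the smoothed signal changes fastest."""
    result = {}
    for sigma in sigmas:
        d = derivative(smooth(signal, sigma))
        result[sigma] = max(range(len(d)), key=lambda t: abs(d[t]))
    return result

# Hypothetical signal with a single step change between samples 29 and 30.
signal = [0.0] * 30 + [1.0] * 30
boundaries = scale_space_boundaries(signal, sigmas=[1.0, 2.0, 4.0])
```

Small kernels respond to brief changes, while large kernels respond only to changes that persist, which is what makes a hierarchy of boundaries possible.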

Page 21: Understanding The Semantics of Media Chapter 8 Camilo A. Celis.

Overall

Form a hierarchical segmentation and compare it with other forms of segmentation.

A simple description of a video is possible by unifying the representations.

Combine two well-known techniques to find boundaries in a video: dimensionality reduction (SVD), which puts color and word data into the same format, and scale space.

Page 22: Understanding The Semantics of Media Chapter 8 Camilo A. Celis.

Combining color, words and scale space

The result is a 20-dimensional vector function of time and scale.

Page 23: Understanding The Semantics of Media Chapter 8 Camilo A. Celis.

Scale Space representations:

Page 24: Understanding The Semantics of Media Chapter 8 Camilo A. Celis.

Scale Space

Page 25: Understanding The Semantics of Media Chapter 8 Camilo A. Celis.

Results: Autocorrelation

Page 26: Understanding The Semantics of Media Chapter 8 Camilo A. Celis.

Results: Grouping correlation

Page 27: Understanding The Semantics of Media Chapter 8 Camilo A. Celis.

Results (cont.)

Representations of the semantic information in the HeadlineNews video in scale space.

The top image shows the cosine of the angular change of the semantic trajectory with different amounts of low-pass filtering.

The middle plot shows the peaks of the scale-space derivative.

The bottom plot shows the peaks traced back to their original starting point. These peaks represent topic boundaries.

Page 28: Understanding The Semantics of Media Chapter 8 Camilo A. Celis.

Results: Shot Boundary Segmentation

Page 29: Understanding The Semantics of Media Chapter 8 Camilo A. Celis.

Results:

Page 30: Understanding The Semantics of Media Chapter 8 Camilo A. Celis.

Segmentation in Perspective

New framework for combining multiple types of information from a video into a unified representation and for segmenting the video.

Described hierarchical segmentation.

(Unexpectedly) good amount of information in the color.

This method is also applicable to other types of information (musical key, audio emotion, etc.).

Page 31: Understanding The Semantics of Media Chapter 8 Camilo A. Celis.

Contents

Understanding the problem

Analysis Tools

Segmenting Video

Semantic Retrieval

Page 32: Understanding The Semantics of Media Chapter 8 Camilo A. Celis.

Contents

Understanding the problem

Analysis Tools

Segmenting Video

Semantic Retrieval: The Algorithm, Testing, Conclusions

Page 33: Understanding The Semantics of Media Chapter 8 Camilo A. Celis.

Semantic Retrieval: MPESAR

Connecting sounds to words and vice-versa. Queries with sounds and words

Learn about the connections between semantic space and acoustic space.

Page 34: Understanding The Semantics of Media Chapter 8 Camilo A. Celis.

Algorithm

Semantic Features

Uses the Porter stemmer to remove common suffixes from words, and deletes common words before further processing.
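
The text preprocessing step might look roughly like the sketch below. It is not the Porter algorithm: the suffix list is a crude stand-in for the real rule set, and the stop-word list is a hypothetical sample, both chosen only to show the shape of the pipeline.

```python
# Hypothetical stop-word list; a real system would use a much longer one.
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "on", "to", "is", "with"}

# Crude stand-in for suffix removal, NOT the Porter stemmer's rule set.
SUFFIXES = ("ing", "ed", "es", "s")

def crude_stem(word):
    """Strip the first matching suffix, keeping at least three letters of the stem."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def preprocess(description):
    """Lowercase, strip punctuation, drop common words, and stem what remains."""
    words = [w.strip(".,;:!?").lower() for w in description.split()]
    return [crude_stem(w) for w in words if w and w not in STOP_WORDS]

terms = preprocess("The dogs barking in the rain")
```

The point of both steps is to map different surface forms of a description onto the same small set of terms before they are placed in the semantic space.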

Partition the space into overlapping clusters of regions.

Acoustic Features

Signal-processing and machine-learning calculations endeavor to capture the sound.

MFCC (mel-frequency cepstral coefficients): used to analyze speech sounds and to reduce the audio signal to a compact representation.

GMM (Gaussian mixture model) captures the long-term characteristics of each sound.

Page 35: Understanding The Semantics of Media Chapter 8 Camilo A. Celis.

Semantic Retrieval

Acoustic signal processing chain

Building MPESAR models

Page 36: Understanding The Semantics of Media Chapter 8 Camilo A. Celis.

Testing

Audio to semantic testing procedure.

Page 37: Understanding The Semantics of Media Chapter 8 Camilo A. Celis.

Retrieval Results

Histogram of true label ranks based on likelihood from audio-to-semantic test.

Histogram of true label ranks based on likelihood from semantic-to-audio test.
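
One way to read these histograms: for each test item, sort the candidate labels by likelihood and record the position of the correct one. The sketch below assumes each trial is a dictionary of label likelihoods plus the true label; the labels and numbers are invented for illustration and are not results from the chapter.

```python
from collections import Counter

def true_label_rank(likelihoods, true_label):
    """Rank (1 = best) of the true label when labels are sorted by descending likelihood."""
    ordered = sorted(likelihoods, key=likelihoods.get, reverse=True)
    return ordered.index(true_label) + 1

def rank_histogram(trials):
    """Count how often the true label lands at each rank across all trials."""
    return Counter(true_label_rank(scores, truth) for scores, truth in trials)

# Hypothetical test trials: (likelihood per label, true label).
trials = [
    ({"dog": 0.7, "rain": 0.2, "car": 0.1}, "dog"),
    ({"dog": 0.3, "rain": 0.6, "car": 0.1}, "dog"),
    ({"dog": 0.1, "rain": 0.2, "car": 0.7}, "car"),
]
histogram = rank_histogram(trials)
```

A histogram concentrated at rank 1 means the system usually puts the correct label first; a long tail means the correct label is often buried among the alternatives.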

Page 38: Understanding The Semantics of Media Chapter 8 Camilo A. Celis.

Questions

1. What kinds of applications does SVD have? How is it used?

SVD also has applications in digital signal processing, e.g., as a method for noise reduction. Here it summarizes different kinds of video data and combines the results into a common representation.

2. What does MPESAR stand for? What does this system do?

(Mixture of Probability Experts for Semantic-Audio Retrieval) Learns the connections between a semantic space and an acoustic space.

- Ex.: Given a description in words, the system finds the audio signal that best fits it.

3. How does MPESAR generally work?

The semantic space maps words into a high-dimensional probabilistic space. The acoustic space describes sounds by a multidimensional vector. A many-to-many connection.

Page 39: Understanding The Semantics of Media Chapter 8 Camilo A. Celis.

Thank you

Page 41: Understanding The Semantics of Media Chapter 8 Camilo A. Celis.

Q&A