Video Shot Detection

CIS 581 Course Project
Heshan Lin


Page 1: Video Shot Detection

Video Shot Detection

CIS 581 Course Project
Heshan Lin

Page 2: Video Shot Detection

Agenda
- What's shot detection?
- Classification of shot detection
- A close look at hard cut detection
- Experiments and results

Page 3: Video Shot Detection

What's Shot Detection

Problem definition
- Shot detection: given a video V consisting of n shots, find the beginning and end of each shot.
- Also known as shot boundary detection or transition detection.
- It is fundamental to any kind of video analysis and video application, since it enables segmentation of a video into its basic components: the shots.

Page 4: Video Shot Detection

Classification
- Hard cuts: A cut is an instantaneous transition from one scene to the next. There are no transitional frames between 2 shots.
- Fades: A fade is a gradual transition between a scene and a constant image (fade-out) or between a constant image and a scene (fade-in).

Page 5: Video Shot Detection

Fades

During a fade, images have their intensities multiplied by some value α. During a fade-in, α increases from 0 to 1, while during a fade-out α decreases from 1 to 0.
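The intensity scaling described above can be sketched in a few lines of Python. This is only an illustration: a frame is assumed to be a flat list of grayscale intensities, α is assumed to change linearly over `steps` frames, and the function names are hypothetical.

```python
def fade(frame, alpha):
    """Multiply every intensity in the frame by alpha."""
    return [alpha * v for v in frame]

def fade_in(frame, steps):
    """Frames of a fade-in: alpha rises from 0 to 1 (linear ramp is an assumption)."""
    return [fade(frame, k / steps) for k in range(steps + 1)]

def fade_out(frame, steps):
    """Frames of a fade-out: alpha falls from 1 to 0."""
    return fade_in(frame, steps)[::-1]

sequence = fade_in([100, 200], 4)
# alpha takes the values 0, 0.25, 0.5, 0.75, 1
```

Because α = 0 maps every pixel to intensity 0, the constant image in this sketch is a black frame.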

Page 6: Video Shot Detection

Classification
- Hard cuts: A cut is an instantaneous transition from one scene to the next.
- Fades: A fade is a gradual transition between a scene and a constant image (fade-out) or between a constant image and a scene (fade-in).
- Dissolves: A dissolve is a gradual transition from one scene to another, in which the first scene fades out and the second scene fades in.

Page 7: Video Shot Detection

Dissolves

Combination of fade-in and fade-out.
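One way to picture this combination is as a weighted sum of the two scenes, with the outgoing scene scaled by 1 - α and the incoming scene scaled by α. The sketch below assumes each scene is a single still frame given as a flat list of intensities and that α moves linearly; both are simplifications, and the function name is hypothetical.

```python
def dissolve(scene_a, scene_b, steps):
    """Blend scene_a (fading out) with scene_b (fading in) over steps + 1 frames."""
    frames = []
    for k in range(steps + 1):
        alpha = k / steps  # linear ramp from 0 to 1 (an assumption)
        frames.append([(1 - alpha) * a + alpha * b
                       for a, b in zip(scene_a, scene_b)])
    return frames

transition = dissolve([100, 0], [0, 200], 2)
# first frame equals scene_a, last frame equals scene_b
```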

Page 8: Video Shot Detection

Classification
- Hard cuts: A cut is an instantaneous transition from one scene to the next.
- Fades: A fade is a gradual transition between a scene and a constant image (fade-out) or between a constant image and a scene (fade-in).
- Dissolves: A dissolve is a gradual transition from one scene to another, in which the first scene fades out and the second scene fades in.
- Wipe: Another common scene break is a wipe, in which a line moves across the screen, with the new scene appearing behind the line.

Page 9: Video Shot Detection

Schema of Cut Detection
- Calculate a time series of discontinuity feature values f(n) for each frame. Suppose we use a function d(x,y) to measure the dissimilarity between frames x and y. The discontinuity feature value for frame n is f(n) = d(n-1, n).
- Pick the cut positions from f(n) using some thresholding technique.
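The first step of this schema can be sketched as follows. The dissimilarity function `mean_abs_diff` is only a stand-in chosen for brevity (it is not one of the features used in this project), and frames are assumed to be flat lists of intensities.

```python
def mean_abs_diff(x, y):
    """Illustrative d(x, y): mean absolute intensity difference (an assumption)."""
    return sum(abs(a - b) for a, b in zip(x, y)) / len(x)

def discontinuity_series(frames, d):
    """Return {n: f(n)} with f(n) = d(frame n-1, frame n) for n = 1 .. len(frames) - 1."""
    return {n: d(frames[n - 1], frames[n]) for n in range(1, len(frames))}

# Frames 0-1 belong to one shot, frames 2-3 to another.
frames = [[10, 10, 10], [10, 11, 10], [200, 190, 210], [201, 190, 210]]
f = discontinuity_series(frames, mean_abs_diff)
peak = max(f, key=f.get)  # frame index with the largest discontinuity
```

Any function of two frames can be passed as `d`, so the same loop works for the histogram and edge-based features.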

Page 10: Video Shot Detection

Example

Page 11: Video Shot Detection

Features to Measure Dissimilarity

Intensity/color histogram

$$d(H(f), H(g)) = \sum_{i=0}^{255} \left( H(f)(i) - H(g)(i) \right)^2$$
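As a rough illustration of the histogram distance, the sketch below assumes 8-bit grayscale frames stored as flat lists of integers in 0..255, reads H(f)(i) as the number of pixels of frame f with intensity i, and takes the sum of squared bin differences. The function names are hypothetical.

```python
def histogram(frame, bins=256):
    """Count how many pixels fall on each intensity level 0 .. bins - 1."""
    h = [0] * bins
    for v in frame:
        h[v] += 1
    return h

def hist_distance(f, g):
    """Sum of squared differences between the two intensity histograms."""
    return sum((a - b) ** 2 for a, b in zip(histogram(f), histogram(g)))

d_small = hist_distance([0, 0, 255, 255], [0, 255, 255, 255])  # bins 0 and 255 each differ by 1
d_moved = hist_distance([0, 255], [255, 0])  # same pixels, different positions
```

The second call shows a known property of histogram features: they ignore where pixels are, so rearranging the same intensities gives a distance of zero.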

Edges/contours: Based on edge change ratio (ECR). Let $\sigma_n$ be the number of edge pixels in frame n, and $X_n^{in}$ and $X_{n-1}^{out}$ the number of entering and exiting edge pixels in frames n and n-1, respectively. The edge change ratio $ECR_n$ between frames n-1 and n is defined as:

$$ECR_n = \max\left( \frac{X_n^{in}}{\sigma_n},\ \frac{X_{n-1}^{out}}{\sigma_{n-1}} \right)$$

Page 12: Video Shot Detection

Edges/contours (cont.)

How to define the entering and exiting edge pixels $X_n^{in}$ and $X_{n-1}^{out}$?

Suppose we have 2 binary edge images $e_{n-1}$ and $e_n$. The entering edge pixels $X_n^{in}$ are the edge pixels in $e_n$ which are more than a fixed distance r from the closest edge pixel in $e_{n-1}$. Similarly, the exiting edge pixels $X_{n-1}^{out}$ are the edge pixels in $e_{n-1}$ which are farther than r away from the closest edge pixel in $e_n$.

[Figure: edge images E_{n-1} and E_n, and E_n superimposed on E_{n-1}, with one edge labeled "not entering edge" and one labeled "entering edge".]
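The definitions above can be checked on tiny binary images with a brute-force sketch. It assumes edge images are nested lists of 0/1 values and measures distance with the Chebyshev (chessboard) metric, which is what dilation with a square structuring element corresponds to; the metric and the function names are assumptions made for illustration.

```python
def edge_pixels(img):
    """Set of (row, col) positions where the binary edge image is non-zero."""
    return {(r, c) for r, row in enumerate(img) for c, v in enumerate(row) if v}

def count_far(src, ref, r):
    """Number of pixels in src farther than r (Chebyshev) from every pixel in ref."""
    return sum(1 for (y, x) in src
               if all(max(abs(y - v), abs(x - u)) > r for (v, u) in ref))

def edge_change_ratio(e_prev, e_cur, r=1):
    prev, cur = edge_pixels(e_prev), edge_pixels(e_cur)
    ecr_in = count_far(cur, prev, r) / len(cur) if cur else 0.0     # entering edges
    ecr_out = count_far(prev, cur, r) / len(prev) if prev else 0.0  # exiting edges
    return max(ecr_in, ecr_out)

def column(col, rows=4, cols=6):
    """Helper: image with a vertical edge in the given column."""
    return [[1 if c == col else 0 for c in range(cols)] for _ in range(rows)]

ecr_shift = edge_change_ratio(column(0), column(1))  # edge moved by 1 pixel: within r
ecr_jump = edge_change_ratio(column(0), column(5))   # edge far away: every pixel enters or exits
```

A one-pixel shift leaves the ratio at zero, which is the point of the distance r: small object or camera motion should not register as a shot change.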

Page 13: Video Shot Detection

imd1 = rgb2gray(im1);
imd2 = rgb2gray(im2);

% edge maps: white edge pixels on a black background
bw1 = edge(imd1, 'sobel');
bw2 = edge(imd2, 'sobel');

% number of edge pixels in each frame
s1 = nnz(bw1);
s2 = nnz(bw2);

% dilate, so that every pixel within distance r of an edge is marked
se = strel('square', 3);
dbw1 = imdilate(bw1, se);
dbw2 = imdilate(bw2, se);

% entering: edge pixels of frame 2 that are far from every edge of frame 1
imIn = bw2 & ~dbw1;
% exiting: edge pixels of frame 1 that are far from every edge of frame 2
imOut = bw1 & ~dbw2;

ECRIn = nnz(imIn) / s2;
ECROut = nnz(imOut) / s1;
ECR = max(ECRIn, ECROut);

We can set the distance r by specifying the dilation parameter (the size of the structuring element).

Page 14: Video Shot Detection

Thresholding

Global threshold
A hard cut is declared each time the discontinuity value f(n) surpasses a global threshold.

Adaptive threshold
A hard cut is detected based on the difference of the current feature value f(n) from its local neighborhood. Generally this kind of method has 2 criteria for a hard cut declaration:
- f(n) takes the maximum value inside the neighborhood.
- The difference between f(n) and its neighbors' feature values is bigger than a given threshold.
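The global rule is the simpler of the two and can be sketched in one line. The feature values and thresholds below are made-up numbers, used only to show how strongly the result depends on the single threshold.

```python
def global_threshold_cuts(f, threshold):
    """Indices n where the discontinuity value f[n] exceeds one fixed threshold."""
    return [n for n, value in enumerate(f) if value > threshold]

f = [0.1, 0.2, 5.0, 0.3, 4.0]  # hypothetical discontinuity values
loose = global_threshold_cuts(f, 1.0)
strict = global_threshold_cuts(f, 4.5)
```

With the threshold at 1.0 both peaks are reported; at 4.5 the second peak is missed. This sensitivity is a common motivation for comparing f(n) against its local neighborhood instead.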

Page 15: Video Shot Detection

Experiments
- Input: Mr. Bean movie (80*112, 2363 frames)
- Dissimilarity function
  - Intensity histogram
  - Edge change ratio (ECR)
- Thresholding
  - Adaptive threshold based on a statistical model.

Page 16: Video Shot Detection

Thresholding

Use a sliding window of size 2w+1. The middle frame in the window is detected as a cut if:
- Its feature value is the maximum in the window.
- Its feature value is greater than

$$\max\left( \mu_{left} + T_d \sigma_{left},\ \mu_{right} + T_d \sigma_{right} \right)$$

where $T_d$ is a parameter given a value of 5 in this experiment. Here $\mu$ and $\sigma$ presumably denote the mean and standard deviation of the feature values to the left and right of the middle frame; the symbols were lost from the transcript.
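A minimal sketch of this sliding-window test follows. It assumes the threshold is the larger of mean + T_d * standard deviation computed separately over the w values to the left and the w values to the right of the middle frame, uses the population standard deviation, and skips the first and last w frames; these details and the function name are assumptions, not taken from the slides.

```python
from statistics import mean, pstdev

def adaptive_cuts(f, w, t_d=5.0):
    """Indices whose feature value passes both sliding-window criteria."""
    cuts = []
    for n in range(w, len(f) - w):
        left, right = f[n - w:n], f[n + 1:n + w + 1]
        if f[n] < max(left + right):
            continue  # criterion 1 fails: not the maximum in the window
        threshold = max(mean(left) + t_d * pstdev(left),
                        mean(right) + t_d * pstdev(right))
        if f[n] > threshold:  # criterion 2
            cuts.append(n)
    return cuts

f = [1.0, 1.2, 0.9, 1.1, 1.0, 20.0, 1.0, 1.1, 0.9, 1.2, 1.0]  # made-up values
cuts = adaptive_cuts(f, w=3)
```

In this made-up series only the isolated peak at index 5 satisfies both criteria, while a completely flat series yields no cuts because no value exceeds its own neighborhood mean.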

Page 17: Video Shot Detection

The statistical model is based on the following assumption: the dissimilarity feature values f(n) come from two distributions, one for shot boundaries (S) and one for "not-a-shot-boundary" (N). In general, S has a considerably larger mean and standard deviation than N.

[Figure: the threshold marked on the distribution plot.]

Page 18: Video Shot Detection

Results

Intensity histogram dissimilarity + adaptive thresholding

Page 19: Video Shot Detection

Results (cont.)

ECR dissimilarity + adaptive thresholding

Page 20: Video Shot Detection

Compare

We compare the cut positions detected by these 2 methods in the following table. From the results we can see that the cuts detected by these 2 methods are fairly consistent: frames 998, 1167, 2081 and 2184 are detected by both.

Frame#               Cut1  Cut2  Cut3  Cut4  Cut5  Cut6  Cut7
Intensity histogram  998   1167  1292  1359  2081  2184
ECR                  86    998   1167  2081  2129  2184  2312

Page 21: Video Shot Detection

Cut detected in frame 998

Page 22: Video Shot Detection

Comments