Video Shot Detection
CIS 581 Course Project
Heshan Lin
Agenda
- What’s shot detection?
- Classification of shot detection
- A close look at hard cut detection
- Experiments and results
What’s Shot Detection
Problem definition
- Shot detection: given a video V consisting of n shots, find the beginning and end of each shot.
Also known as shot boundary detection or transition detection.
It is fundamental to any kind of video analysis and video application, since it enables segmentation of a video into its basic components: the shots.
Classification
- Hard cuts: A cut is an instantaneous transition from one scene to the next. There are no transitional frames between the two shots.
- Fades: A fade is a gradual transition between a scene and a constant image (fade-out) or between a constant image and a scene (fade-in).
Fades
During a fade, images have their intensities multiplied by some value α. During a fade-in, α increases from 0 to 1, while during a fade-out α decreases from 1 to 0.
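The fade model above can be illustrated with a minimal Python sketch. The helper names (`scale_frame`, `fade_in`, `fade_out`), the representation of a frame as a list of rows of grayscale values, and the linear schedule for α are all assumptions made for illustration; the slides only state that α moves between 0 and 1.

```python
def scale_frame(frame, alpha):
    """Multiply every intensity of a grayscale frame (list of rows) by alpha."""
    return [[round(pixel * alpha) for pixel in row] for row in frame]


def fade_in(frame, steps):
    """Return steps + 1 frames with alpha rising from 0 to 1 (linear: an assumption)."""
    return [scale_frame(frame, k / steps) for k in range(steps + 1)]


def fade_out(frame, steps):
    """Return steps + 1 frames with alpha falling from 1 to 0."""
    return fade_in(frame, steps)[::-1]


frame = [[100, 200], [50, 0]]
frames = fade_in(frame, 4)
# frames[0] is all black (alpha = 0) and frames[4] equals the original (alpha = 1).
```

Here the constant image is black, which is the special case the multiplicative model describes.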
Classification (cont.)
- Dissolves: A dissolve is a gradual transition from one scene to another, in which the first scene fades out and the second scene fades in.
Dissolves
Combination of fade-in and fade-out.
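One way to picture a dissolve as a combined fade-out and fade-in is the Python sketch below. The function name `dissolve`, the list-of-rows frame format, and the choice of complementary linear weights (1 - α for the outgoing scene, α for the incoming one) are assumptions for illustration, not something the slides specify.

```python
def dissolve(scene_a, scene_b, steps):
    """Blend scene_a into scene_b over steps + 1 frames.

    scene_a is weighted by (1 - alpha) (fading out) and scene_b by alpha
    (fading in); the linear schedule is an assumption.
    """
    frames = []
    for k in range(steps + 1):
        alpha = k / steps
        frames.append([
            [round((1 - alpha) * a + alpha * b) for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(scene_a, scene_b)
        ])
    return frames


blend = dissolve([[100, 0]], [[0, 200]], 4)
# blend[0] is the first scene, blend[4] the second, blend[2] an even mix.
```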
Classification (cont.)
- Wipes: Another common scene break is a wipe, in which a line moves across the screen, with the new scene appearing behind the line.
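A wipe can be sketched in Python in the same spirit. The function name `wipe` is hypothetical, and the sketch assumes a vertical line sweeping from left to right at constant speed; real wipes can move in any direction.

```python
def wipe(scene_a, scene_b, steps):
    """Reveal scene_b behind a vertical line moving left to right (an assumption)."""
    width = len(scene_a[0])
    frames = []
    for k in range(steps + 1):
        boundary = round(width * k / steps)  # column index of the moving line
        frames.append([
            row_b[:boundary] + row_a[boundary:]
            for row_a, row_b in zip(scene_a, scene_b)
        ])
    return frames


wiped = wipe([[1, 1, 1, 1]], [[9, 9, 9, 9]], 4)
# wiped[1] == [[9, 1, 1, 1]]: the new scene occupies everything left of the line.
```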
Schema of Cut Detection
- Calculate a time series of discontinuity feature values f(n), one for each frame. Suppose we use a function d(x, y) to measure the dissimilarity between frames x and y. The discontinuity feature value for frame n is f(n) = d(n-1, n).
- Pick the cut positions from f(n) using a thresholding technique.
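The first step of this schema can be sketched in Python as follows. `discontinuity_series` accepts any dissimilarity function d; the `mean_abs_diff` function used here is only a placeholder chosen for the example, not one of the features discussed in the slides.

```python
def discontinuity_series(frames, d):
    """Return {n: f(n)} with f(n) = d(frames[n - 1], frames[n]) for n >= 1."""
    return {n: d(frames[n - 1], frames[n]) for n in range(1, len(frames))}


def mean_abs_diff(x, y):
    """Placeholder dissimilarity: mean absolute pixel difference (an assumption)."""
    flat_x = [p for row in x for p in row]
    flat_y = [p for row in y for p in row]
    return sum(abs(a - b) for a, b in zip(flat_x, flat_y)) / len(flat_x)


# Four tiny 1x2 frames; the content changes abruptly between frames 1 and 2.
frames = [[[10, 10]], [[10, 12]], [[200, 210]], [[201, 210]]]
f = discontinuity_series(frames, mean_abs_diff)
# f == {1: 1.0, 2: 194.0, 3: 0.5}; the peak at n = 2 marks the candidate cut.
```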
Example
Features to Measure Dissimilarity
Intensity/color histogram
$$d(H(f), H(g)) = \sum_{i=0}^{255} \big(H(f)(i) - H(g)(i)\big)^2$$
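The histogram dissimilarity can be sketched in Python as below, assuming 8-bit grayscale frames stored as lists of rows. The function names `intensity_histogram` and `histogram_distance` are illustrative.

```python
def intensity_histogram(frame, bins=256):
    """Count how many pixels of an 8-bit grayscale frame take each intensity."""
    hist = [0] * bins
    for row in frame:
        for pixel in row:
            hist[pixel] += 1
    return hist


def histogram_distance(frame_f, frame_g):
    """Sum of squared bin-wise differences between the two frame histograms."""
    h_f = intensity_histogram(frame_f)
    h_g = intensity_histogram(frame_g)
    return sum((a - b) ** 2 for a, b in zip(h_f, h_g))


f_frame = [[0, 0], [255, 255]]
g_frame = [[0, 255], [255, 255]]
dist = histogram_distance(f_frame, g_frame)          # bins 0 and 255 each differ by 1
same = histogram_distance(f_frame, [[255, 0], [0, 255]])  # same pixels, rearranged
```

In this example `dist` is 2 and `same` is 0: the histogram ignores where pixels are located, so rearranging the same intensities leaves the distance at zero.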
Edges/contours: Based on the edge change ratio (ECR). Let $\sigma_n$ be the number of edge pixels in frame n, and $X_n^{in}$ and $X_{n-1}^{out}$ the number of entering and exiting edge pixels in frames n and n-1, respectively. The edge change ratio $ECR_n$ between frames n-1 and n is defined as:
$$ECR_n = \max\left(X_n^{in}/\sigma_n,\; X_{n-1}^{out}/\sigma_{n-1}\right)$$
Edges/contours (cont.)
How do we define the entering and exiting edge pixels $X_n^{in}$ and $X_{n-1}^{out}$?
Suppose we have two binary edge images $e_{n-1}$ and $e_n$. The entering edge pixels $X_n^{in}$ are the edge pixels in $e_n$ that are more than a fixed distance r from the closest edge pixel in $e_{n-1}$. Similarly, the exiting edge pixels $X_{n-1}^{out}$ are the edge pixels in $e_{n-1}$ that are farther than r from the closest edge pixel in $e_n$. Dividing these counts by the number of edge pixels in the respective frame gives the two fractions compared in the ECR.
[Figure: edge images E_{n-1} and E_n, and E_n superimposed on E_{n-1}, with edge pixels labeled "entering edge" or "not entering edge".]
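% convert both frames to grayscale
imd1 = rgb2gray(im1);
imd2 = rgb2gray(im2);

% binary edge maps (edge pixels are 1 on a black background)
bw1 = edge(imd1, 'sobel');
bw2 = edge(imd2, 'sobel');

% number of edge pixels in each frame
s1 = nnz(bw1);
s2 = nnz(bw2);

% dilate: every pixel within distance r = 1 of an edge pixel becomes 1
se = strel('square', 3);
dbw1 = imdilate(bw1, se);
dbw2 = imdilate(bw2, se);

% entering edges: edge pixels of frame 2 farther than r from any edge of frame 1
imIn = bw2 & ~dbw1;
% exiting edges: edge pixels of frame 1 farther than r from any edge of frame 2
imOut = bw1 & ~dbw2;

ECRIn = nnz(imIn) / s2;
ECROut = nnz(imOut) / s1;
ECR = max(ECRIn, ECROut);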
We can set the distance r by specifying the size of the structuring element used for dilation (a 3x3 square corresponds to r = 1 pixel).
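To make the role of r explicit, here is a dependency-free Python sketch of the ECR computed directly from two binary edge maps (lists of rows of 0/1). The function names are hypothetical. The square neighborhood of side 2r + 1 mirrors a square structuring element. The value returned when a frame has no edge pixels is an assumption, since the slides do not cover that case.

```python
def dilate(edges, r):
    """Mark every pixel within a (2r + 1) x (2r + 1) square of an edge pixel."""
    h, w = len(edges), len(edges[0])
    out = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if edges[y][x]:
                for yy in range(max(0, y - r), min(h, y + r + 1)):
                    for xx in range(max(0, x - r), min(w, x + r + 1)):
                        out[yy][xx] = True
    return out


def count_far(edges, other_dilated):
    """Count edge pixels that are not covered by the other frame's dilated edges."""
    return sum(
        1
        for row_e, row_d in zip(edges, other_dilated)
        for e, d in zip(row_e, row_d)
        if e and not d
    )


def edge_change_ratio(e_prev, e_cur, r=1):
    """ECR = max(X_in / sigma_cur, X_out / sigma_prev)."""
    sigma_prev = sum(map(sum, e_prev))
    sigma_cur = sum(map(sum, e_cur))
    if sigma_prev == 0 or sigma_cur == 0:
        # Assumption: no change if both frames lack edges, full change otherwise.
        return 0.0 if sigma_prev == sigma_cur else 1.0
    x_in = count_far(e_cur, dilate(e_prev, r))    # entering edge pixels
    x_out = count_far(e_prev, dilate(e_cur, r))   # exiting edge pixels
    return max(x_in / sigma_cur, x_out / sigma_prev)


e_prev = [[1, 1, 0, 0, 0, 0]]
e_cur = [[0, 1, 1, 0, 0, 1]]
ecr = edge_change_ratio(e_prev, e_cur, r=1)
```

In this example only the last edge pixel of `e_cur` is farther than r = 1 from every edge pixel of `e_prev`, so one of three edge pixels is entering and none is exiting, giving an ECR of 1/3.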
Thresholding
- Global threshold: A hard cut is declared each time the discontinuity value f(n) surpasses a global threshold.
- Adaptive threshold: A hard cut is detected based on the difference between the current feature value f(n) and the values in its local neighborhood. Generally this kind of method has two criteria for declaring a hard cut:
  - f(n) takes the maximum value inside the neighborhood.
  - The difference between f(n) and its neighbors’ feature values is bigger than a given threshold.
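The global-threshold rule is the simpler of the two and can be sketched in a few lines of Python. The function name `global_threshold_cuts`, the feature values, and the threshold of 0.5 are made up for illustration.

```python
def global_threshold_cuts(f, threshold):
    """Return every frame n whose discontinuity value f[n] exceeds the threshold."""
    return [n for n, value in sorted(f.items()) if value > threshold]


# Hypothetical feature values: frames 4 and 5 are both high, e.g. during fast motion.
f_values = {1: 0.1, 2: 0.9, 3: 0.2, 4: 0.85, 5: 0.8}
global_cuts = global_threshold_cuts(f_values, 0.5)
# global_cuts == [2, 4, 5]
```

With a single global threshold, adjacent frames 4 and 5 are both reported. Under the local-maximum criterion of the adaptive approach, frame 5 could not be a cut, because its neighbor frame 4 has a higher value.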
Experiments
- Input: Mr. Bean movie (80×112, 2363 frames)
- Dissimilarity function
  - Intensity histogram
  - Edge change ratio (ECR)
- Thresholding
  - Adaptive threshold based on a statistical model.
Thresholding
Use a sliding window of size 2w+1. The middle frame in the window is detected as a cut if:
- Its feature value is the maximum in the window.
- Its feature value is greater than
$$\max\left(\mu_{left} + T_d\,\sigma_{left},\; \mu_{right} + T_d\,\sigma_{right}\right)$$
where $\mu_{left}$, $\sigma_{left}$ and $\mu_{right}$, $\sigma_{right}$ are the mean and standard deviation of the feature values to the left and to the right of the middle frame, and $T_d$ is a parameter set to 5 in this experiment.
The statistical model is based on the following assumption: the dissimilarity feature value f(n) of a frame comes from one of two distributions, one for shot boundaries (S) and one for “not-a-shot-boundary” (N). In general, S has a considerably larger mean and standard deviation than N.
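The sliding-window rule can be sketched in Python as below. This is a minimal reading of the slide, not the project's actual code. The function name `adaptive_cuts` is hypothetical, and the sketch assumes that the left and right statistics are taken over the w values on each side of the middle frame and that the population standard deviation is used. The test data are invented.

```python
from statistics import mean, pstdev


def adaptive_cuts(f, w, t_d=5.0):
    """Return indices n where f[n] is a cut under the sliding-window rule.

    f is a list of feature values; the window around n is f[n - w : n + w + 1].
    """
    cuts = []
    for n in range(w, len(f) - w):
        left = f[n - w:n]
        right = f[n + 1:n + w + 1]
        # Criterion 1: the middle value must be the maximum in the window.
        if f[n] < max(left + right):
            continue
        # Criterion 2: it must exceed max(mu + T_d * sigma) over both sides.
        threshold = max(
            mean(left) + t_d * pstdev(left),
            mean(right) + t_d * pstdev(right),
        )
        if f[n] > threshold:
            cuts.append(n)
    return cuts


series = [1.0, 1.2, 0.8, 1.0, 20.0, 1.1, 0.9, 1.0, 1.2]
detected = adaptive_cuts(series, w=3)
# detected == [4]: only the isolated peak passes both criteria.
```

Because the threshold is derived from the neighborhood itself, the same value of T_d adapts to quiet and busy parts of the video, which a single global threshold cannot do.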
Results
Intensity histogram dissimilarity + adaptive thresholding
Results (cont.)
ECR dissimilarity + adaptive thresholding
Compare
We compare the cut positions detected by these two methods in the following table. From the results we can see that the cuts detected by the two methods are fairly consistent: both find cuts at frames 998, 1167, 2081 and 2184, while the remaining cuts are each reported by only one method.

Frame#                Cut1   Cut2   Cut3   Cut4   Cut5   Cut6   Cut7
Intensity histogram    998   1167   1292   1359   2081   2184
ECR                     86    998   1167   2081   2129   2184   2312
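The agreement between the two rows of the table can be checked with a short Python sketch. The frame numbers are copied from the table; the variable names are illustrative.

```python
histogram_cuts = [998, 1167, 1292, 1359, 2081, 2184]
ecr_cuts = [86, 998, 1167, 2081, 2129, 2184, 2312]

common = sorted(set(histogram_cuts) & set(ecr_cuts))
only_histogram = sorted(set(histogram_cuts) - set(ecr_cuts))
only_ecr = sorted(set(ecr_cuts) - set(histogram_cuts))

print("both methods:  ", common)          # [998, 1167, 2081, 2184]
print("histogram only:", only_histogram)  # [1292, 1359]
print("ECR only:      ", only_ecr)        # [86, 2129, 2312]
```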
Cut detected in frame 998
Comments