Video Dissolve and Wipe Detection via Spatio-Temporal Images of Chromatic Histogram Differences...

Video Dissolve and Wipe Detection via Spatio-Temporal Images of Chromatic Histogram Differences
Presentation by Kenton Anderson, CMPT 820, March 3rd, 2005
Mark S. Drew, Ze-Nian Li, and Xiang Zhong, International Conference on Image Processing (ICIP'00), Vancouver, pp. III 929-932, Sept. 2000


Overview

• Background
• Introduction
• Wipe and Cut detection
• Dissolve detection
• Conclusions

Background

• What is a shot?
  • An uninterrupted segment of video time
  • The boundary between two shots is a camera break
• Three major types of transitions:
  • Cut (instant)
  • Wipe (gradual)
  • Dissolve (gradual)

Gradual Transitions

• Wipe transition
  • A moving boundary line between two shots crosses the screen such that one shot replaces the other
• Dissolve transition
  • One shot blends smoothly into a second shot

Introduction

• Content-Based Image/Video Search, Retrieval, and Segmentation
• Why detect video transitions?
  • Segmentation is an important basic step!
  • Before scenes can be searched for content, the locations of the scenes have to be determined

Introduction

• This paper presents methods to detect gradual transitions using 2D chromaticity histogram metrics:
  • Histogram intersection for wipes
  • Colour-distance histogram metric for dissolves
• Both generate potential indicators of shot transitions, with good results

Wipe and Cut Detection

• In a wipe, a boundary line crosses the first shot, revealing the second shot

Wipe and Cut Detection

• Ngo et al., "Detection of gradual transitions through temporal slice analysis"
• Take the pixels from the middle column of each frame, place them sideways, and stack them over time

[Figure: column C of a frame (axes X, Y) at time t1 becomes one line of the spatio-temporal image (axes Y, t)]
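The slice construction described on this slide can be sketched in a few lines of NumPy. This is only an illustration: the function name `middle_column_slice` and the synthetic clip are hypothetical, and the array layout (time, height, width) is an assumption.

```python
import numpy as np

def middle_column_slice(frames: np.ndarray) -> np.ndarray:
    """Stack the middle column of every frame into one spatio-temporal image.

    frames: array of shape (T, H, W) or (T, H, W, C)  (assumed layout).
    Returns shape (T, H) or (T, H, C): row t is the middle column of
    frame t laid on its side.
    """
    frames = np.asarray(frames)
    mid = frames.shape[2] // 2      # index of the middle column
    return frames[:, :, mid]        # one "sideways" column per time step

# Hypothetical clip: 5 greyscale frames of 4x6 pixels, frame t filled with value t.
clip = np.stack([np.full((4, 6), t, dtype=np.uint8) for t in range(5)])
st_image = middle_column_slice(clip)
print(st_image.shape)
```

A cut or wipe then shows up as a line in `st_image`, which is what the line-detection step on the next slide looks for.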

Wipe and Cut Detection

• Detect lines in the resultant spatio-temporal image

Wipe and Cut Detection

• Instead of using pixels, convert to 2D chromaticity coordinates:

$$r = \frac{R}{R + G + B}, \qquad g = \frac{G}{R + G + B}$$

• Recall chromaticity from our class notes
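As a minimal sketch of this conversion, assuming RGB values stored in the last array axis: the function name `rgb_to_chromaticity` is hypothetical, and mapping black pixels to (0, 0) is an assumption made here to avoid a division by zero.

```python
import numpy as np

def rgb_to_chromaticity(rgb) -> np.ndarray:
    """Return (r, g) = (R, G) / (R + G + B) for every pixel."""
    rgb = np.asarray(rgb, dtype=np.float64)
    total = rgb.sum(axis=-1, keepdims=True)
    # Assumption: black pixels (R + G + B = 0) are mapped to (0, 0).
    safe_total = np.where(total == 0, 1.0, total)
    return rgb[..., :2] / safe_total

bright = rgb_to_chromaticity([[100, 50, 50]])
dim = rgb_to_chromaticity([[50, 25, 25]])     # same surface at half the intensity
print(bright, dim)
```

Both pixels map to the same (r, g) pair because the intensity scale factor cancels in the ratio.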

Wipe and Cut Detection

• 2D chromaticity conversion effectively eliminates the shadows
• Form a 2D chromaticity histogram for each column
  • Using the DC component of the frames
• Using Histogram Intersection, compare each column's histogram to that of the same column in the previous frame to detect differences
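One possible way to form the per-column 2D chromaticity histograms is sketched below. The function name `column_histograms` is hypothetical, and the default of 16 bins per axis is an assumption (it matches the 16x16 example used on the derivation slides). The input is assumed to be an (r, g) image with values in [0, 1]; extracting the DC components is not shown.

```python
import numpy as np

def column_histograms(rg: np.ndarray, bins: int = 16) -> np.ndarray:
    """Build one 2D chromaticity histogram per image column.

    rg: (H, W, 2) array of chromaticities in [0, 1].
    Returns integer counts of shape (W, bins, bins).
    """
    h, w, _ = rg.shape
    # Bin index of every pixel along each chromaticity axis (1.0 goes in the last bin).
    idx = np.minimum((rg * bins).astype(int), bins - 1)
    hists = np.zeros((w, bins, bins), dtype=np.int64)
    cols = np.broadcast_to(np.arange(w), (h, w))      # column index of every pixel
    np.add.at(hists, (cols, idx[..., 0], idx[..., 1]), 1)
    return hists

# Hypothetical 3x2 image: left column is (0, 0), right column is (0.5, 0.25).
rg = np.zeros((3, 2, 2))
rg[:, 1] = [0.5, 0.25]
hists = column_histograms(rg, bins=4)
print(hists[1, 2, 1])   # count of right-column pixels in bin (2, 1)
```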

Wipe and Cut Detection

• From frame to frame, the histogram intersection value for a column stays nearly the same
• When the wipe boundary hits that column, the histogram intersection value is near zero

[Figure: spatio-temporal image of histogram intersection values (horizontal axis X, vertical axis t; black = 0). Each element in a row represents the histogram intersection value for one column of the image. A wipe and a cut are labelled.]
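A toy version of this spatio-temporal image can be built as follows. This is a sketch under simplifying assumptions: each column histogram is flattened to a vector, the two shots occupy disjoint bins, and the wipe boundary moves one column per frame. The function name `intersection_image` and the synthetic data are hypothetical.

```python
import numpy as np

def intersection_image(hists) -> np.ndarray:
    """Normalized histogram intersection of each column with the same column
    in the previous frame.

    hists: (T, W, B) array of per-frame, per-column histograms (bins flattened).
    Returns a (T - 1, W) image; values near 0 mark a change in that column.
    """
    hists = np.asarray(hists, dtype=np.float64)
    prev, curr = hists[:-1], hists[1:]
    overlap = np.minimum(prev, curr).sum(axis=-1)
    size = curr.sum(axis=-1)
    return overlap / np.where(size == 0, 1.0, size)

# Synthetic left-to-right wipe: 4 frames, 3 columns, 2 bins, 4 pixels per column.
T, W, pixels = 4, 3, 4
hists = np.zeros((T, W, 2))
for t in range(T):
    for c in range(W):
        # Columns already passed by the boundary show shot B (bin 1), the rest shot A (bin 0).
        hists[t, c] = [0, pixels] if c < t else [pixels, 0]

img = intersection_image(hists)
print(img)
```

In this toy case the zeros fall on a diagonal, one column per frame, which is the kind of line the wipe leaves in the image; a cut would zero out a whole row at once.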

Wipe and Cut Detection

Conclusions
• Note that this technique takes into account the entire image frame
  • Not just a slice
• The image on the previous slide has no edge enhancement performed on it
  • Raw data

Dissolve Detection

• A dissolve replaces every pixel with a mixture of the two shots over time, gradually replacing the first by the second
• Each pixel is affected gradually

Dissolve Detection

[Figure: frame-by-frame diagram of a cross dissolve]

Dissolve Detection

• For dissolve detection, 2D Cb-Cr chromaticity space is adopted
  • Recall from our class notes that the YCbCr colour model is used in JPEG image compression and MPEG video compression
• YCbCr is closely related to YUV
  • Y' is the luma (for gamma-corrected signals)
  • U and V are the chrominance
    • U = B' - Y'
    • V = R' - Y'
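The colour-difference definitions above can be sketched as follows. The slide gives only U = B' - Y' and V = R' - Y'; the luma weights used here (0.299, 0.587, 0.114, the ITU-R BT.601 values) are an assumption, and the function name `rgb_to_uv` is hypothetical. Cb and Cr are scaled and offset versions of these differences, which this sketch does not apply.

```python
import numpy as np

# Assumption: BT.601 luma weights; the slides do not state which weights are used.
LUMA_WEIGHTS = np.array([0.299, 0.587, 0.114])

def rgb_to_uv(rgb_prime) -> np.ndarray:
    """Return unscaled (U, V) = (B' - Y', R' - Y') for gamma-corrected R'G'B'."""
    rgb_prime = np.asarray(rgb_prime, dtype=np.float64)
    y = rgb_prime @ LUMA_WEIGHTS          # luma Y'
    u = rgb_prime[..., 2] - y             # U = B' - Y'
    v = rgb_prime[..., 0] - y             # V = R' - Y'
    return np.stack([u, v], axis=-1)

grey = rgb_to_uv([0.5, 0.5, 0.5])   # no colour: both differences vanish
blue = rgb_to_uv([0.0, 0.0, 1.0])
print(grey, blue)
```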

Dissolve Detection

• Define the transition as:

$$R = A + \alpha(t)(B - A) \qquad (1)$$

• $A$ and $B$ are 2-vectors for video A and video B, in Cb-Cr space
• $\alpha(t)$ is a transition function:

$$\alpha(t) = Kt, \quad \text{with } K t_{max} \equiv 1 \qquad (2)$$
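Equations (1) and (2) can be exercised directly. In this sketch the function name `dissolve` and the sample Cb-Cr values are hypothetical; the only modelling choice is K = 1 / t_max, which follows from equation (2).

```python
import numpy as np

def dissolve(a, b, t, t_max):
    """Linear dissolve R = A + alpha(t) (B - A), with alpha(t) = K t and K t_max = 1."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    k = 1.0 / t_max                 # equation (2)
    return a + k * t * (b - a)      # equation (1)

A = np.array([0.2, 0.6])   # hypothetical Cb-Cr value in video A
B = np.array([0.8, 0.4])   # hypothetical Cb-Cr value in video B
start, middle, end = (dissolve(A, B, t, t_max=10) for t in (0, 5, 10))
print(start, middle, end)
```

At t = 0 the result is A, at t = t_max it is B, and halfway through it is their average.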

Dissolve Detection

• Histogram Intersection fails on simple cases for dissolve detection
  • For example, uniformly-coloured still images
  • $H(K, M)$ never really drops to zero
• To counter this problem, use a histogram-difference metric

Dissolve Detection

• Histogram-difference metric:
  • Hafner et al.'s metric is a weighted distance between colour distributions of two images, generating a histogram distance measure
• Histogram difference $D^2$:

$$D^2 = z^T A z$$

Dissolve Detection

Summary of modifications for the histogram difference $D^2$:
• Use 2D CbCr chromaticity space (vs 3D colour)
• Use only DC components of video
• Analyze actual pixel values (vs histogram)
• Use a Euclidean distance metric for the difference

(Derivation: see the Histogram Difference Derivation slides.)

Dissolve Detection

Results of modifications, for times $t_1$ and $t_2$ (beginning and ending of dissolve transition):
• The temporal difference for each column is $k(t_1 - t_2)^2$, where $k$ is a constant
• If $t_1 - t_2$ is constant, $D^2$ is constant
• Normalized $D^2$ is approximately
  • 0 outside a dissolve
  • 1 during a dissolve

Dissolve Detection

Results of modifications cont'd, for times $t_1$ and $t_2$ (beginning and ending of dissolve transition):
• If $t_1$ is fixed and $t_2$ varies, $\sqrt{D^2}$ is linear
• $D^2$ has 3 components, each quadratic in time, thus having a linear derivative

Dissolve Detection

Fully derived expression for a linear transition:

$$D^2 = \frac{2}{d_{max}^2} K^2 (t_1 - t_2)^2 \sum_i \sum_j (B_i - A_i)^T (B_j - A_j)$$
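The closed form can be checked numerically against a direct evaluation of $z^T A z$. The sketch below assumes the setup described on the derivation slides: entries of +1 in $z$ for the column in one frame and -1 for the same column in the other frame, and $a_{ij} = 1 - d_{ij}^2 / d_{max}^2$ with a squared Euclidean distance in Cb-Cr. The function names, the random column data, and the value of $d_{max}^2$ are hypothetical.

```python
import numpy as np

def d2_direct(prev_col, curr_col, d2_max):
    """Evaluate z^T A z with +1 entries for prev_col pixels and -1 entries
    for curr_col pixels, using a_ij = 1 - d_ij^2 / d_max^2."""
    pts = np.vstack([prev_col, curr_col])                    # (2N, 2) Cb-Cr points
    z = np.concatenate([np.ones(len(prev_col)), -np.ones(len(curr_col))])
    d2 = ((pts[:, None, :] - pts[None, :, :]) ** 2).sum(axis=-1)
    a = 1.0 - d2 / d2_max
    return float(z @ a @ z)

def d2_closed_form(a_col, b_col, k, t1, t2, d2_max):
    """Closed form; the double sum over i, j equals |sum_i (B_i - A_i)|^2."""
    total = (b_col - a_col).sum(axis=0)
    return float(2.0 / d2_max * k ** 2 * (t1 - t2) ** 2 * (total @ total))

def column_at(a_col, b_col, k, t):
    """Linear dissolve with alpha(t) = K t, applied to every row of the column."""
    return a_col + k * t * (b_col - a_col)

rng = np.random.default_rng(0)
a_col = rng.random((8, 2))     # hypothetical Cb-Cr values of one column, shot A
b_col = rng.random((8, 2))     # hypothetical Cb-Cr values of the same column, shot B
k, d2_max = 1.0 / 20, 2.0      # 20-frame dissolve; d_max^2 of the unit square (assumed)

direct = d2_direct(column_at(a_col, b_col, k, 3),
                   column_at(a_col, b_col, k, 7), d2_max)
closed = d2_closed_form(a_col, b_col, k, 3, 7, d2_max)
print(np.isclose(direct, closed))
```

Shifting both times by the same amount (for example 2 and 6 instead of 3 and 7) leaves the direct value unchanged, which is the "constant for constant $t_1 - t_2$" property stated on the slides.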

Dissolve Detection

• Process: 2 frames as part of a dissolve in Cb-Cr space

[Diagram: the two frames (t1: initial frame, t2: time-varying frame) → $D^2$ → $\sqrt{D^2}$ → result. Here $(t_1 - t_2) = \Delta t$ is NOT constant.]

Dissolve Detection

Results:
• $\sqrt{D^2}$, differencing each frame to the initial frame at a 1 frame interval

[Figure: plot of $\sqrt{D^2}$; labelled levels: approx. 1 and approx. 0]

Dissolve Detection

• Process: 2 frames as part of a dissolve in Cb-Cr space

[Diagram: the two frames (t1: frame A, t2: frame B) → $D^2$ → derivative of $D^2$ → result. Here $(t_1 - t_2) = \Delta t$ is constant, and the derivative of $D^2$ is linear.]

Dissolve Detection

Results:
• $D^2$, differencing for a constant $t_1 - t_2$
• Boundaries of the transition are evident
• Values in transition periods are relatively constant

[Figure: plot against time; axis labels: Time t, Derivation of $D^2$]

Dissolve Detection

Conclusions
• Use of multiple columns and rows provides a large number of descriptors for gradual transitions
• For constant $t_1 - t_2$
  • $D^2$ is constant during transitions, 0 otherwise
• For fixed $t_1$
  • $D^2$ is 1 during transition, 0 otherwise
• In testing, this measure performs best when each video in the dissolve does not change much during the transition

Conclusions

• 2 new measures are presented for detecting cuts, wipes and dissolves
• Both use multiple columns (or rows or diagonals) to generate descriptors
• Histogram intersection is fast and effective
• In dissolve testing, the measures perform best when each video in the dissolve does not change much during the transition

Video Dissolve and Wipe Detection

The End

Histogram Difference Derivation

Histogram difference $D^2$:

$$D^2 = z^T A z \qquad (3)$$

• $A = [a_{ij}]$ is a symmetric matrix where $a_{ij}$ denotes similarity between bins $i$ and $j$
• Example for bins red (R), orange (O) and blue (B), where red and orange are considered highly similar:

$$A_{red,orange,blue} = \begin{pmatrix} 1 & 0.9 & 0 \\ 0.9 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \quad \text{(rows and columns ordered R, O, B)}$$
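A small worked example with this 3-bin matrix shows the effect of the similarity weights. The three single-colour histograms and the function name `histogram_difference` are hypothetical illustrations.

```python
import numpy as np

# Similarity matrix from the slide; rows and columns ordered red, orange, blue.
A = np.array([[1.0, 0.9, 0.0],
              [0.9, 1.0, 0.0],
              [0.0, 0.0, 1.0]])

def histogram_difference(x, y, a):
    """D^2 = z^T A z with z = x - y."""
    z = np.asarray(x, dtype=np.float64) - np.asarray(y, dtype=np.float64)
    return float(z @ a @ z)

# Hypothetical normalized histograms of three single-colour images.
red, orange, blue = [1, 0, 0], [0, 1, 0], [0, 0, 1]

d2_red_orange = histogram_difference(red, orange, A)   # 1 + 1 - 2 * 0.9, about 0.2
d2_red_blue = histogram_difference(red, blue, A)       # 1 + 1 - 2 * 0, i.e. 2
print(d2_red_orange, d2_red_blue)
```

Red versus orange gives a much smaller distance than red versus blue, even though in both cases the two histograms share no bins.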

Histogram Difference Derivation

$$D^2 = z^T A z \qquad (3)$$

• For $a_{ij}$:

$$a_{ij} = 1 - \frac{d_{ij}}{d_{max}} \qquad (4)$$

  • $d_{ij}$ is defined as a three-dimensional colour difference
• Vector $z$ is a histogram-difference vector (for vectorized histograms)
  • For example, $z$ would be of length 256 if our chromaticity histograms were 16x16

Histogram Difference Derivation

• Instead of 3D colour space, we use 2D CbCr chrominance space
• Also, use a Euclidean distance metric
  • $a_{ij}$ will no longer be linear under a temporal transition with linear $\alpha(t)$
  • This modification maintains the linearity:

$$a_{ij} = 1 - \frac{d_{ij}^2}{d_{max}^2} \qquad (5)$$
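A similarity matrix of the form in equation (5) could be built as follows for a 16x16 CbCr histogram. This is a sketch: the function name `similarity_matrix`, the use of bin centres on a [0, 1] axis, and taking $d_{max}^2$ as the largest squared distance between any two bin centres are all assumptions.

```python
import numpy as np

def similarity_matrix(bins: int = 16) -> np.ndarray:
    """a_ij = 1 - d_ij^2 / d_max^2 between the bins of a bins x bins histogram."""
    centres = (np.arange(bins) + 0.5) / bins               # assumed [0, 1] axis
    cb, cr = np.meshgrid(centres, centres, indexing="ij")
    pts = np.stack([cb.ravel(), cr.ravel()], axis=1)        # (bins * bins, 2)
    # Squared Euclidean distance between every pair of bin centres.
    d2 = ((pts[:, None, :] - pts[None, :, :]) ** 2).sum(axis=-1)
    return 1.0 - d2 / d2.max()

A = similarity_matrix(16)
print(A.shape)   # matches a histogram-difference vector z of length 256
```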

Histogram Difference Derivation

• Suppose we use only DC components
  • Each frame will consist of only 1/8th of the number of rows in an image
• Recall equation (3):

$$D^2 = z^T A z \qquad (3)$$

• $z$ is the difference of 2 histograms, $x$ and $y$: $z = x - y$
• $x$ and $y$ are normalized so that $0 \le x_i, y_i \le 1$ and $\sum_i x_i = \sum_i y_i = 1$
• Then $-1 \le z_i \le 1$

Histogram Difference Derivation

• To generate an analytic expression, assume $x$ and $y$ are infinitely precise:

$$z = (1, 1, 1, \ldots, -1, -1, -1)$$

• In our video transition context, this means entries of 1 for the current column in the previous frame, and entries of -1 for the current column in the current frame

Histogram Difference Derivation

• Expanding $D^2 = z^T A z$, given the assumptions:
  • $R$ is the CbCr 2-vector at time $t_1$ for the $i$-th row in the current column
  • Differencing is between time $t_1$ and time $t_2$

Histogram Difference Derivation

• With a Euclidean distance metric, substitute equation (1): $R = A + \alpha(t)(B - A)$

Histogram Difference Derivation

• For a linear transition (as is usually the case for dissolves), the previous equation can be simplified
• Since the sum above is simply a constant, for constant $(t_1 - t_2)$ the difference $D^2$ is constant over the transition!
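One way to carry out the expansion under the stated assumptions is sketched below. It is a reconstruction, not the slides' own algebra: it assumes unit entries in $z$ ($+1$ for the $N$ rows of the column at $t_1$, $-1$ for the same rows at $t_2$), $a_{ij}$ from equation (5) with a squared Euclidean distance, and $R_i(t) = A_i + Kt(B_i - A_i)$ from equations (1) and (2). The symbols $c_p$ (the CbCr point of entry $p$) and $N$ are introduced here for illustration only.

```latex
\begin{aligned}
D^2 = z^T A z
  &= \sum_{p,q} z_p z_q \left( 1 - \frac{\lVert c_p - c_q \rVert^2}{d_{max}^2} \right)
   = -\frac{1}{d_{max}^2} \sum_{p,q} z_p z_q \lVert c_p - c_q \rVert^2
   && \text{since } \textstyle\sum_p z_p = 0 \\
  &= \frac{2}{d_{max}^2} \Big\lVert \sum_p z_p c_p \Big\rVert^2
   = \frac{2}{d_{max}^2} \Big\lVert \sum_{i=1}^{N} \big( R_i(t_1) - R_i(t_2) \big) \Big\rVert^2 \\
  &= \frac{2}{d_{max}^2} K^2 (t_1 - t_2)^2 \sum_i \sum_j (B_i - A_i)^T (B_j - A_j)
   && \text{since } R_i(t_1) - R_i(t_2) = K (t_1 - t_2)(B_i - A_i)
\end{aligned}
```

The second line uses $\lVert c_p - c_q \rVert^2 = \lVert c_p \rVert^2 + \lVert c_q \rVert^2 - 2\, c_p^T c_q$; the first two terms vanish after summation because $\sum_p z_p = 0$. The result agrees with the fully derived expression for a linear transition, and the double sum does not depend on time, which is why $D^2$ is constant for constant $t_1 - t_2$.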

Histogram Intersection

• Given a pair of histograms, $K$ and $M$, each containing $n$ buckets, the intersection is:

$$H = \sum_{j=1}^{n} \min(K_j, M_j)$$

• The result of the intersection is the number of pixels in $M$ that have corresponding pixels of the same colour in $K$

Histogram Intersection

Example 1: two sample histograms, K1 and M, for 4-bit greyscale 4x4 images

[Figure: bar charts of histograms K1 and M; bins 0 to 3 on the horizontal axis, counts 0 to 8 on the vertical axis]

$H(K, M) = 6 + 0 + 8 + 2 = 16$

Histogram Intersection

Example 2: two sample histograms, K2 and M, for 4-bit greyscale 4x4 images

[Figure: bar charts of histograms K2 and M; bins 0 to 3 on the horizontal axis, counts 0 to 8 on the vertical axis]

$H(K, M) = 1 + 0 + 1 + 2 = 4$

Histogram Intersection

• Normalized between 0 and 1:

$$H(K, M) = \frac{\sum_j \min(K_j, M_j)}{\sum_j M_j}$$

• The closer to zero, the poorer the histogram match
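Both the raw and the normalized intersection fit in one short function. The histograms below are hypothetical 4-bin examples for 16-pixel images, chosen so that the raw sums come out as 16 and 4; they are not read off the example figures. The function name `histogram_intersection` is likewise an illustration.

```python
import numpy as np

def histogram_intersection(k, m, normalize=False):
    """Sum of bin-wise minima of K and M, optionally divided by the size of M."""
    k = np.asarray(k, dtype=np.float64)
    m = np.asarray(m, dtype=np.float64)
    overlap = np.minimum(k, m).sum()
    return overlap / m.sum() if normalize else overlap

m = [6, 0, 8, 2]          # hypothetical histogram M (16 pixels)
k_same = [6, 0, 8, 2]     # identical to M
k_diff = [1, 12, 1, 2]    # hypothetical histogram with little overlap

print(histogram_intersection(k_same, m))                  # 6 + 0 + 8 + 2
print(histogram_intersection(k_diff, m))                  # 1 + 0 + 1 + 2
print(histogram_intersection(k_diff, m, normalize=True))  # 4 / 16
```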

Histogram Intersection

High-level representation

[Figure: histograms K and M shown for a poor match and for a close match]

Euclidean Distance

• If $u = (x_1, y_1)$ and $v = (x_2, y_2)$ are two points on the plane, their Euclidean distance is given by:

$$\sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}$$

• Geometrically, it's the length of the segment joining $u$ and $v$
• This is the distance used for $d_{ij}$, which was previously defined as our 3D colour difference