
466 IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, VOL. 3, NO. 3, SEPTEMBER 2008

A Video Watermarking Scheme Based on the Dual-Tree Complex Wavelet Transform

Lino E. Coria, Member, IEEE, Mark R. Pickering, Member, IEEE, Panos Nasiopoulos, Member, IEEE, and Rabab Kreidieh Ward, Fellow, IEEE

Abstract—A watermarking scheme that discourages theater camcorder piracy through the enforcement of playback control is presented. In this method, the video is watermarked so that its display is not permitted if a compliant video player detects the watermark. A watermark that is robust to geometric distortions (rotation, scaling, cropping) and lossy compression is required in order to block access to media content that has been re-recorded with a camera inside a movie theater. We introduce a new video watermarking algorithm for playback control that takes advantage of the properties of the dual-tree complex wavelet transform. This transform offers the advantages of the regular and the complex wavelets (perfect reconstruction, shift invariance, and good directional selectivity). Our method relies on these characteristics to create a watermark that is robust to geometric distortions and lossy compression. The proposed scheme is simple to implement and outperforms comparable methods when tested against geometric distortions.

Index Terms—Access control, complex wavelet transform (CWT), content dependent, geometric attacks, watermarking.

I. INTRODUCTION

PIRACY, the practice of selling, acquiring, copying, or distributing copyrighted material without permission, is a great concern to Hollywood studios and independent filmmakers. Although digital technology has brought many benefits to the content creators and the public, it has also increased the ease with which movies can be pirated. This paper addresses theatrical camcorder piracy, which is one of the most common ways of illegally copying a movie [1]. This method consists of someone entering a poorly supervised theater with a camcorder and creating a copy of the movie that is being shown. These recordings are illegally duplicated, packaged, and distributed all over the world. Consequently, the film appears in street markets just days after the theatrical release. This translates into a significant loss of revenue for film producers.

Manuscript received October 5, 2007; revised May 15, 2008. Published August 13, 2008 (projected). This work was supported in part by the Mexican Council for Science and Technology (CONACYT) and in part by ITESO University, Mexico. The associate editor coordinating the review of this manuscript and approving it for publication was Dr. Ingemar Cox.

L. E. Coria, P. Nasiopoulos, and R. K. Ward are with the University of British Columbia, Vancouver, BC V6T 1Z4 Canada (e-mail: [email protected]; [email protected]; [email protected]).

M. R. Pickering is with the School of Information Technology and Electrical Engineering, University College, The University of New South Wales, Australian Defense Force Academy, Canberra, ACT, 2600, Australia (e-mail: [email protected]).

Digital Object Identifier 10.1109/TIFS.2008.927421

Currently, some technical measures that prevent this practice are being developed. Camcorder jamming technologies, for instance, have been proposed to disable camcorders inside movie theaters [1]. For a little more than a decade, watermarking techniques have been designed to, among other purposes, control access to digital content [2]. In the case of a playback control application, the watermark embedded in the video sequence is designed to provide information on whether video players are authorized to display the content or not [3]. Compliant devices detect the watermark and obey the encoded usage restrictions.

Controlling access to media content that was re-recorded with a camera inside a movie theater is a challenging problem. To begin with, the recorded video might be a slightly resized, rotated, and cropped version of the original content. Furthermore, these copies are also subjected to video compression. Since the original content is not available during the decoding process (i.e., it is a blind procedure), extracting the watermark is not a straightforward task. The decoding process must, to a certain extent, be robust to some geometric distortions (rotation and scaling), as well as cropping and lossy compression.

Several watermarking methods that are robust to common geometric distortions have been presented. For example, in [4], an image watermarking method based on the Fourier–Mellin transform is proposed. The scheme is robust to rotation and scaling but weak to distortions caused by lossy compression.

Another algorithm is presented in [5]. The watermark is embedded into a 1-D signal, which is obtained by taking the Fourier transform of the image, resampling it into log-polar coordinates, and integrating along the radial dimension. The method is robust to rotation, scaling, and translation. However, the scheme cannot withstand cropping.

In [6], two watermarks are employed. The first one is used to embed the message while the second one, a 0-b watermark, is employed as a geometric reference. This reference watermark is embedded in the spatial domain, which results in low robustness. Information hidden in the space domain can be easily lost to quantization, which makes the watermarking scheme vulnerable to lossy compression and other attacks. Once the reference watermark has been changed, the decoder assumes that there is no watermark embedded in the content and, therefore, does not search for the hidden message.

A content-based image watermarking method is offered in [7], where robustness to geometric attacks is achieved using feature points from the image. This scheme is shown to be successful against certain attacks, but the watermark detection process is computationally intensive and, therefore, may not be practical for real-time video applications.



Multiresolution analysis has been considered to be an important tool for designing watermarks that can withstand geometric distortions. A method for image watermarking in the wavelet domain is presented in [8]. The watermark is applied to the discrete wavelet transform (DWT) coefficients of a subimage. This subimage is constructed from the original content using small blocks that are chosen via a chaotic map. Although the scheme is extremely robust to cropping, it does not provide an adequate solution for a rotation attack. A video watermarking method that also relies on wavelets is presented in [9]. In this case, the watermark is embedded in every video frame by applying the DWT to the frames and replacing certain coefficients with the maximum or minimum value of their neighboring coefficients. This scheme was proven to be robust to mild geometric attacks and high compression. However, since the embedding process is blind and a perceptual model cannot be incorporated, it is impossible to control the amount of distortion introduced in the frames.

Complex wavelets have also been employed to create watermarks that are robust to geometric distortions. The complex wavelet transform is an overcomplete transform and, therefore, creates redundant coefficients, but it also offers some advantages over the regular wavelet transform. Two of the main features of complex wavelets are shift invariance and directional selectivity [10]. These properties can be employed to produce a watermark that can be decoded even after the original content has undergone extensive geometric distortions. When dealing with signals that have more than one dimension, the dual-tree complex wavelet transform (DT CWT) [10] is a particularly valuable solution since it adds perfect reconstruction to the list of desirable properties that regular complex wavelets have.

Most watermarking methods rely on embedding a pseudorandom pattern in the transform coefficients of the host image or frame. The same, however, cannot be achieved with the DT CWT. The DT CWT is a redundant transformation and, therefore, some components of the watermark might be lost during the inverse transform process [11]. In order to reduce this problem, a watermark that consists of a pseudorandom sequence constructed with valid CWT transform coefficients is proposed in [11] and [12]. A four-level DT CWT is applied to the original content and the watermark is added to the coefficients from levels 2 and 3. Although the ideas portrayed in these efforts show some potential, the robustness of the schemes is never tested. Another watermarking method that uses the DT CWT is presented in [13]. In this method, the content is also subjected to a four-level DT CWT decomposition and the watermark is added to the two highest levels using the spread-spectrum technique. However, the decoding process is not blind and, therefore, the scheme is not useful for playback control or any other applications where the original content is not available at the decoder end. More recently, a blind decoding watermarking scheme that uses the DT CWT to overcome geometric distortions was proposed by the authors in [14]. This method, designed specifically for playback control of digital content, is robust to rotation, cropping, and H.264 compression. However, since the redundant nature of the transform is not taken into account, the watermark needs to be extracted from a considerable number of frames in order to reach the correct decision (play or do not play the content).

Fig. 1. Basic configuration of the dual-tree filtering approach used to obtain the DT CWT coefficients (for a real 1-D signal x(n)).

This paper introduces a new watermarking method that is robust to lossy compression and some common geometric attacks, such as rotation, scaling, and cropping. In our method, the watermark is a random set of +1's and −1's. A one-level DT CWT is applied to this watermark and the coefficients of this transformation become the data that are embedded into the video sequence. Every frame of the original video sequence is transformed with a four-level DT CWT. The content is examined to determine how strong the watermark embedding should be. Thus, the watermark coefficients are properly weighted and added to the coefficients of levels 3 and 4.

The remainder of this paper is structured as follows. Section II includes a brief description of the DT CWT and describes our method. Performance evaluations are presented in Section III. Section IV offers the conclusions.

II. PROPOSED METHOD

A. Brief Introduction to the DT CWT

The DT CWT was introduced in [15]. This transform has the desirable properties of the DWT and the CWT: perfect reconstruction, approximate shift invariance, good directional selectivity, limited redundancy, and efficient order-N computation [10]. This transform is a variation of the original DWT, with the main difference being that it uses two filter trees instead of one, as shown in Fig. 1. For a 1-D signal, the use of the two filter trees results in twice the number of wavelet coefficients as the original DWT. The coefficients produced by these two trees, c_A and c_B, form two sets that can be combined into one set of complex coefficients of the form

c = c_A + j c_B

or in polar form as

c = r e^{jθ}

where

r = √(c_A² + c_B²)

and

θ = arctan(c_B / c_A).

The dual-tree approach provides wavelet coefficients that are approximately shift invariant (i.e., small shifts in the input signal will not cause major variations in the distribution of energy of DT CWT coefficients at different scales).


Fig. 2. Typical impulse responses of the high-pass decimation filters for each filter tree. In this case, quarter sample shift orthogonal (Q-Shift) 18-tap filters.

Fig. 3. Two-dimensional impulse responses of the reconstruction filters in a DT CWT.

An insight into the shift-invariant nature of the DT CWT can be gained by observing the typical impulse responses of the high-pass decimation filters for each tree. Fig. 2 shows these two impulse responses. In Fig. 1, the filters used in tree B are designed to produce outputs at sample locations that are discarded in tree A.

Approximate shift invariance is a particularly useful property of the DT CWT that can be exploited when designing a video watermark that is robust to geometric distortions. If a frame is resampled after scaling or rotation, the DT CWT should produce approximately the same set of coefficients as for the original frame. This property does not hold for other transforms, such as the DCT, DFT, or DWT.
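This behavior is easy to check numerically. The following sketch (not part of the original paper) uses the open-source dtcwt Python package, which implements Kingsbury's transform, to compare the subband magnitudes of a test frame with those of a shifted copy; the package, the random test signal, and the error measure are assumptions of this illustration.

```python
# Illustrative check of approximate shift invariance, assuming the third-party
# "dtcwt" package is installed.
import numpy as np
import dtcwt

rng = np.random.default_rng(0)
frame = rng.random((144, 176))            # stand-in for a QCIF luminance frame
shifted = np.roll(frame, 1, axis=1)       # one-pixel horizontal shift

t = dtcwt.Transform2d()
p1 = t.forward(frame, nlevels=4)
p2 = t.forward(shifted, nlevels=4)

for s in range(4):
    a = np.abs(p1.highpasses[s])          # magnitudes of the six subbands at level s+1
    b = np.abs(p2.highpasses[s])
    rel = np.linalg.norm(a - b) / np.linalg.norm(a)
    print(f"level {s + 1}: relative change in subband magnitudes = {rel:.3f}")
```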

For 2-D signals, the DT CWT requires a 4:1 increase in the number of coefficients and provides approximate shift invariance in the horizontal and vertical directions. While the 2-D DWT produces three subbands at each level, corresponding to LH, HH, and HL filtering (0°, 45°, and 90°, respectively), the 2-D DT CWT produces six subbands that correspond to the outputs of six directional filters oriented at angles of ±15°, ±45°, and ±75°. Fig. 3 shows the 2-D impulse responses of the reconstruction filters in the 2-D DT CWT. If the level (or scale) of decomposition is denoted by s and the direction of the filter is denoted by d, then the set of bandpass complex wavelet coefficients at level s can be written as

C_{s,d}[m,n],  d = 1, …, 6    (1)

for m = 1, …, M/2^s and n = 1, …, N/2^s,

Fig. 4. Structure of the DT CWT coefficients for a four-level decomposition. (a) For each level, there are six subbands that correspond to the output of six directional filters oriented at angles of ±15°, ±45°, and ±75°. (b) The notation employed for the proposed method.

where M and N are the dimensions of the video frame in pixels. The variables m and n specify the location of the complex coefficients in each subband. Fig. 4(a) shows a wavelet-type output structure for the six directional subbands at each level of the DT CWT and Fig. 4(b) describes the notation that will be employed throughout this paper.
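In terms of the dtcwt package used in the sketches of this section, the coefficients C_{s,d}[m,n] correspond to the entries of the pyramid's highpass arrays; the index convention below is mine, not the paper's.

```python
# Mapping of the notation in (1) onto the dtcwt pyramid (an assumption of these
# sketches): C_{s,d}[m,n] <-> pyr.highpasses[s-1][m, n, d-1].
import numpy as np
import dtcwt

M, N = 144, 176
pyr = dtcwt.Transform2d().forward(np.zeros((M, N)), nlevels=4)
for s in range(1, 5):
    print(f"level {s}: six subbands of shape {pyr.highpasses[s - 1].shape[:2]}")
# prints (72, 88), (36, 44), (18, 22), (9, 11): i.e., M/2^s by N/2^s per subband
```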

Since the watermarking application considered is playback control, the decoder must decide whether the watermark is present. This means that a watermark does not need to carry any other information: its presence or absence is all that is required. For example, if a watermark is detected by a compliant DVD player, the copy will be considered to be of illegal origin and the DVD player will not play the movie.

B. Creating the Watermark

In our method, the watermark is inserted in every frame of the video sequence. To provide some robustness to lossy compression, the watermark will be embedded in the coefficients of the higher decomposition levels. In our implementation, the watermark is embedded in levels 3 and 4 of a four-level DT CWT decomposition. The watermark W is a 2-D array that is 64 times smaller than the video frame where it will be embedded (i.e., its height and width are one-eighth of the frame's height and width, respectively). The watermark is a pseudorandom sequence of +1's and −1's. It is created using a key that combines a constant positive integer, provided by the user, with a positive integer that changes every f frames


Fig. 5. A level-1 DT CWT is applied to the watermark W. The coefficients of the six detail subbands W_{1,d}, d = 1, …, 6, are the data to be embedded in the video frames.

according to some formula. The use of the same key for f consecutive frames offers some robustness to temporal synchronization attacks, as long as f is small enough (so that an attacker cannot detect and remove the watermark by frame averaging) but large enough (so that, if some frames are dropped, the watermark can still be detected). Although this process provides limited robustness to temporal synchronization, more elaborate temporal synchronization strategies, such as the ones presented in [16], could later be incorporated into our method.

Usually, watermarking algorithms rely on the addition of a pseudorandom sequence (such as W) to the host content coefficients in some frequency domain. This approach, however, cannot be used in this exact fashion when working in the DT CWT domain. The reason is that the DT CWT is a redundant transformation. Thus, some components of an arbitrary pseudorandom sequence in the DT CWT domain may be lost during the DT CWT inverse transformation process. The lost information corresponds to the part of the pseudorandom sequence that lies in the null space of the inverse DT CWT [12]. To reduce this information loss, we therefore embed the DT CWT coefficients of the watermark in the host content instead of embedding the actual watermark. The one-level DT CWT is thus applied to the watermark W (as in Fig. 5). This results in a low-pass component W_0 and six subbands W_{1,d}, d = 1, …, 6, that contain the details. The coefficients of the six subbands form the data to be embedded in the host video frame.
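A minimal sketch of this construction is given below; the seeding of the pseudorandom generator from the key and the helper name are assumptions, and the dtcwt package again stands in for the transform.

```python
# Sketch of Section II-B: a +/-1 watermark one-eighth of the frame size in each
# dimension, generated from a key, followed by a one-level DT CWT whose six
# detail subbands are the data to be embedded. Hypothetical helper, not the
# authors' code.
import numpy as np
import dtcwt

def make_watermark(key, frame_shape):
    """Return the +/-1 watermark W and its level-1 DT CWT detail subbands."""
    M, N = frame_shape
    rng = np.random.default_rng(key)                  # the key seeds the PRNG
    w = rng.choice([-1.0, 1.0], size=(M // 8, N // 8))
    pyr = dtcwt.Transform2d().forward(w, nlevels=1)
    return w, pyr.highpasses[0]                       # complex, shape (M/16, N/16, 6)

w, w_coeffs = make_watermark(key=1234, frame_shape=(144, 176))
print(w.shape, w_coeffs.shape)                        # (18, 22) (9, 11, 6)
```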

C. Embedding the Watermark

A four-level DT CWT is applied to every video frame. For each frame, the watermark is embedded in the coefficients of level 3 and level 4, since coefficients at finer levels are not robust to compression. Please note that the number of coefficients in level 1 of the watermark transform is equal to the number of coefficients in level 4 of the frame's transform. This means that the data will be embedded several times and in different places. The embedding strength is decided based on information from the frame's coefficients of level 2. The embedding algorithm is now described in detail.

1) Perceptual Masks: Robustness to compression is increased when the watermark is embedded in the frame's coefficients that are located in the highest levels of the DT CWT transform. This process, however, might significantly decrease the content's fidelity, since the human visual system is very susceptible to changes in the low frequencies. Better results can be achieved by the prior examination of the content (i.e., before making any decisions on how strong the watermark embedding should be). This can be done by using perceptual masks. These masks provide information on how much the magnitude of the host coefficients can be altered without the watermark becoming visible. In our approach, information from level 2 of the DT CWT decomposition is used to create a rough representation of the magnitudes of the higher level coefficients (levels 3 and 4). This coarse information will be used to create the perceptual masks. A large coefficient can endure a more significant change than a small one. Thus, these perceptual masks provide an estimate of the strength that can be used to embed the watermark in every coefficient of levels 3 and 4. The elements in these masks will be used as weights during the embedding process. Since the watermark is not embedded in the coefficients of level 2, the masks can be retrieved at the decoder without losing any information (provided the video frames have not been distorted in any way).

Considering one frame at a time, a perceptual mask is created for each of the six subbands of level 3. In order to obtain these masks, we apply a low-pass filter h to the magnitudes of every level 2 subband and then downsample the resulting arrays by a factor of 2. The elements of these arrays are divided by a step value Δ and then rounded to the next higher integer value (this operation is represented by the symbol ⌈·⌉). The resulting arrays P_{3,d} have the same dimensions as the level 3 subbands. This process is described as

P_{3,d} = ⌈ ((h ∗ A_{2,d}) ↓2) / Δ ⌉,  for d = 1, …, 6    (2)

where ∗ denotes 2-D convolution, ↓2 denotes downsampling by a factor of 2 in each direction, and A_{2,d} is the 2-D array formed of the magnitudes of the complex elements of C_{2,d} as follows:

A_{2,d}[m,n] = |C_{2,d}[m,n]|,  m = 1, …, M/4,  n = 1, …, N/4.    (3)

The masks for the level 4 subbands are created in a similar way. The low-pass filter h is applied to the magnitudes of every level 2 subband and then the resulting arrays are downsampled by a factor of 2. The same process of low-pass filtering and downsampling is applied again to these arrays. The elements of the resulting arrays are divided by the step value Δ and then rounded to the next higher integer value. These new arrays P_{4,d} become the masks for the level 4 subbands and have the same dimensions as these subbands. The process is described in (4) and illustrated in Fig. 6

P_{4,d} = ⌈ ((h ∗ ((h ∗ A_{2,d}) ↓2)) ↓2) / Δ ⌉,  for d = 1, …, 6.    (4)
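The mask construction can be sketched as follows. The paper does not specify the low-pass kernel at this point, so a small uniform filter stands in for h, and the masks are clamped to a minimum of 1 to keep the later sketches numerically safe; both choices are assumptions.

```python
# Hedged sketch of the perceptual masks in (2) and (4): low-pass filter the
# level-2 subband magnitudes, downsample by 2, divide by the step Delta, and
# round up. Reused by the embedding and decoding sketches below.
import numpy as np
from scipy.ndimage import uniform_filter

def lp_down(a):
    """Low-pass filter a 2-D array and downsample it by 2 in each direction."""
    return uniform_filter(a, size=3)[::2, ::2]

def masks_from_level2(c2, delta=25.0):
    """c2: complex level-2 subbands, shape (M/4, N/4, 6). Returns (P3, P4)."""
    mag = np.abs(c2)
    p3 = np.stack([np.ceil(lp_down(mag[:, :, d]) / delta) for d in range(6)], axis=-1)
    p4 = np.stack([np.ceil(lp_down(lp_down(mag[:, :, d])) / delta) for d in range(6)], axis=-1)
    return np.maximum(p3, 1.0), np.maximum(p4, 1.0)
```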

2) Adding the Watermark: For each frame, the watermark's complex high-frequency coefficients W_{1,d} are added to the magnitudes of the coefficients of level 3 and level 4 (A_{3,d} and A_{4,d}, respectively).


Fig. 6. Construction of the masks for the level 3 and level 4 coefficients from the level 2 subbands.

Since the number of watermark coefficients is the same as the number of coefficients of level 4, the data are embedded in this level by weighting the watermark coefficients with the corresponding mask and multiplying by a scalar factor α; the resulting array is then added to the magnitudes of the frame's level 4 coefficients. In the case of level 3, the number of frame coefficients in every subband is four times the number of watermark coefficients in that subband, so each watermark coefficient is embedded at four locations: it is weighted by the corresponding mask elements, multiplied by the scalar α, and added to the magnitudes of the level 3 coefficients. Writing S(m,n) for the four level-3 locations that carry the watermark coefficient W_{1,d}[m,n], these operations are described next

A'_{4,d} = A_{4,d} + α (P_{4,d} ∘ W_{1,d}),  d = 1, …, 6    (5)

A'_{3,d}[p,q] = A_{3,d}[p,q] + α P_{3,d}[p,q] W_{1,d}[m,n],  (p,q) ∈ S(m,n),  d = 1, …, 6    (6)

where A_{3,d} and A_{4,d} are the 2-D arrays formed of the magnitudes of the complex elements of C_{3,d} and C_{4,d} as follows:

A_{3,d}[m,n] = |C_{3,d}[m,n]|,  m = 1, …, M/8,  n = 1, …, N/8    (7)

A_{4,d}[m,n] = |C_{4,d}[m,n]|,  m = 1, …, M/16,  n = 1, …, N/16    (8)

for d = 1, …, 6. M and N are the dimensions of the video frame in pixels.

Φ_{3,d} and Φ_{4,d} are 2-D arrays formed with the phases of the complex elements of C_{3,d} and C_{4,d} as follows:

Φ_{3,d}[m,n] = ∠C_{3,d}[m,n]    (9)

Φ_{4,d}[m,n] = ∠C_{4,d}[m,n]    (10)

for d = 1, …, 6.

The symbol ∘ denotes the element-wise matrix product, and the value α is a strength parameter that is greater than zero and is used to control the fidelity impact of the watermark.

Once A'_{3,d} and A'_{4,d} are obtained, they replace A_{3,d} and A_{4,d} (in combination with the original phases Φ_{3,d} and Φ_{4,d}) when computing the inverse DT CWT that provides the watermarked frame.
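The whole embedding stage can then be strung together as in the sketch below, which reuses make_watermark() and masks_from_level2() from the earlier sketches. For brevity it adds the mask-weighted watermark coefficients directly to the complex level-3 and level-4 coefficients rather than to their magnitudes as in (5) and (6), it fills the larger level-3 subbands by replicating each watermark coefficient over a 2 × 2 block of locations (one concrete choice of S(m,n)), and the default values of α and Δ follow the experimental section; all of these are assumptions of the sketch, not the authors' implementation.

```python
# Simplified embedding sketch (see the assumptions stated above).
import numpy as np
import dtcwt

def embed_frame(frame, key, alpha=15.0, delta=25.0):
    t = dtcwt.Transform2d()
    pyr = t.forward(frame.astype(float), nlevels=4)

    _, w_coeffs = make_watermark(key, frame.shape)           # (M/16, N/16, 6)
    p3, p4 = masks_from_level2(pyr.highpasses[1], delta)     # perceptual masks

    # each watermark coefficient is copied onto a 2x2 block of the level-3 grid
    w_rep = np.repeat(np.repeat(w_coeffs, 2, axis=0), 2, axis=1)

    pyr.highpasses[2][:] += alpha * p3 * w_rep               # level 3
    pyr.highpasses[3][:] += alpha * p4 * w_coeffs            # level 4
    return t.inverse(pyr)                                    # watermarked frame
```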

D. Decoding the Watermark

The decoding process is blind, that is, the watermark is decoded without relying on any information from the original video file. Essentially, the decoder performs the inverse operations of the encoder. For each frame of the watermarked video sequence, the four-level DT CWT is applied. The masks for levels 3 and 4 are obtained via (2) and (4), respectively. The compensation arrays Q_{3,d} and Q_{4,d} are then obtained in the following way:

Q_{s,d}[m,n] = 1 / P_{s,d}[m,n]    (11)

for s = 3, 4, d = 1, …, 6, m = 1, …, M/2^s, and n = 1, …, N/2^s, where M and N are the dimensions of the video frame in pixels.

The watermarked level 3 and level 4 coefficients C'_{3,d} and C'_{4,d} are multiplied element-wise by these arrays in order to compensate for the different weights associated with every coefficient during the watermark embedding process

D_{s,d} = C'_{s,d} ∘ Q_{s,d},  for s = 3, 4 and d = 1, …, 6.    (12)

Next, Ŵ_1, the level-1 DT CWT representation of the decoded watermark Ŵ, is obtained. Since the low-pass component was not encoded in the watermarked video sequence, Ŵ_0 is considered to be an array of zeros. However, the six subbands with the details can be estimated as follows:

Ŵ_{1,d}[m,n] = D_{4,d}[m,n] + Σ_{(p,q) ∈ S(m,n)} D_{3,d}[p,q]    (13)


for d = 1, …, 6, m = 1, …, M/16, and n = 1, …, N/16, where M and N are the dimensions of the video frame in pixels and S(m,n) is, as before, the set of four level-3 locations that carry the watermark coefficient at [m,n].

The inverse DT CWT is applied to Ŵ_1 and the resulting 2-D array ŵ is correlated with w*, an array that is obtained by applying the level-1 DT CWT to the original watermark W (which can be generated via the same process that was used at the encoder), discarding the low-pass component, and then applying the inverse DT CWT. The normalized correlation between ŵ and w* is computed for every frame and the resulting values are added until a certain number of frames is reached (100 or more is recommended). When a watermark is decoded from every frame, the accumulated normalized correlation will be a relatively high number when compared to the value obtained after an unwatermarked video sequence has gone through the decoder. By looking at the resulting value, a decision can be made at the decoder as to whether the video sequence that has gone through the decoding process has a watermark embedded.
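A matching blind detector, mirroring the simplifications of the embedding sketch above (division by the masks as in (11)–(12), folding of the four level-3 copies onto the level-4 grid as in (13), and a spatial-domain normalized correlation against the key-derived reference), can be sketched as follows. Accumulating the returned value over 100 or more frames before thresholding is left to the caller.

```python
# Simplified blind detection sketch; reuses make_watermark() and
# masks_from_level2() from the earlier sketches.
import numpy as np
import dtcwt

def detect_frame(frame, key, delta=25.0):
    t = dtcwt.Transform2d()
    pyr = t.forward(frame.astype(float), nlevels=4)
    p3, p4 = masks_from_level2(pyr.highpasses[1], delta)

    c3 = pyr.highpasses[2] / p3            # compensate the embedding weights
    c4 = pyr.highpasses[3] / p4

    folded = c4.copy()                     # add the four level-3 copies, as in (13)
    for dm in (0, 1):
        for dn in (0, 1):
            folded = folded + c3[dm::2, dn::2, :]

    w, _ = make_watermark(key, frame.shape)
    est_pyr = t.forward(np.zeros_like(w), nlevels=1)
    est_pyr.highpasses[0][:] = folded      # low-pass component stays zero
    est = t.inverse(est_pyr)               # spatial-domain watermark estimate

    ref_pyr = t.forward(w, nlevels=1)      # reference w*: details of W only
    ref_pyr.lowpass[:] = 0.0
    ref = t.inverse(ref_pyr)

    return float(np.sum(est * ref) / (np.linalg.norm(est) * np.linalg.norm(ref)))

# A sequence is declared watermarked when the sum of detect_frame() over ~300
# frames exceeds the threshold tau derived from the target false-positive rate.
```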

III. EXPERIMENTAL RESULTS

To test the proposed watermarking method, we employed ten QCIF (176 × 144) video sequences. Each consists of 300 frames. Five of them were the standard video files Container, Hall Monitor, Mother and Daughter, News, and Suzie. The other five were formed using different short frame sequences from these standard video files. The formed sequences had a change of scene every 30 frames. Five different watermarked video sequences were created for each of the ten sequences by using a different key each time. A different value of the time-varying key component was used every 15 frames. The watermarks were embedded in the luminance components. Tests were performed on each of the 10 × 5 = 50 sequences. For each sequence, 300 frames were used to determine the strength of the watermark. For our tests, the strength parameter α was set to 15 so that the average peak signal-to-noise ratio (PSNR) of the watermarked frames was 41 dB. Computer experiments showed that setting the step value Δ to 25 yielded the best results.

In order to study the performance of our method, we compared our results against two algorithms that employ the regular DWT. These two methods have the advantage over others that they are blind and computationally less demanding; thus, the original content is not needed to retrieve the watermark and the frames do not need to be geometrically restored before detecting the watermark. The first method we use as a reference is basically the same algorithm as proposed in this paper except that the DWT replaces the DT CWT. We will refer to this method as DWT1. The second method is the one presented in [9], which is also based on the DWT. In this method, which we denote as DWT2, video frames are watermarked by replacing the values of some coefficients of the frames with the highest or lowest values of the neighboring coefficients (depending on whether a 0 or a 1 is being embedded). The average PSNR of the sequences watermarked with these methods was also set to 41 dB.

The watermark decoder measures the normalized correlation between the transform coefficients of every frame and the coefficients of the watermark. The measured normalized correlation for a particular frame will be very small but, over several frames (300 for these tests), the accumulation of the correlations will provide an indication as to whether the video sequence is watermarked.

Fig. 7. Probability of false positives for different values of the detection threshold τ. (a) The case corresponding to DT CWT and DWT1. (b) The case corresponding to DWT2. In both cases, the value of τ is chosen so that P_fp is equal to the target false-positive probability.

We establish the same target probability of false positives P_fp for all three methods. This probability can be estimated by treating the distribution of the normalized correlation as a Gaussian with standard deviation σ = 1/√n, where n is the size of the watermark array. As described in [17], for relatively low thresholds, the Gaussian method is an accurate approximation of the actual P_fp, which is computed using the following equation:

P_fp ≈ (1/2) erfc( τ / (σ√2) )    (14)

where τ is the detection threshold, a positive number smaller than 1. Anytime the normalized correlation exceeds this threshold, the decoder assumes that a watermark is present and, thus, the content is not displayed.

For our method and DWT1, the watermark arrays are 64 times smaller than the size of each frame. If we consider that 300 frames of every QCIF (176 × 144) sequence are examined before reaching a decision, then, for these two methods, n = (176 × 144/64) × 300 = 118,800. In the case of DWT2, the watermark vector always has six elements per frame, which makes n = 6 × 300 = 1800. Also, since we have set P_fp to the same value for all three watermarking schemes, the detection threshold τ must be equal to 0.01 for both DT CWT and DWT1. For the case of DWT2, τ is equal to 0.0815. This can be seen in Fig. 7.
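For concreteness, the Gaussian estimate and the threshold settings above can be checked numerically with the short sketch below; the closed form used is my reconstruction of (14), so the absolute value it prints is only illustrative, but it shows that the two (τ, n) pairs correspond to roughly the same false-positive probability.

```python
# Gaussian false-positive estimate of (14): under the no-watermark hypothesis
# the normalized correlation is modelled as zero-mean Gaussian with sigma = 1/sqrt(n).
from math import erfc, sqrt

def prob_false_positive(tau, n):
    sigma = 1.0 / sqrt(n)
    return 0.5 * erfc(tau / (sigma * sqrt(2.0)))

n_cwt = (176 * 144 // 64) * 300      # DT CWT and DWT1: 396 elements x 300 frames
n_dwt2 = 6 * 300                     # DWT2: 6 elements x 300 frames
print(prob_false_positive(0.01, n_cwt))     # threshold used for DT CWT and DWT1
print(prob_false_positive(0.0815, n_dwt2))  # threshold used for DWT2 (comparable value)
```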

For each method, the normalized correlation was computed for watermarked sequences that had not gone through any type of distortion. The three watermarking schemes were able to correctly detect all of the watermarks. The results obtained using the three methods are shown in Table I.

We then tested the robustness of our method to common distortions. In one experiment, watermarks were decoded after the video sequences had gone through some scaling and cropping distortions. For the second test, the video sequences were rotated by a few degrees and the watermark was later decoded. We also tested the effects of lossy compression.


TABLE I
COMPARISON OF NORMALIZED CORRELATION VALUES FOR THREE WATERMARKING METHODS: DT CWT, DWT1, AND DWT2. WATERMARKED SEQUENCES ARE SUBJECTED TO SCALING (BY 5%, 10%, AND 15%), ROTATION (BY 3°, 6°, AND 9°), CROPPING, H.264 COMPRESSION WITH A QP OF 15, AND A JOINT ATTACK THAT INVOLVES SCALING (BY 5%), ROTATION (BY 6°), CROPPING, AND COMPRESSION

Finally, all of these distortions (scaling, rotation, cropping, and lossy compression) were put together as a joint attack.

A. Frame Scaling and Cropping

We examined the robustness of all three methods when the watermarked sequences were subjected to scaling and cropping. Every video sequence was scaled up by 5%, 10%, and 15% using bicubic interpolation. The frames were later cropped to fit their original size (176 × 144). A visual example of this process can be seen in Fig. 8. Results are summarized in Table I.
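This attack is straightforward to reproduce; in the sketch below, scipy's cubic-spline zoom stands in for bicubic interpolation (an approximation), followed by a centre crop back to the original frame size.

```python
# Sketch of the scale-and-crop attack of Section III-A (not the exact tool
# chain used in the paper).
from scipy.ndimage import zoom

def scale_and_crop(frame, factor):
    big = zoom(frame, factor, order=3)              # cubic interpolation
    r0 = (big.shape[0] - frame.shape[0]) // 2
    c0 = (big.shape[1] - frame.shape[1]) // 2
    return big[r0:r0 + frame.shape[0], c0:c0 + frame.shape[1]]

# e.g. attacked = scale_and_crop(watermarked_frame, 1.05)   # 5% upscaling
```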

From these results, we notice that DT CWT is able to withstand a scaling and cropping attack, particularly for scales of 5% and 10%. DWT2, however, performs better than the other schemes for this type of attack.

B. Frame Rotation

Robustness to frame rotation was then tested. Each frame was rotated counterclockwise by 3°, 6°, and 9°. Bilinear interpolation was employed and the resulting images were cropped to fit the QCIF format. An example of this attack can be seen in Fig. 9. Table I shows comparisons of the performance of the three watermarking methods under this particular type of distortion.
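The rotation attack can be approximated in the same spirit; scipy's rotate with bilinear interpolation and reshape=False keeps the original frame size, which combines the rotation and the crop in one call.

```python
# Sketch of the rotate-and-crop attack of Section III-B (an approximation of
# the paper's processing, not a reproduction).
from scipy.ndimage import rotate

def rotate_and_crop(frame, degrees):
    # positive angles rotate counterclockwise; order=1 selects bilinear interpolation
    return rotate(frame, degrees, reshape=False, order=1)

# e.g. attacked = rotate_and_crop(watermarked_frame, 6.0)
```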

It can be observed that our proposed scheme, DT CWT, is more robust to rotation than the other two methods. Although DWT1 is able to decode 100% of the watermarks when the frames are rotated by 3°, the scheme can only recover 30% of the watermarks once the rotation has increased to 6°. DWT2 offers very poor performance for this particular type of distortion.

C. Compression

In order to test the robustness of the proposed scheme to compression, we encoded the video sequences using H.264/AVC. Every 15th frame was set to be an I-frame and the rest were chosen to be P-frames. The quantization parameter (QP) for both I- and P-frames was set to 15, which results in a compression ratio of around 40:1.

Fig. 8. A watermarked frame of the sequence Suzie is scaled and then cropped: (a) 5%, (b) 10%, and (c) 15% scaling.


Fig. 9. A watermarked frame of the sequence Suzie is rotated and then cropped: (a) 3°, (b) 6°, and (c) 9° rotation.

In this instance, the three watermarking methods demonstrated robustness to compression, since all of the watermarks were decoded. The results are summarized in Table I and an example of a compressed frame can be seen in Fig. 10(a).
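A comparable compression attack can be approximated with a common encoder. The sketch below shells out to ffmpeg's libx264 with a constant QP of 15 and a 15-frame GOP; this mimics, but does not reproduce exactly, the H.264/AVC settings used in the paper, and the raw-YUV file handling and names are assumptions.

```python
# Approximate H.264 round trip at QP 15 with a 15-frame GOP, using ffmpeg.
import subprocess

def h264_roundtrip(src_yuv, dst_yuv, width=176, height=144):
    subprocess.run(
        ["ffmpeg", "-y", "-s", f"{width}x{height}", "-pix_fmt", "yuv420p",
         "-f", "rawvideo", "-i", src_yuv,
         "-c:v", "libx264", "-qp", "15", "-g", "15", "compressed.mp4"],
        check=True)
    subprocess.run(
        ["ffmpeg", "-y", "-i", "compressed.mp4",
         "-f", "rawvideo", "-pix_fmt", "yuv420p", dst_yuv],
        check=True)
```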

D. Joint Attack

The final experiment involved all of the previous attacks together. For this joint attack, we scaled the video frames by 5% and rotated them by 5°. The frames were later cropped to fit their original size (176 × 144) and H.264/AVC was used to compress the video sequences (with the same compression ratio as before). An example of a video frame that has gone through this joint attack can be seen in Fig. 10(b). The performance of the three methods can be reviewed in Table I.

Results for the DT CWT indicate that the method can successfully survive a joint attack. Ninety-two percent of the watermarks were detected even though the video sequences had gone through scaling, rotation, cropping, and compression. DWT1 and DWT2 are not able to withstand a joint attack. Only 20% of the watermarks were detected using DWT1. When DWT2 was employed, only 4% of the watermarks were recovered.

Fig. 10. A watermarked frame of the sequence Suzie is (a) compressed with H.264 (QP of 15) and (b) subjected to a joint attack (compression, scaling up by 5%, rotation by 5°, and cropping).

IV. CONCLUSION

A new video watermarking algorithm for playback control that takes advantage of the properties of the DT CWT is introduced. This transform maintains the advantages but avoids the shortcomings of regular wavelets. The DT CWT provides important features, such as perfect reconstruction, shift invariance, and good directional selectivity. Our method relies on these characteristics to create a watermark that offers some robustness to geometric distortions.

The watermark was embedded using information from the source content in order to keep distortion to a minimum (41 dB). The robustness of our method was tested against several attacks, which included lossy compression, rotation, scaling, cropping, and a joint attack which involved a combination of all previous distortions. The joint attack was employed to simulate a video sequence that has been recorded from a movie screen with a handheld camcorder and then stored in a digital form. Our method successfully detected the presence of the watermarks in 92% of the corrupted video sequences.

In order to compare the performance of our scheme and evaluate the advantages of using the DT CWT as a watermarking tool, we subjected two DWT-based watermarking algorithms to the same fidelity standards and the same attacks. Although these methods were robust to compression, they did not survive the joint attack.

Our proposed method is simple to implement; this is important when considering the additional cost and complexity to DVD players. Furthermore, it is robust to lossy compression and some geometric distortions. All of these characteristics make our algorithm suitable for the playback control of digital video.


ACKNOWLEDGMENT

The authors would like to thank Dr. N. Kingsbury for providing the software to perform the DT CWT transform operations.

REFERENCES

[1] Mot. Picture Assoc. Amer., 2007. [Online]. Available: http://www.mpaa.org/piracy.asp.

[2] P. B. Schneck, “Persistent access control to prevent piracy of digital information,” Proc. IEEE, vol. 87, no. 7, pp. 1239–1249, Jul. 1999.

[3] I. J. Cox, M. L. Miller, and J. A. Bloom, Digital Watermarking. San Francisco, CA: Morgan Kaufmann, 2002.

[4] J. J. K. O’Ruanaidh and T. Pun, “Rotation, scale and translation invariant digital image watermarking,” in Proc. Int. Conf. Image Processing, 1997, pp. 536–539.

[5] C.-Y. Lin, M. Wu, J. A. Bloom, I. J. Cox, M. L. Miller, and Y. M. Lui, “Rotation, scale, and translation resilient watermarking for images,” IEEE Trans. Image Process., vol. 10, no. 5, pp. 767–782, May 2001.

[6] C. V. Serdean, M. A. Ambroze, M. Tomlinson, and J. G. Wade, “DWT-based high-capacity blind video watermarking, invariant to geometrical attacks,” Proc. Inst. Elect. Eng., Vis., Image Signal Process., vol. 150, pp. 51–58, Feb. 2003.

[7] P. Bas, J. M. Chassery, and B. Macq, “Geometrically invariant watermarking using feature points,” IEEE Trans. Image Process., vol. 11, no. 9, pp. 1014–1028, Sep. 2002.

[8] Z. Dawei, C. Guanrong, and L. Wenbo, “A chaos-based robust wavelet-domain watermarking algorithm,” Chaos, Solitons Fractals, vol. 22, pp. 47–54, 2004.

[9] P. W. Chan, M. R. Lyu, and R. T. Chin, “A novel scheme for hybrid digital video watermarking: Approach, evaluation and experimentation,” IEEE Trans. Circuits Syst. Video Technol., vol. 15, no. 12, pp. 1638–1649, Dec. 2005.

[10] N. Kingsbury, “Image processing with complex wavelets,” Philos. Trans. Math., Phys., Eng. Sci., vol. 357, p. 2543, 1999.

[11] P. Loo and N. Kingsbury, “Digital watermarking using complex wavelets,” in Proc. Int. Conf. Image Processing, 2000, pp. 29–32.

[12] P. Loo and N. Kingsbury, “Digital watermarking with complex wavelets,” in Proc. Inst. Elect. Eng. Seminar Secure Images Image Authentication, 2000, pp. 10/1–10/7.

[13] N. Terzija and W. Geisselhardt, “Digital image watermarking using complex wavelet transform,” in Proc. Workshop Multimedia Security, 2004, pp. 193–198.

[14] M. Pickering, L. E. Coria, and P. Nasiopoulos, “A novel blind video watermarking scheme for access control using complex wavelets,” in Proc. Int. Conf. Consum. Electron., 2007, pp. 1–2.

[15] N. Kingsbury, “The dual-tree complex wavelet transform: A new technique for shift invariance and directional filters,” presented at the IEEE DSP Workshop, Bryce Canyon, UT, 1998, Paper no. 86.

[16] E. T. Lin and E. J. Delp, “Temporal synchronization in video watermarking,” IEEE Trans. Signal Process., vol. 52, no. 10, pp. 3007–3022, Oct. 2004.

[17] M. L. Miller and J. A. Bloom, “Computing the probability of false watermark detection,” in Proc. 3rd Int. Workshop Information Hiding, 1999, pp. 146–158.

Lino E. Coria (M’08) was born in Morelia, Mexico. He received the Bachelor’s degree in electronics engineering from the Instituto Tecnológico de Morelia, Morelia, Mich., Mexico, in 1996, the M.Sc. degree in electrical engineering from McMaster University, Hamilton, ON, Canada, in 1998, and the Ph.D. degree in electrical and computer engineering from the University of British Columbia, Vancouver, BC, Canada, in 2008.

Currently, he is a Professor in the Department of Electronics, Systems and Informatics with the Instituto Tecnológico y de Estudios Superiores de Occidente (ITESO), Tlaquepaque, Jalisco, Mexico. His interests include image and video processing and his research has mainly focused on digital watermarking.

Dr. Coria was awarded two scholarships for his graduate studies (1996–1998 and 2003–2007) by the Mexican Council for Science and Technology (CONACYT). He is a member of ACM.

Mark R. Pickering (M’95) was born in Biloela, Australia, in 1966. He received the B.Eng. degree from Capricornia Institute of Advanced Education, Rockhampton, Australia, in 1988 and the M.Eng. and Ph.D. degrees in electrical engineering from the University of New South Wales, Sydney, Australia, in 1991 and 1995, respectively.

Currently, he is a Senior Lecturer in the School of Information Technology and Electrical Engineering at the University College, University of New South Wales, Australian Defense Force Academy, Canberra, Australia. His research interests include video and audio coding, medical imaging, data compression, information security, data networks, and error-resilient data transmission.

Panos Nasiopoulos (M’95) received the Bachelor’s degree in physics from the Aristotle University of Thessaloniki, Thessaloniki, Greece, and the B.Sc., M.Sc., and Ph.D. degrees in electrical and computer engineering from the University of British Columbia (UBC), Vancouver, BC, Canada.

Currently, he is an Associate Professor in the Department of Electrical and Computer Engineering with the UBC, the holder of the Professorship in Digital Multimedia, and the Director of the Master of Software Systems Program at UBC. Before joining UBC, he was the President of Daikin Comtec US and the Executive Vice President of Sonic Solutions. He was voted as one of the most influential DVD executives in the world. He is recognized as a leading authority on DVD and multimedia and has published numerous papers on the subjects of digital video compression and communications.

Dr. Nasiopoulos has been an active member of ACM, the Standards Council of Canada (ISO/ITU and MPEG), and the In-Flight-Entertainment committee. He has organized and chaired numerous conferences and seminars and is a featured speaker at multimedia/DVD conferences worldwide.

Rabab Kreidieh Ward (F’99) was born in Beirut, Lebanon. She received the Bachelor’s degree in electrical engineering from the University of Cairo, Cairo, Egypt, in 1966 and the M.Sc. and Ph.D. degrees from the University of California, Berkeley, in 1969 and 1972, respectively.

She has made significant research contributions in digital signal processing and its applications to cable TV, high-definition TV, video compression, and medical images, including mammography, microscopy, and cell images. Her research ideas have been transferred to industry.

Dr. Ward is the recipient of the “Society Award” of the IEEE Signal Processing Society, the R.A. McLachlan Memorial Award of the Association of Professional Engineers and Geoscientists of British Columbia, and the UBC Killam Research Prize. She is a Fellow of the Royal Society of Canada, the Canadian Academy of Engineers, and the Engineering Institute of Canada. She was the Chair of the IEEE International Conference on Image Processing 2000 and IEEE ISSPIT, and was the Vice Chair of IEEE ISCAS 2004.