
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 17, NO. x, mon. 2007.

Robust Transmission of SPIHT-Coded Images Over Packet Networks

Shih-Hsuan Yang, Senior Member, IEEE, and Po-Feng Cheng

Abstract—A novel hybrid error-resilience and error-concealment technique for embedded wavelet coders is presented. Aimed at resolving data loss in real-time visual transmission over packet erasure channels, the proposed method incorporates data partitioning and multiple description coding (MDC) into the SPIHT encoding process. Each of the spatial-orientation trees of SPIHT is independently coded and packetized with multiple descriptions of important wavelet coefficients. At decoding, the coefficients that cannot be recovered are predicted through linear interpolation. The estimation is based on either intraband or interband correlation among wavelet coefficients. Experimental results show that the proposed method achieves good and stable error performance with low additional redundancy.

Index Terms—Error concealment, multiple description coding, robust video coding, SPIHT.

I. INTRODUCTION

Multimedia information proliferating on the Internet and 3G communication networks is paving the way for universal multimedia access (UMA). One important issue for UMA is to develop efficient robustness schemes for compressed multimedia. The most successful image coders are transform based. The discrete cosine transform (DCT) and the discrete wavelet transform (DWT) are the common transforms for image coding. For example, the classic baseline JPEG coder employs the DCT to generate energy-compacted spectral components. Shapiro’s embedded zerotree wavelet (EZW) coder [1] made a major breakthrough under the transform-coding framework. EZW treats a set of insignificant wavelet coefficients corresponding to the same orientation and spatial location as a single zerotree symbol. It generates highly compressed and scalable bitstreams, and is thus well suited to multimedia applications. Of the various improvements of EZW, the set partitioning in hierarchical trees (SPIHT) coding [2] offers even better compression performance and an elegant implementation, and has become the yardstick for new image-coding algorithms.

Manuscript received April 1, 2004. This work was supported by the National Science Council, R.O.China, under Grant NSC 92-2213-E-027-033.

Shih-Hsuan Yang is with the National Taipei University of Technology, Taipei, Taiwan (phone: +886-2-27712171 ext. 4211; fax: +886-2-87732945; e-mail: [email protected]).

Po-Feng Cheng was with the National Taipei University of Technology, Taipei, Taiwan. He is now with the Industrial Technology Research Institute, Taiwan (e-mail: [email protected]).

The success of SPIHT in compression is attributed to the formation of zerotrees and the use of variable-length codes. However, the coded images have fragile visual quality when transmitted across an unreliable link because erroneous data usually causes severe error propagation [3]. For visual communication with no feedback mechanism, error control can be performed in three ways: 1) forward error correction (FEC), 2) error-resilient coding at encoders, and 3) error concealment at decoders. Many of these techniques have been tailored to protect SPIHT-coded images. Sherwood and Zeger [4] proposed a concatenated FEC scheme with a rate-compatible punctured convolutional (RCPC) code as the inner code and a cyclic redundancy check (CRC) code as the outer code. Man et al. [5] modified the SPIHT encoding procedure to generate fixed-length data segments in conjunction with RCPC codes for unequal error protection. Creusere [6] divided the wavelet coefficients into disjoint trees for error isolation and interleaved the generated bit stream to produce scalable data. Yang and Cheng [7] proposed an error-resilient scheme that partitions the coded data sequence and adds appropriate side information. Kim et al. [8] proposed separating low-frequency subbands from high-frequency subbands and applied adaptive packetization for better error resilience. Multiple description coding (MDC) schemes for SPIHT were proposed in [9]-[11].

The Internet and wireless networks offer ubiquitous channels for visual communication. The transport environment of these networks, however, is not always reliable [12]. Under an IP/UDP/RTP real-time transmission structure, packet loss may frequently occur during peak time because of network congestion (buffer full). In the case of wireless communication where FEC is typically used at the lower layers of the system, severe channel impairment such as deep fade may cause erasures of data frames. In this paper, we consider a new error-control framework for SPIHT-coded images sent across a memoryless bit-error-free but packet-lossy link. The overall system is depicted in Fig. 1, where the new functions added to original SPIHT coding, namely MDC and packetization at the encoder side and error concealment at the decoder side, are highlighted in shadowed boxes. Important wavelet coefficients are duplicated, and packetization is achieved by independently coding each of the directional wavelet branches. When a data packet is lost, its damage is restricted to one directional component of a local area, and the essential information of the packet may be recovered from correctly received packets. The other distinguishing feature of the proposed scheme is the incorporation of error concealment into the system. Intra-subband or inter-subband correlation is exploited to predict the coefficients in the largest-scale detail subband. Simulation results are given, and the analyses are presented along with various factors relevant to the system’s performance.

Copyright (c) 2007 IEEE. Personal use of this material is permitted. However, permission to use this material for any other purposes must be obtained from the IEEE by sending an email to [email protected].

The rest of this paper is organized as follows. The MDC and packetization strategies are described in Section II. Reconstruction methods of the lost coefficients through linear estimation are introduced in Section III. The simulation results are given in Section IV, followed by the concluding remarks.

Fig. 1. Framework of the proposed method. [Block diagram. Encoder: wavelet transform, SPIHT quantization, MDC & packetization. Channel: packet-loss channel. Decoder: SPIHT inverse quantization, error concealment, inverse wavelet transform.]

II. PACKETIZATION AND MDC FOR ERROR-RESILIENT SPIHT

A. Review of the SPIHT Coding

SPIHT employs a pyramidal DWT to generate energy-compacted coefficients. The Daubechies (9,7) [13] and the LeGall 5/3 filters [14] are considered in this paper because of their excellent performance [15]. A two-level decomposition of the Lena image is shown in Fig. 2(a), which manifests the properties of self-similarity and energy compaction across multiresolution scales. A spatial-orientation tree, shown in Fig. 2(b), is thus defined; it groups the wavelet coefficients corresponding to the same area and orientation (horizontal, vertical, or diagonal). Except for the isolated nodes in the DC (largest-scale approximation) subband, each of the non-terminal nodes has exactly four children. To meet the required rate constraint, the resulting DWT coefficients are further quantized and entropy coded. The overall SPIHT encoding process is depicted in Fig. 3. After a p-level wavelet decomposition (p = 5 in this paper), a two-pass scalar deadzone quantization is performed iteratively on the spatial-orientation trees. The first pass, the sorting pass, determines the “significance map”: it identifies the coefficients that are significant with respect to a threshold and records their signs. Insignificant spatial-orientation trees (and subtrees) and isolated insignificant coefficients are recorded as well. The second pass, the refinement pass, gives an additional bit of precision to the significant coefficients. Arithmetic coding can be applied to the quantization symbol stream for further data compaction. However, it is observed [2] that the added entropy coding provides limited improvement in rate reduction while introducing intensive computation and possible error propagation. We have thus dispensed with entropy coding in this paper.
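The two quantization passes can be illustrated for individual coefficients. The following is a minimal sketch, assuming integer-valued coefficients (at least one nonzero) and the usual power-of-two thresholds of bit-plane coding; the function names are hypothetical, and the set-partitioning logic of SPIHT (its lists and zerotree tests) is deliberately left out.

```python
def initial_bitplane(coeffs):
    """Index n of the first threshold 2**n, set by the largest magnitude.

    Assumes integer coefficients with at least one nonzero value.
    """
    return max(abs(c) for c in coeffs).bit_length() - 1


def is_significant(c, n):
    """Sorting-pass test: is |c| at or above the threshold 2**n?"""
    return abs(c) >= (1 << n)


def refinement_bit(c, n):
    """Refinement-pass output: bit n of |c| for an already significant c."""
    return (abs(c) >> n) & 1


coeffs = [34, -20, 5]            # hypothetical integer wavelet coefficients
n = initial_bitplane(coeffs)     # the first threshold is 2**n
for c in coeffs:
    print(c, "significant at first threshold:", is_significant(c, n))
```

Each later iteration lowers n by one, so a coefficient that is insignificant at one threshold may become significant at the next.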

Fig. 2. (a) Wavelet transform (b) SPIHT’s spatial-orientation trees.

Fig. 3. SPIHT coding process. [Block diagram: original image, DWT, SPIHT quantization (sorting pass and refinement pass), entropy coding, bit streams.]

B. Data Partitioning of the SPIHT Bitstreams

Three categories of bits are generated during SPIHT quantization: the significance bits (whether an entry in the LIS, the list of insignificant sets, or the LIP, the list of insignificant pixels, becomes significant), the sign bits, and the refinement bits (of significant coefficients). Because of the predictive nature of the coder and its use of variable-length coding, successful SPIHT decoding depends not only on the current bits but also on previous significance bits. A single significance-bit error can cause catastrophic error propagation. Adequate data-partitioning and packetization schemes that isolate the damage of a packet loss are thus required for visual applications. The error-resilient data partitioning aims to minimize the perceptual degradation in a packet-loss environment.

Packetization can be classified into two types, fixed-size and variable-size. The packetizable zerotree wavelet (PZW) packetization scheme [16] and its optimized version (optimal packetization, OP [17]) group the SPIHT-generated bitstreams into fixed 48-byte payloads for ATM networks. In contrast, the CZWAP algorithm proposed in [8] adaptively groups wavelet trees into variable-size packets. Variable-size packet networks are not uncommon; for example, visual transmission on the Internet may take place over the UDP/IP/PPP protocol stack with variable-size packets [12]. The number of bits needed to encode a SPIHT spatial-orientation tree is found to vary widely. We consider in this study a variable-size packetization scheme in which wavelet branches of three different orientations, horizontal (H), vertical (V), and diagonal (D), are independently encoded and packetized, as shown in Fig. 4(a). With a 5-level wavelet decomposition of 512×512 gray-scale images, 192 independent variable-size data packets are produced. Table 1 gives the packet-size statistics for five SPIHT coding rates. The two test images, Lena and Baboon, are shown in Fig. 5. The proposed variable-size data partitioning can be adapted to fixed-size packetization by suitably grouping the spatial-orientation trees. In that case, extra header information (indicating the starting tree and the number of trees in a packet), a randomized scanning order, and a bit-stuffing or truncation mechanism should be incorporated. We retain the basic variable-size data-partitioning scheme in this paper to show the MDC/interpolation gain obtained from the correlated wavelet coefficients.

TABLE I
PACKET SIZE (IN BITS) FOR LENA AND BABOON UNDER 5-LEVEL DECOMPOSITION WITH THE 9/7 FILTER

Image    Statistic   0.125 bpp   0.20 bpp   0.25 bpp   0.5 bpp   1.0 bpp
Lena     Average     171         273        341        683       1366
Lena     Max.        689         1260       1621       3105      4824
Lena     Min.        43          44         48         59        130
Baboon   Average     171         273        341        683       1366
Baboon   Max.        772         1356       1773       3354      4628
Baboon   Min.        39          39         39         49        115
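The packet count and the average packet sizes can be checked with a short calculation. The sketch below assumes, as Fig. 4(a) suggests, that every 2×2 group of DC coefficients yields one packet per orientation; the function names are hypothetical. Under that assumption the computed averages agree with the "Average" rows of Table I to within one bit.

```python
def packet_count(image_size, levels):
    """Number of packets for a square image, assuming one packet per
    orientation (H, V, D) for each 2x2 group of DC coefficients."""
    dc_side = image_size // (2 ** levels)   # side length of the DC subband
    groups = (dc_side // 2) ** 2            # 2x2 groups of DC coefficients
    return 3 * groups


def average_packet_bits(bpp, image_size, levels):
    """Average packet size: total coded bits spread over all packets."""
    return bpp * image_size ** 2 / packet_count(image_size, levels)


print(packet_count(512, 5))                 # 16x16 DC subband, 64 groups
for bpp in (0.125, 0.20, 0.25, 0.5, 1.0):
    print(bpp, round(average_packet_bits(bpp, 512, 5), 2))
```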

Fig. 4. (a) Data partitioning (b) MDC of the proposed method. [Diagram labels: isolated root I1; roots H1, V1, and D1 of the horizontal, vertical, and diagonal branches; descendants H2, V2, D2 and H3, V3, D3. In (b), each of the three branches is shown together with I1, H1, V1, and D1.]

Fig. 5. Test images: (a) Lena and (b) Baboon.

C. Multiple Description Coding (MDC)

A multiple description encoder produces a set of descriptions such that the decoder can provide a reasonable reconstruction from any subset of the descriptions. MDC usually provides more graceful degradation than conventional error-control techniques. When all packets are equally likely to be dropped and the drop rate varies, MDC is preferable to layered coding [18]. Miguel et al. [9] presented a multiple-description SPIHT (MD-SPIHT) scheme in which partially coded redundant trees were packed in multiple descriptions. A redundancy allocation algorithm was employed to find the optimal bit-rate assignment for each tree. Researchers have also proposed MDC schemes more sophisticated than naive duplication [10], [11], [19], [20].

Fig. 6 shows the damage caused by 10% packet losses under the proposed data-partitioning scheme, where all the lost coefficients are filled with 0. Note that each lost packet carries one of the three orientations (1/3 of the information) for 1/64 of the total image area. Among all the coded data, the DC (level 1) information is by far the most important. Occupying only (1/4)^5 < 0.1% of the total number of wavelet coefficients, this subband dominates the energy of an image: more than 95% of the energy resides in this subband for typical images. Moreover, without these DC coefficients as the roots, the decoding of the remaining spatial-orientation trees is left dangling. Strong protection of these coefficients is thus needed for guaranteed quality. The proposed MDC scheme is depicted in Fig. 4(b). The DC coefficients, including the isolated nodes, are duplicated and attached to the other two sibling spatial-orientation trees. As a consequence, three copies of the DC information are individually transmitted across the channel. To facilitate error concealment for the magnitudes of level-2 coefficients, the sign bits of level-2 coefficients are duplicated in another spatial-orientation tree. (Note that the signed coefficients are largely uncorrelated; only their magnitudes are correlated.) The sign bits of H2, D2, and V2 are duplicated in the packets of D2, V2, and H2, respectively. The proposed directional data-partitioning and MDC scheme isolates the damage caused by a lost packet and retains the essential information for interpolating the lost wavelet coefficients.
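The protection offered by this duplication can be sketched as follows. The example assumes that packets are lost independently with probability p (consistent with the memoryless link considered in this paper); the dictionary and function names are hypothetical. The DC information of a group is unavailable only if all three of its packets are lost, and the level-2 sign bits of one orientation are unavailable only if both the packet itself and the packet holding the duplicate are lost.

```python
# Where each orientation's level-2 sign bits are duplicated, as stated
# in the text: H2 -> D packet, D2 -> V packet, V2 -> H packet.
SIGN_BACKUP = {"H": "D", "D": "V", "V": "H"}


def dc_recoverable(received):
    """received: set of orientations in one group whose packets arrived.
    Every packet carries a copy of the DC information."""
    return len(received) > 0


def signs_recoverable(orientation, received):
    """Level-2 sign bits survive if the own packet or its backup arrived."""
    return orientation in received or SIGN_BACKUP[orientation] in received


def loss_probabilities(p):
    """Probability that the information is unavailable at the decoder,
    assuming independent packet losses with probability p."""
    return {"dc": p ** 3, "level2_signs": p ** 2}


print(loss_probabilities(0.1))
```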

TABLE II
REDUNDANCY (IN BITS AND %) OF THE PROPOSED MDC SCHEME

Image    Level     0.125 bpp      0.20 bpp       0.25 bpp       0.5 bpp        1.0 bpp
Lena     Level 1   4097 (12.5%)   4116 (7.85%)   4609 (7.03%)   5121 (3.91%)   5633 (2.15%)
Lena     Level 2   768 (2.34%)    768 (1.46%)    768 (1.17%)    768 (0.59%)    768 (0.29%)
Baboon   Level 1   3584 (10.9%)   3584 (6.84%)   3584 (5.47%)   4096 (3.13%)   4608 (1.76%)
Baboon   Level 2   768 (2.34%)    768 (1.46%)    768 (1.17%)    768 (0.59%)    768 (0.29%)

TABLE III
LOSSLESS PERFORMANCE IN PSNR WITHOUT AND WITH MDC

Image    Scheme        0.125 bpp   0.20 bpp   0.25 bpp   0.5 bpp   1.0 bpp
Lena     without MDC   30.52       32.60      33.57      36.73     39.92
Lena     with MDC      29.91       32.04      33.25      36.57     39.81
Baboon   without MDC   21.49       22.36      22.87      25.11     28.62
Baboon   with MDC      21.28       22.15      22.70      24.96     28.49

Fig. 6. Effects of 10% packet losses. (a) Lena (15.84 dB) and (b) Baboon (14.20 dB). Images are coded at 0.25 bpp.

The amount of redundancy of the proposed MDC scheme (with the 9/7 filter) is listed in Table 2. For typical SPIHT coding rates between 1/4 and 1/2 bpp, the total added redundancy relative to the total generated bits is less than 10%, and the redundancy of the (fixed-size) level-2 sign bits is even smaller. The MDC redundancy translates into the PSNR degradation in the lossless environment shown in Table 3. Compared with MD-SPIHT [9], which incurs a 0.60 dB loss at 0.5 bpp for Lena, the PSNR gap for the proposed method is only 0.16 dB. We further quantify the PSNR degradation due to MDC overhead against the fixed-size packetization schemes PZW [16] and OP [17]. The original embedded SPIHT coding produces a coded Lena of 32.60 dB at 0.20 bpp. With the same coding rate for unpacketized bit streams, PZW and OP render the Lena image at 32.19 dB and 32.20 dB, respectively, while the proposed MDC method achieves a comparable quality of 32.04 dB. We will show that this simple duplication MDC scheme, when used in conjunction with the error-concealment technique presented in the next section, provides good and stable error protection.
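The percentages in Table II can be reproduced from the bit counts. The sketch below assumes that the reference total is the nominal rate times the 512×512 pixel count, and that the 768 level-2 sign bits arise from 192 packets carrying four level-2 coefficients each, one sign bit per coefficient; the constant and function names are hypothetical.

```python
TOTAL_PIXELS = 512 * 512
RATES = [0.125, 0.20, 0.25, 0.5, 1.0]            # coding rates in bpp
LENA_LEVEL1_BITS = [4097, 4116, 4609, 5121, 5633]  # level-1 row of Table II
LEVEL2_SIGN_BITS = 192 * 4                       # packets x coefficients x 1 bit


def redundancy_percent(extra_bits, bpp):
    """Redundant bits as a percentage of the bits generated at rate bpp."""
    return 100.0 * extra_bits / (bpp * TOTAL_PIXELS)


for bpp, bits in zip(RATES, LENA_LEVEL1_BITS):
    level1 = redundancy_percent(bits, bpp)
    level2 = redundancy_percent(LEVEL2_SIGN_BITS, bpp)
    print(f"{bpp} bpp: level 1 {level1:.2f}%, level 2 {level2:.2f}%, "
          f"total {level1 + level2:.2f}%")
```

For Lena at 0.25 bpp, for instance, the two contributions sum to roughly 8.2%, in line with the statement that the total stays below 10% at typical rates.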

III. RECONSTRUCTION OF LOST WAVELET COEFFICIENTS

A. Linear Least-Square Estimation of Lost Coefficients

Error concealment relieves the visual degradation by interpolating the erroneous or lost data at the decoder. Reconstruction can be built from spatially or spectrally correlated samples [21]. Li and Orchard [22] presented a spatial-domain error-concealment technique based on linear least-square error (LLSE) estimation, where the lost blocks were predicted from the surrounding pixels. Rane et al. [23] employed a deterministic least-squares model for tiled wavelet-transformed images. The interpolation was performed with block classification to preserve the edges. Lebeau et al. [24] developed an LLSE estimation for the lost wavelet coefficients. With appropriate interleaving to isolate the damage, the reported spectral interpolation outperformed median filtering by up to 1 dB.

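Suppose that the estimate $\hat{Y}_n$ of a lost coefficient $Y_n$ is obtained from a linear combination of N observations $\mathbf{X}_n = \{X_{n-N}, X_{n-N+1}, \ldots, X_{n-1}\}$:

$$\hat{Y}_n = \sum_{k=1}^{N} a_k X_{n-k} \qquad (1)$$

To minimize the mean square error (MSE) of $\hat{Y}_n$, the optimal weighting vector $\mathbf{a} = [a_1, a_2, \ldots, a_N]$ is the solution to the Yule-Walker equation [25]

$$\mathbf{R}_{XX}\,\mathbf{a} = \mathbf{r}_{YX} = E[Y_n \mathbf{X}] \qquad (2)$$

where $\mathbf{R}_{XX}$ is the correlation matrix of $\mathbf{X}$. In practical implementations, an Mth-order approximation of $\mathbf{R}_{XX}$ and $\mathbf{r}_{YX}$ can be calculated from the empirical data as

$$\tilde{\mathbf{R}}_{XX} = \frac{1}{M}\mathbf{H}^{T}\mathbf{H}, \qquad \tilde{\mathbf{r}}_{YX} = \frac{1}{M}\mathbf{H}^{T}\mathbf{u} \qquad (3)$$

with

$$\mathbf{H} = \begin{bmatrix} X_{n-2} & X_{n-3} & \cdots & X_{n-1-N} \\ X_{n-3} & X_{n-4} & \cdots & X_{n-2-N} \\ \vdots & \vdots & \ddots & \vdots \\ X_{n-M-1} & X_{n-M-2} & \cdots & X_{n-M-N} \end{bmatrix}_{M \times N} \quad \text{and} \quad \mathbf{u} = \begin{bmatrix} Y_{n-1} \\ Y_{n-2} \\ \vdots \\ Y_{n-M} \end{bmatrix}_{M \times 1} \qquad (4)$$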

In (3) and (4), $Y_{n-1}, Y_{n-2}, \ldots, Y_{n-M}$ are M available samples, and the ith row of H represents the observations that are correlated with $Y_{n-i}$ in the same manner as $X_{n-1}, X_{n-2}, \ldots, X_{n-N}$ are correlated with $Y_n$.
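The estimation in (1)-(4) amounts to forming the empirical correlations and solving a small linear system. The following is a minimal pure-Python sketch under that reading; the function names and the numerical values are hypothetical illustrations, not data from the experiments. The 1/M factors cancel when solving for the weights but are kept to mirror (3).

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [[float(v) for v in A[i]] + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular correlation matrix")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def llse_weights(H, u):
    """Weights a solving R a = r with R = H^T H / M and r = H^T u / M."""
    M, N = len(H), len(H[0])
    R = [[sum(H[m][i] * H[m][j] for m in range(M)) / M for j in range(N)]
         for i in range(N)]
    r = [sum(H[m][i] * u[m] for m in range(M)) / M for i in range(N)]
    return solve_linear(R, r)


def predict(weights, observations):
    """Equation (1): weighted sum of the N observations."""
    return sum(a * x for a, x in zip(weights, observations))


# Hypothetical magnitudes with M = 4 training samples and N = 2 observations.
H = [[8.0, 4.0], [6.0, 5.0], [2.0, 4.25], [5.0, 12.0]]
u = [5.0, 4.25, 2.0625, 5.5]
a = llse_weights(H, u)
print(a, predict(a, [4.25, 5.5]))
```

In this constructed example every training sample satisfies u = 0.5·x1 + 0.25·x2 exactly, so the solver recovers those two weights; with real coefficients the fit is only approximate.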

In this paper, we develop a new DWT-domain statistical error-concealment scheme under the designed error-resilient coding framework. The introduced MDC has protected the level-1 coefficients and signs of level-2 coefficients. The presented interpolation method attempts to restore the magnitude of level-2 coefficients (H2, V2, and D2). Two general correlations among wavelet coefficients, intra-subband and inter-subband, are utilized for linear interpolation. As depicted in Fig. 7, the intraband correlation resides in a subband of the same level and orientation (e.g. V2) while the interband correlation takes the parent-child relationship (e.g. V1 and V2) into consideration. The remaining missing coefficients will be all filled with 0. Since higher-level coefficients have smaller magnitudes and little inter-coefficient correlation, sophisticated interpolation for these coefficients would be unnecessary.

Fig. 7. Correlation among wavelet coefficients [26].

B. Interpolation via Intraband Correlation

Although the wavelet transform aims to generate uncorrelated or even independent samples, nontrivial dependence exists among wavelet coefficients [26]-[29]. The identified correlation can be exploited for image compression, restoration, as well as interpolation, and the success of these applications relies on the accuracy of the correlation model.


Buccigrossi and Simoncelli [26] presented a conditional probability model to characterize the magnitudes of related wavelet coefficients. They found that the strongest correlation exists within intraband cousin nodes and interband parent-child nodes. They also showed that the marginal probability of wavelet coefficients within a subband follows a generalized Laplacian rather than a Gaussian distribution. Liu and Moulin [27] investigated the interscale and intrascale dependence between wavelet coefficients in terms of mutual information. Hemami and Gray [28] proposed directional filtering to interpolate the lost wavelet coefficients by intraband prediction. Deever and Hemami [29] analyzed the correlation among the signs of wavelet coefficients under a context-modeling scheme and applied it to efficient sign coding.

The intra-band estimation in our work is made from the intraband cousin nodes. One such example is depicted in Fig. 8. The first node in the I1 subband is indexed as (0,0). Recall that the proposed MDC scheme groups four sibling level-2 wavelet coefficients in a packet. Suppose that a packet containing four H2 wavelet coefficients {(4,22), (4,23), (5,22), (5,23)} is lost. Furthermore, assume that the sign bits of these coefficients are recovered from other successfully received packets. The magnitude of a missing coefficient Yn = (4,23) can be predicted rightward-down from top and left cousin nodes (N = 2) as

$$|\hat{Y}_n| = a_1 |X_{n-1}|\ \text{(top)} + a_2 |X_{n-2}|\ \text{(left)},$$

i.e.,

$$(4,23) = a_1\,(2,23) + a_2\,(4,21) \qquad (5)$$

(The absolute value is not indicated when there is no risk of confusion.) We take four surrounding intraband nodes {(2,21), (2,23), (2,25), (4,21)} (M = 4) to form the approximation. The optimal weighting vector $\mathbf{a} = [a_1, a_2]$ is solved from Eqs. (2)-(4) with

$$\mathbf{H} = \begin{bmatrix} (0,21) & (2,19) \\ (0,23) & (2,21) \\ (0,25) & (2,23) \\ (2,21) & (4,19) \end{bmatrix}_{4 \times 2} \quad \text{and} \quad \mathbf{u} = \begin{bmatrix} (2,21) \\ (2,23) \\ (2,25) \\ (4,21) \end{bmatrix}_{4 \times 1}.$$

Several issues relevant to intraband estimation are now explained. Instead of the directly adjacent (brother) nodes, the next-to-adjacent (cousin) nodes are used for prediction. The cousin nodes show larger correlation owing to the phase-shifting phenomenon [30]. Furthermore, the use of cousin nodes avoids a shortage of estimating samples: the left brother (4,22), for instance, is lost along with the coefficient (4,23). Moreover, equation (5) considers only one of many possible scanning orders within a subband, namely the rightward-down scan (also called the raster scan). We follow the linear merge strategy [22], which combines different scanning orientations to increase robustness. Estimates from four scanning orders (rightward-down, leftward-up, downward-right, and upward-left), corresponding to cases (1), (4), (5), and (8) of Fig. 5 in Ref. [22], respectively, are averaged with equal weight to give the final value.
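The indexing behind the rightward-down estimate can be sketched as follows. The coordinate pattern for the four training samples is inferred from the single example given for (4,23) and is an assumption, as are all function names; only the rightward-down scan is shown, the other three scans would use mirrored neighbor patterns, and missing nodes and subband boundaries are not handled.

```python
def cousin_training_coords(target):
    """Training samples u and their (top, left) cousins H for the
    rightward-down scan, following the pattern of the (4,23) example."""
    r, c = target
    samples = [(r - 2, c - 2), (r - 2, c), (r - 2, c + 2), (r, c - 2)]
    rows = [[(i - 2, j), (i, j - 2)] for (i, j) in samples]
    return rows, samples


def weights_2x2(H, u):
    """Solve the N = 2 normal equations H^T H a = H^T u by Cramer's rule
    (the common 1/M factor of both sides cancels)."""
    s11 = sum(h[0] * h[0] for h in H)
    s12 = sum(h[0] * h[1] for h in H)
    s22 = sum(h[1] * h[1] for h in H)
    b1 = sum(h[0] * y for h, y in zip(H, u))
    b2 = sum(h[1] * y for h, y in zip(H, u))
    det = s11 * s22 - s12 * s12
    if abs(det) < 1e-12:
        raise ValueError("singular normal equations")
    return (b1 * s22 - b2 * s12) / det, (b2 * s11 - b1 * s12) / det


def intraband_estimate(mag, target):
    """Magnitude estimate of `target` from its top and left cousins.
    mag: dict mapping (row, col) to an available coefficient magnitude."""
    rows, samples = cousin_training_coords(target)
    H = [[mag[p] for p in row] for row in rows]
    u = [mag[p] for p in samples]
    a1, a2 = weights_2x2(H, u)
    r, c = target
    return a1 * mag[(r - 2, c)] + a2 * mag[(r, c - 2)]


def merge_scans(estimates):
    """Linear merge: equal-weight average of the per-scan estimates."""
    return sum(estimates) / len(estimates)


rows, samples = cousin_training_coords((4, 23))
print(samples)
print(rows)
```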

Fig. 8. Intraband estimation. A lost coefficient (4,23) in the H2 subband is linearly interpolated from the cousin nodes (2,23) and (4,21).

C. Interpolation via Interband Correlation

In addition to intraband correlation, interband correlation has been verified in [26] (in terms of cumulative mutual information) and [27] (in terms of mutual information). This nontrivial correlation translates to a parent-child coding gain of approximately 0.40 dB for SPIHT [31]. The empirical correlation coefficients between the magnitudes of level-1 parent nodes and level-2 child nodes (parent-child dependence) were computed and found to be 0.311 and 0.286 for Lena and Baboon, respectively. In contrast, the intraband correlation coefficients among level-2 magnitudes (cousin dependence) are 0.239 and 0.271 for these two images. Furthermore, the proposed MDC scheme offers better protection for the parent nodes (i.e., DC coefficients) required for interband prediction. As a consequence, interband prediction potentially offers better interpolation performance than its intraband counterpart.

One example of the proposed interband estimation is depicted in Fig. 9. Again suppose that the four H2 wavelet coefficients {(4,22), (4,23), (5,22), (5,23)} are missing due to a packet loss. The missing coefficient Yn = (4,23) can be predicted from its parent node (4,7) (N = 1) as

$$|\hat{Y}_n| = a_1 |X_{n-1}|\ \text{(parent)},$$

i.e.,

$$(4,23) = a_1\,(4,7) \qquad (6)$$

In our experiment, we take 4 surrounding samples

$$\mathbf{H} = \begin{bmatrix} (2,5) \\ (2,7) \\ (2,9) \\ (4,5) \end{bmatrix}_{4 \times 1} \quad \text{and} \quad \mathbf{u} = \begin{bmatrix} (2,21) \\ (2,23) \\ (2,25) \\ (4,21) \end{bmatrix}_{4 \times 1}.$$

The same linear merge strategy as in the intraband case is applied to combine different scanning orientations. Note that the other lost coefficients such as (4,22) have the same H but a different u, and thus have different estimates.
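The interband estimate reduces to a one-parameter least-squares fit. The sketch below is restricted to the horizontal orientation and assumes a 16-column DC subband (512×512 image, 5 levels); the parent mapping is inferred from the (4,23) and (4,7) example rather than stated in general, the function names are hypothetical, and missing nodes are not handled.

```python
DC_SIDE = 16  # assumed width of the level-1 (DC) subband: 512 / 2**5


def parent_of_h2(coord):
    """H1 parent of an H2 coefficient, inferred from the worked example:
    the odd-column node of the 2x2 DC group at the same relative position."""
    r, c = coord
    rel = c - DC_SIDE                       # column inside the H2 subband
    return (r - r % 2, rel - rel % 2 + 1)


def interband_estimate(child_mag, parent_mag, target):
    """Estimate |target| as a1 * |parent|, with a1 fitted on the four
    surrounding samples used in the example (rightward-down scan)."""
    r, c = target
    samples = [(r - 2, c - 2), (r - 2, c), (r - 2, c + 2), (r, c - 2)]
    h = [parent_mag[parent_of_h2(s)] for s in samples]
    u = [child_mag[s] for s in samples]
    denom = sum(x * x for x in h)
    a1 = sum(x * y for x, y in zip(h, u)) / denom if denom > 0 else 0.0
    return a1 * parent_mag[parent_of_h2(target)]


print(parent_of_h2((4, 23)))                # parent used in Eq. (6)
print([parent_of_h2(s) for s in [(2, 21), (2, 23), (2, 25), (4, 21)]])
```

With N = 1 the Yule-Walker solution is simply the ratio of the cross-correlation to the parent energy, which is what the single division computes.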


Fig. 9. Interband estimation. A lost coefficient (4,23) in the H2 subband is linearly interpolated by its parent node (4,7) in the H1 subband.

IV. EXPERIMENTAL RESULTS

A. Performance Evaluation: Overview

The benefits of the proposed MDC/error-concealment method were first verified against the fixed-size packetization schemes PZW [16] and OP [17]. PZW and OP, with a target bit rate of 0.20 bpp for the embedded SPIHT bitstream, resulted in an actual bit rate of 0.2095 bpp after packetization. The proposed schemes were also adjusted to the same unpacketized bit rate (0.20 bpp) for a fair comparison. In such a case, an extra 0.0059 (= 8*192/512/512) bpp is required for specifying the location of the 192 packets. The average number of lost bits and the PSNR results for the coded Lena image are shown in Table 4. With approximately the same number of lost information bits, the proposed method suffers little PSNR degradation (0.16 dB) from the MDC overhead when no packet is lost. Nevertheless, significant MDC/error-concealment gain is observed as the packet loss rate increases. In particular, our method outperforms the OP scheme by at least 0.42 dB for packet loss rates of 5% and above.

We next explore the individual benefit offered by the MDC, intraband interpolation, and interband interpolation. Six scenarios are investigated in the following (the last two rows in Table 4(b) correspond to Scenarios 5 and 6). The obtained PSNR values are averaged over 500 experiments to achieve statistical reliability and the lost packets are randomly selected with equal probability.

♦ Scenario 1: lossless (all lost packets are perfectly recovered).

♦ Scenario 2: unprotected (all lost coefficients are filled with 0). This scenario corresponds to the worst case.

♦ Scenario 3: No MDC, and the lost coefficients are estimated by the 4-neighbor average. This scenario corresponds to typical concealment results without MDC.

♦ Scenario 4: MDC + level-2 4-neighbor average.

♦ Scenario 5: MDC + level-2 intraband interpolation.

♦ Scenario 6: MDC + level-2 interband interpolation.
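The averaging procedure described above (random loss patterns, 500 experiments) can be sketched as follows. This is only an illustration under stated assumptions: the text does not say whether the number of lost packets is fixed per trial, so a fixed count of round(loss rate × 192) is assumed here, and `dummy_psnr` is a hypothetical stand-in for decoding the image and measuring its PSNR.

```python
import random

NUM_PACKETS = 192  # number of packets quoted in the text
NUM_TRIALS = 500   # number of experiments averaged in the text


def draw_lost_packets(loss_rate, rng, num_packets=NUM_PACKETS):
    """Pick lost packets uniformly at random (assumed: fixed count per trial)."""
    num_lost = round(loss_rate * num_packets)
    return set(rng.sample(range(num_packets), num_lost))


def average_psnr(evaluate_psnr, loss_rate, num_trials=NUM_TRIALS, seed=0):
    """Average a caller-supplied PSNR evaluation over random loss patterns."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_trials):
        total += evaluate_psnr(draw_lost_packets(loss_rate, rng))
    return total / num_trials


def dummy_psnr(lost_packets):
    """Hypothetical evaluator; a real one would decode without the lost packets."""
    return 30.0 - 0.1 * len(lost_packets)


print(average_psnr(dummy_psnr, loss_rate=0.10))
```

Seeding the generator makes a run repeatable, which is convenient when the same loss patterns should be applied to every scenario being compared.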

TABLE IV
PERFORMANCE COMPARISONS WITH PZW AND OP (ALL THE EXPERIMENTS HAVING UNPACKETIZED RATES OF 0.20 BPP FOR THE LENA IMAGE)

(a) Average number of lost information bits

Method                 1% loss   5% loss   10% loss   20% loss
PZW [16] and OP [17]   524.3     2621.4    5242.9     10485.8
Our method             524.7     2624.0    5218.7     10487.5

(b) Average PSNR performance (in dB)

Method               No loss   1% loss   5% loss   10% loss   20% loss
PZW                  32.19     31.33     -         26.29      24.63
OP                   32.20     31.50     29.36     27.53      25.08
Our method (intra)   32.04     31.62     29.78     28.22      25.91
Our method (inter)   32.04     31.63     29.80     28.25      25.92
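The lost-bit figures for the fixed-size packetization schemes can be reproduced with simple arithmetic. The sketch below assumes that every packet carries the same number of information bits, so the expected loss is the packet loss rate times the total information bits, and that the image is 512 × 512 pixels (the size implied by the overhead expression). The function name is illustrative.

```python
RATE_BPP = 0.20         # unpacketized coding rate used in the comparison
NUM_PIXELS = 512 * 512  # assumed image size


def expected_lost_bits(loss_rate, rate_bpp=RATE_BPP, num_pixels=NUM_PIXELS):
    """Expected number of lost information bits for equal-size packets."""
    return loss_rate * rate_bpp * num_pixels


for loss_rate in (0.01, 0.05, 0.10, 0.20):
    print(f"{loss_rate:.0%} loss: {expected_lost_bits(loss_rate):.1f} bits")
```

Under these assumptions the values agree with the PZW/OP row of part (a) of the table to one decimal place. The row for the proposed method deviates slightly, which would be expected if its packets do not all have the same size.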

The first three scenarios do not incorporate MDC (and the accompanying extra information) and thus have better lossless performance (see Table 3). For Scenarios 3-6, if some of the 4 neighbors of a lost node are missing, they are disregarded and excluded from the average; if some of the nodes required for linear prediction are missing, they are replaced with the adjacent available coefficients along the same direction. A zero value is used if no available node is found up to the subband boundary. Take Fig. 8 as an example. If the node (4,21) in the vector u is missing, we examine the availability of (4,19) and (4,17) in sequence. (The node (4,15) no longer resides in the H2 subband.) The required value for (4,21) is replaced with 0 if all these nodes are lost. A similar search-and-replacement strategy is used for possibly missing coefficients in H. When a missing DC coefficient cannot be recovered from MDC, the 4-neighbor average of coefficients of the same orientation is taken. When MDC fails to recover the sign of a missing level-2 coefficient, we take a majority vote among its 4 neighbors; a positive or negative sign is picked at random when there is a tie.
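The fallback rules in this paragraph can be sketched in a few lines. This is a minimal illustration, not the authors' implementation. Received coefficients are assumed to be stored in a dictionary keyed by (row, column), with lost ones simply absent. The zero fallback when all four neighbors are missing, the column range 16 to 31 for the H2 subband, and the rule that only available neighbors vote on the sign are assumptions made for the example.

```python
import random


def neighbor_average(coeffs, neighbors):
    """Average the available neighbors; missing ones (absent keys) are skipped."""
    values = [coeffs[p] for p in neighbors if p in coeffs]
    # Assumption: use 0 when no neighbor survives (not stated in the text).
    return sum(values) / len(values) if values else 0.0


def fallback_value(coeffs, start, step, in_subband):
    """Walk from `start` along `step` until an available node is found.

    Returns 0 if the search leaves the subband without finding one.
    """
    row, col = start
    while in_subband((row, col)):
        if (row, col) in coeffs:
            return coeffs[(row, col)]
        row, col = row + step[0], col + step[1]
    return 0.0


def majority_sign(neighbor_signs, rng=random):
    """Majority vote over +1/-1 signs; a tie is broken at random."""
    total = sum(neighbor_signs)
    if total == 0:
        return rng.choice((1, -1))
    return 1 if total > 0 else -1


def in_h2(position):
    """Hypothetical H2 extent: columns 16..31 (an assumption for illustration)."""
    return 16 <= position[1] <= 31


# Example mirroring the text: (4,21) and (4,19) are lost, (4,17) was received.
received = {(4, 17): 6.0, (2, 21): 3.0}
print(fallback_value(received, start=(4, 21), step=(0, -2), in_subband=in_h2))
```

The step of (0, -2) follows the example sequence (4,21), (4,19), (4,17); the search stops at (4,15) because that position lies outside the assumed H2 range.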

We have examined four packet loss rates: 1%, 5%, 10%, and 20%. Unless otherwise specified, the indicated bit rates do not include the extra 0.0059 bpp required for packetization. The simulation results are given in Tables 5 and 6 for the two test images coded at 0.25 bpp and 1.0 bpp, respectively. Significant MDC gain (up to 4.51 dB at 1.0 bpp and 20% loss) and interpolation gain (up to 1.38 dB at 1.0 bpp and 10% loss) are observed for Lena. The MDC gain grows as the packet loss rate increases. Furthermore, the interband interpolation (Scenario 6) consistently performs slightly better than the intraband interpolation (Scenario 5). The MDC and interpolation gains are less significant for Baboon, especially at low coding bit rates, probably because this image has less significant low-frequency components. Fig. 10 shows a specific realization of the damaged Lena image (10% packet loss at 0.25 bpp) together with all the reconstructed pictures. For a better subjective evaluation, an enlarged portion is shown in Fig. 11. The perceptual quality coincides with the objective PSNR evaluation. In the following, we explore other factors relevant to the system’s performance, including the bit rate and the wavelet kernels. The parameter set {10% packet loss, 0.25 bpp, 9/7 filter} is taken as the basic reference model.
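The gains quoted for Lena at 1.0 bpp can be recomputed from the tabulated averages. The sketch below reads the MDC gain as Scenario 4 minus Scenario 3 and the interpolation gain as Scenario 6 minus Scenario 4; this reading is an assumption, chosen because it reproduces the quoted figures. The PSNR values are copied from Table VI(a).

```python
# Average PSNR (dB) for Lena at 1.0 bpp, copied from Table VI(a).
LOSS_RATES = ("1%", "5%", "10%", "20%")
PSNR = {
    3: (35.80, 28.92, 25.45, 21.36),  # no MDC, 4-neighbor average
    4: (37.18, 31.91, 28.66, 25.87),  # MDC + 4-neighbor average
    6: (37.64, 32.86, 30.04, 26.88),  # MDC + interband interpolation
}


def gain(better, baseline):
    """PSNR difference per loss rate between two scenarios, rounded to 0.01 dB."""
    return {
        rate: round(b - a, 2)
        for rate, b, a in zip(LOSS_RATES, PSNR[better], PSNR[baseline])
    }


mdc_gain = gain(4, 3)
interpolation_gain = gain(6, 4)
print(max(mdc_gain.items(), key=lambda item: item[1]))
print(max(interpolation_gain.items(), key=lambda item: item[1]))
```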

TABLE V
PSNR PERFORMANCE UNDER VARIOUS PACKET LOSS RATES (CODED AT 0.25 BPP WITH THE 9/7 FILTER)

(a) Lena

Scenario   1%      5%      10%     20%
1          33.58   33.58   33.58   33.58
2          25.75   18.29   15.24   12.24
3          32.00   27.56   24.81   21.12
4          32.44   29.91   27.72   25.34
5          32.58   30.46   28.61   26.15
6          32.58   30.49   28.67   26.20

(b) Baboon

Scenario   1%      5%      10%     20%
1          22.88   22.88   22.88   22.88
2          20.21   16.57   14.23   11.64
3          22.77   21.30   20.68   19.18
4          22.64   22.41   22.11   21.53
5          22.64   22.42   22.15   21.63
6          22.64   22.42   22.16   21.63

TABLE VI
PSNR PERFORMANCE UNDER VARIOUS PACKET LOSS RATES (CODED AT 1.0 BPP WITH THE 9/7 FILTER)

(a) Lena

Scenario   1%      5%      10%     20%
1          39.92   39.92   39.92   39.92
2          27.31   18.44   15.31   12.77
3          35.80   28.92   25.45   21.36
4          37.18   31.91   28.66   25.87
5          37.59   32.86   29.96   26.81
6          37.64   32.86   30.04   26.88

(b) Baboon

Scenario   1%      5%      10%     20%
1          28.62   28.62   28.62   28.62
2          23.44   17.64   14.79   11.92
3          28.19   25.53   23.95   21.11
4          28.22   27.19   26.17   24.58
5          28.23   27.29   26.33   24.81
6          28.24   27.33   26.39   24.88

Fig. 10. Subjective evaluation for Lena coded at 0.25 bpp with 10% packet losses. (a) Scenario 2 (15.17 dB), (b) Scenario 3 (24.44 dB), (c) Scenario 4 (28.26 dB), (d) Scenario 5 (28.96 dB), and (e) Scenario 6 (29.17 dB).


Fig. 11. Zoomed portions in Fig. 10. (a) Scenario 1 (lossless), (b) Scenario 3, (c) Scenario 4, (d) Scenario 5, and (e) Scenario 6.


TABLE VII
PSNR PERFORMANCE (IN DB) UNDER VARIOUS BIT RATES (10% PACKET LOSSES)

(a) Lena

Scenario   0.125   0.25    0.5     1.0
1          30.52   33.58   36.73   39.92
2          15.16   15.24   15.29   15.31
3          24.14   24.81   25.23   25.45
4          26.70   27.72   28.33   28.66
5          27.22   28.61   29.48   29.96
6          27.25   28.67   29.55   30.04

(b) Baboon

Scenario   0.125   0.25    0.5     1.0
1          21.49   22.87   25.11   28.62
2          14.09   14.23   14.55   14.79
3          20.12   20.68   22.28   23.95
4          20.94   22.11   23.85   26.17
5          20.96   22.15   23.93   26.33

B. Quantization Error vs. Channel Error

Two sources of errors, the quantization noise and the transmission loss, account for the overall degradation of the SPIHT-coded images transmitted over lossy networks. The quantization noise is dictated by the coding rate. We have examined four bit rates, 0.125, 0.25, 0.5, and 1.0 bpp, all at 10% packet losses. The PSNR performance is listed in Table 7. It is of interest to investigate the relative impact of these two degradation sources under various bit rates. We compute the percentage of quantization error to the total mean square error, denoted by α, as follows:

$$\alpha = \frac{\text{MSE in Scenario 1}}{\text{MSE in the scenario under study}} \times 100\% \qquad (7)$$

Note that Scenario 1 sustains only the quantization error. The values of α as a function of the bit rate are graphed in Fig. 12 for Scenarios 2-6. As is expected, α is a decreasing function of the bit rate. The channel error dominates the total error for the unprotected case (Scenario 2); very limited PSNR improvement could be achieved with a higher coding bit rate. For a hard-to-compress image such as Baboon with adequate error protection (Scenarios 4-6), the total error is largely attributed to quantization. It should also be commented that the channel error of the same MSE causes more severe visual degradation than the quantization error.
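Because PSNR is a logarithmic function of the MSE, the ratio in (7) can be approximated directly from tabulated PSNR values. The sketch below assumes the usual definition PSNR = 10 log10(peak² / MSE), so the peak value cancels in the ratio. Note that the tables list PSNR averaged over many trials, so α obtained this way only approximates the ratio of the underlying MSEs and need not match the plotted curves exactly.

```python
def alpha_percent(psnr_lossless_db, psnr_scenario_db):
    """Share of quantization error in the total MSE, following (7).

    With PSNR = 10*log10(peak^2 / MSE), the peak value cancels in the MSE
    ratio, leaving 10 ** ((PSNR_scenario - PSNR_lossless) / 10).
    """
    return 100.0 * 10.0 ** ((psnr_scenario_db - psnr_lossless_db) / 10.0)


# Lena at 0.25 bpp and 10% packet loss (Table VII(a)): Scenario 1 vs. 2-6.
psnr_scenario_1 = 33.58
psnr_by_scenario = {2: 15.24, 3: 24.81, 4: 27.72, 5: 28.61, 6: 28.67}
for scenario, psnr in psnr_by_scenario.items():
    alpha = alpha_percent(psnr_scenario_1, psnr)
    print(f"Scenario {scenario}: alpha is about {alpha:.1f}%")
```

A scenario whose PSNR is 3 dB below the lossless value has α of roughly 50%, meaning that quantization and channel errors contribute about equally.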

[Figure 12 plots: panels (a) Lena and (b) Baboon; horizontal axis: bit rate (0.125, 0.25, 0.5, 1 bpp); vertical axis: α in %; one curve each for Scenarios 2-6.]

Fig. 12. Percentage of the quantization error to the total error (at 10% packet loss).

C. Effects of Wavelet Filters

The choice of wavelet filters is another factor that influences the performance of the proposed error-protection scheme. In addition to the Daubechies 9/7 filter [13], we have also tested the LeGall 5/3 filter [14], as both filters are adopted in the JPEG-2000 standard [32]. The 9/7 filter is nearly orthogonal, and it achieves excellent compression performance with floating-point operations. In contrast, the 5/3 filter achieves reasonable compression performance with simpler fixed-point operations. In Table 8, we compare the error performance of these two filters. The 9/7 filter provides a substantial edge (0.5-1.0 dB) over the 5/3 filter in a lossless environment (Scenario 1). The superior energy-compacting capability of the 9/7 filter, however, results in its poorer performance in the unprotected case (Scenario 2). The coding margin of the 9/7 filter can be mostly restored when an error recovery scheme is employed (Scenarios 3-6).
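To illustrate why the 5/3 filter needs only simple integer arithmetic, the following sketch implements one level of the reversible 5/3 transform in its lifting form, as commonly used for JPEG 2000. The paper does not state how the filters were implemented, so this is an illustration only. It assumes a one-dimensional, even-length integer signal with symmetric boundary extension, and the function names are hypothetical.

```python
def lifting_53_forward(x):
    """One level of the reversible 5/3 transform on an even-length integer list."""
    n = len(x)
    assert n >= 2 and n % 2 == 0
    half = n // 2
    # Predict step: detail (high-pass) samples from the odd positions.
    d = []
    for i in range(half):
        right = x[2 * i + 2] if 2 * i + 2 < n else x[2 * i]  # symmetric extension
        d.append(x[2 * i + 1] - (x[2 * i] + right) // 2)
    # Update step: smooth (low-pass) samples from the even positions.
    s = []
    for i in range(half):
        d_prev = d[i - 1] if i > 0 else d[0]  # symmetric extension
        s.append(x[2 * i] + (d_prev + d[i] + 2) // 4)
    return s, d


def lifting_53_inverse(s, d):
    """Undo lifting_53_forward exactly by reversing the two lifting steps."""
    half = len(s)
    even = []
    for i in range(half):
        d_prev = d[i - 1] if i > 0 else d[0]
        even.append(s[i] - (d_prev + d[i] + 2) // 4)
    x = []
    for i in range(half):
        right = even[i + 1] if i + 1 < half else even[i]
        x.extend([even[i], d[i] + (even[i] + right) // 2])
    return x


signal = [10, 12, 14, 13, 9, 7, 8, 8]
low, high = lifting_53_forward(signal)
assert lifting_53_inverse(low, high) == signal  # perfect reconstruction
```

Only additions and floor divisions by 2 and 4 are involved, and the inverse subtracts exactly what the forward transform added, which is why the integer transform is lossless.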


TABLE VIII
PSNR PERFORMANCE FOR 9/7 AND 5/3 FILTERS (CODED AT 0.25 BPP)

(a) 1% packet loss

Scenario   Lena 9/7   Lena 5/3   Baboon 9/7   Baboon 5/3
1          33.58      32.59      22.87        22.24
2          25.75      27.83      20.21        20.64
3          32.00      31.19      22.77        22.15
4          32.44      31.51      22.64        22.08
5          32.58      31.61      22.64        22.08
6          32.58      31.65      22.64        22.08

(b) 10% packet loss

Scenario   Lena 9/7   Lena 5/3   Baboon 9/7   Baboon 5/3
1          33.58      32.59      22.87        22.24
2          15.24      17.86      14.23        16.27
3          24.81      25.25      20.68        21.36
4          27.72      27.30      22.11        21.57
5          28.61      28.08      22.15        21.63
6          28.67      28.12      22.16        21.65

(c) 20% packet loss

Scenario   Lena 9/7   Lena 5/3   Baboon 9/7   Baboon 5/3
1          33.58      32.59      22.87        22.24
2          12.24      14.52      11.64        13.81
3          21.12      22.48      19.18        20.60
4          25.34      24.88      21.53        21.03
5          26.14      25.66      21.63        21.15
6          26.20      25.66      21.63        21.19

V. CONCLUSION

We have presented a new hybrid error-protection scheme for SPIHT-coded images in this paper. The error-resilient encoder deinterleaves the embedded wavelet bit stream and independently packetizes each of the spatial-orientation trees. The essential information of a packet, including the level-1 coefficients and the signs of level-2 coefficients, is duplicated in its sibling packets. When the image is transmitted over packet-erasure channels, lost wavelet information is recovered in two ways. The missing essential information is retrieved from other successfully received packets. The missing level-2 magnitudes are estimated by linear interpolation from closely related wavelet coefficients, either intra-subband or inter-subband, where the latter achieves better performance. Substantial MDC and interpolation gains are found for smooth images and high-packet-loss network conditions. We have also explored the effects of the coding bit rate and the wavelet filter on the robustness performance. The proposed method generally provides better error resilience than conventional robust-coding techniques for embedded wavelet coders. Adaptation of the data partitioning and packetization scheme to a specific communication network could be explored in the future.

ACKNOWLEDGMENT

The authors would like to thank the anonymous reviewers for their valuable suggestions and comments.

REFERENCES

[1] J. M. Shapiro, “Embedded image coding using zerotrees of wavelet coefficients,” IEEE Trans. Signal Processing, vol. 41, no. 12, pp. 3445-3462, Dec. 1993.

[2] A. Said and W. A. Pearlman, “A new, fast, and efficient image codec based on set partitioning in hierarchical trees,” IEEE Trans. Circuits Syst. Video Technol., vol. 6, no. 3, pp. 243-250, June 1996.

[3] Y. Wang and Q.-F. Zhu, “Error control and concealment for video communications: A review,” Proc. of the IEEE, vol. 86, no. 5, pp. 974-997, May 1998.

[4] P. G. Sherwood and K. Zeger, “Progressive image coding for noisy channels,” IEEE Signal Processing Lett., vol. 4, no. 7, pp. 189-191, July 1997.

[5] H. Man, F. Kossentini, and M. J. T. Smith, “Robust EZW image coding for noisy channels,” IEEE Signal Processing Lett., vol. 4, no. 8, pp. 227-229, Aug. 1997.

[6] C. D. Creusere, “A new method of robust image compression based on the embedded zerotree wavelet algorithm,” IEEE Trans. Image Processing, vol. 6, no. 10, pp. 1436-1442, Oct. 1997.

[7] S.-H. Yang and T.-C. Cheng, “Error-resilient SPIHT image coding,” Electron. Lett., vol. 36, no. 3, pp. 208-210, Feb. 2000.

[8] T. Kim, S. Choi, R. E. Van Dyck, and N. K. Bose, “Classified zerotree wavelet image coding and adaptive packetization for low-bit-rate transport,” IEEE Trans. Circuits Syst. Video Technol., vol. 11, no. 9, pp. 1022-1034, Sep. 2001.

[9] A. C. Miguel, A. E. Mohr, and E. A. Riskin, “SPIHT for generalized multiple description coding,” in Proc. IEEE Int. Conf. Image Processing, vol. 3, pp. 842-846, Kobe, Japan, Oct. 1999.

[10] N. Varnica, M. Fleming, and M. Effros, “Multi-resolution adaptation of the SPIHT algorithm for multiple description,” in Proc. Data Compression Conf., pp. 303-312, Snowbird, Utah, USA, Mar. 2000.

[11] P. G. Sherwood, X. Tian, and K. Zeger, “Efficient image and channel coding for wireless packet networks,” in Proc. IEEE Int. Conf. Image Processing, vol. 2, pp. 132-135, Vancouver, Canada, 2000.

[12] Y. Wang, S. Wenger, J. Wen, and A. K. Katsaggelos, “Error resilient video coding techniques,” IEEE Signal Processing Mag., vol. 17, no. 4, pp. 61-82, July 2000.

[13] M. Antonini, M. Barlaud, P. Mathieu, and I. Daubechies, “Image coding using wavelet transform,” IEEE Trans. Image Processing, vol. 1, no. 2, pp. 205-220, Apr. 1992.

[14] D. LeGall and A. Tabatabai, “Subband coding of digital images using symmetric short kernel filters and arithmetic coding techniques,” in IEEE Int. Conf. Acoustic, Speech, Signal Processing, pp. 761-765, New York, USA, 1988.

[15] M. Unser and T. Blu, “Mathematical properties of the JPEG2000 wavelet filters,” IEEE Trans. Image Processing, vol. 12, no. 9, pp. 1080-1090, Sep. 2003.

[16] J. K. Rogers and P. C. Cosman, “Wavelet zerotree image compression with packetization,” IEEE Signal Processing Letters, vol. 5, no. 5, pp. 105-107, May 1998.

[17] X. Wu, S. Cheng, and Z. Xiong, “On packetization of embedded multimedia bitstreams,” IEEE Trans. Multimedia, vol. 3, no. 1, pp. 132-140, March 2001.

[18] V. K. Goyal, “Multiple description coding: compression meets the network,” IEEE Signal Processing Mag., vol. 18, no. 5, pp. 74-93, Sep. 2001.

[19] W. Jiang and A. Ortega, “Multiple description coding via polyphase transform and selective quantization,” in Proc. SPIE Visual Commun. Image Processing Conf., San Jose, CA, USA, 1999.

[20] S. D. Servetto, K. Ramchandran, V. A. Vaishampayan, and K. Nahrstedt, “Multiple description wavelet based image coding,” IEEE Trans. Image Processing, vol. 9, no. 5, pp. 813-826, May 2000.

[21] Special Issue on Error-Resilient Image and Video Transmission, IEEE J. Select. Areas Commun., vol. 18, no. 6, June 2000.

[22] X. Li and M. T. Orchard, “Novel sequential error-concealment techniques using orientation adaptive interpolation,” IEEE Trans. Circuits Syst. Video Technol., vol. 12, no. 10, pp. 857-864, Oct. 2002.


[23] S. D. Rane, J. Remus, and G. Sapiro, “Wavelet-domain reconstruction of lost blocks in wireless image transmission and packet-switched networks,” in Proc. IEEE Int. Conf. Image Processing, vol. 1, pp. 22-25, Rochester, New York, USA, Sep. 2002.

[24] F. Labeau, C. Desset, L. Vandendorpe, and B. Macq, “Performance of linear tools and models for error detection and concealment in subband image transmission,” IEEE Trans. Image Processing, vol. 11, no. 5, pp. 518-529, May 2002.

[25] A. Papoulis, Probability, Random Variables, and Stochastic Processes, second edition, McGraw-Hill Inc., 1984.

[26] R. W. Buccigrossi and E. P. Simoncelli, “Image compression via joint statistical characterization in the wavelet domain,” IEEE Trans. Image Processing, vol. 8, no. 12, pp. 1688-1701, Dec. 1999.

[27] J. Liu and P. Moulin, “Information-theoretic analysis of interscale and intrascale dependencies between wavelet coefficients,” IEEE Trans. Image Processing, vol. 10, no. 11, pp. 1647-1658, Nov. 2001.

[28] S. S. Hemami and R. M. Gray, “Subband-coded image reconstruction for lossy packet networks,” IEEE Trans. Image Processing, vol. 6, no. 4, pp. 523-539, Apr. 1997.

[29] A. T. Deever and S. S. Hemami, “Efficient sign coding and estimation of zero-quantized coefficients in embedded wavelet image codecs,” IEEE Trans. Image Processing, vol. 12, no. 4, pp. 420-430, Apr. 2003.

[30] X. Li, “On exploiting geometric constraint of image wavelet coefficients,” IEEE Trans. Image Processing, vol. 12, no. 11, pp. 1378-1387, Nov. 2003.

[31] M. W. Marcellin and A. Bilgin, “Quantifying the parent-child coding gain in zero-tree-based coders,” IEEE Signal Processing Lett., vol. 8, no. 3, pp. 67-69, March 2001.

[32] M. D. Adams and F. Kossentini, “Reversible integer-to-integer wavelet transforms for image compression: performance and analysis,” IEEE Trans. Image Processing, vol. 9, no. 6, pp. 1010-1024, June 2000.

Shih-Hsuan Yang (S’89-M’94-SM’06) received the B.S. degree in electrical engineering from the National Taiwan University in 1987. He obtained the M.S. and Ph.D. degrees in electrical engineering and computer science from the University of Michigan, Ann Arbor, in 1990 and 1994, respectively.

Dr. Yang joined the National Taipei University of Technology, Taipei, Taiwan, in 1994, where he is currently Professor of Computer Science and Information Engineering (CSIE). He was the chairman of the CSIE Department from 2004 to 2006. His major research interests include image and video coding, multimedia transmission, data hiding, and information theory.

Po-Feng Cheng received the M.Sc. degree in computer and communication engineering from the National Taipei University of Technology, Taipei, Taiwan, in 2003. His master's thesis investigated and proposed encoding algorithms for recovering corrupted frames during image transmission. He has been a researcher at the Industrial Technology Research Institute, Taiwan, since 2004. His current research interest is object tracking for video surveillance systems.