Distributed Source Coding By Raghunadh K Bhattar, EE Dept, IISc Under the Guidance of...
Distributed Source Coding
By Raghunadh K Bhattar, EE Dept, IISc. Under the Guidance of Prof. K. R. Ramakrishnan
Outline of the Presentation
Introduction
Why Distributed Source Coding
Source Coding
How Source Coding Works
How Channel Coding Works
Distributed Source Coding
Slepian-Wolf Coding
Wyner-Ziv Coding
Applications of DSC
Conclusion
Why Distributed Source Coding?
Low complexity encoders
Error resilience – robust to transmission errors
These two attributes make DSC an enabling technology for wireless communications.
Low Complexity Wireless Handsets
Courtesy Nicolas Gehrig
Distributed Source Coding (DSC) Compression of Correlated Sources –
Separate Encoding & Joint Decoding
[Figure: X and Y are compressed by two separate encoders at rates R_X and R_Y; a joint decoder outputs the reconstructions X̂, Ŷ. The sources are statistically dependent but physically distinct.]
Source Coding (Data Compression)
Exploits the redundancy in the source to reduce the data required for storage or transmission.
Highly complex encoders are required for compression (MPEG, H.264, ...); the decoders, however, are simple.
Highly complex encoders mean bulky handsets, high power consumption, and short battery life.
How Source Coding Works
Types of redundancy:
Spatial redundancy – transform or predictive coding
Temporal redundancy – predictive coding
In predictive coding, the next value in the sequence is predicted from the past values, and the predicted value is subtracted from the actual value. Only the difference is sent to the decoder.
Let the past values be C and the predicted value y = f(C). If the actual value is x, then (x – y) is sent to the decoder.
The decoder, knowing the past values C, can form the same prediction y.
With the knowledge of (x – y), the decoder recovers x, which is the desired value.
[Figure: encoder and decoder each run the same predictor on the past values C to obtain y; the encoder transmits the residual x – y, and the decoder adds its own prediction y back to recover x.]
Compression – Toy Example
Suppose X and Y are two uniformly distributed i.i.d. 3-bit sources, each taking values in {000, 001, 010, 011, 100, 101, 110, 111}. If they are related (i.e., correlated), can we reduce the data rate?
Let the relation be: X and Y differ in at most one bit, i.e., the Hamming distance between X and Y is at most one:

Σᵢ (xᵢ ⊕ yᵢ) ≤ 1
H(X) = H(Y) = 3 bits
Let Y = 101. Then X ∈ {101 (code 0), 100 (code 1), 111 (code 2), 001 (code 3)}.
Code = X ⊕ Y
H(X|Y) = 2 bits
We need 2 bits to transmit X and 3 bits for Y: 5 bits in total instead of 6 bits.
Here the encoder must know the outcome of Y before it can code X with 2 bits.
Decoding: X = Y ⊕ Code, where Code 0 = 000, 1 = 001, 2 = 010, 3 = 100.
Now assume that we do not know the outcome of Y at the X encoder (Y is still sent to the decoder using 3 bits). Can we still transmit X using 2 bits?
The answer, surprisingly, is YES. How?
Partition
Group all eight symbols into four sets of two members each:
{(000),(111)} → 0
{(001),(110)} → 1
{(010),(101)} → 2
{(100),(011)} → 3
Trick: partition so that the two members of each set are at Hamming distance 3.
The encoding of X is done simply by sending the index of the set that contains X.
Let X = (100); then the index for X is 3. Suppose the decoder has already received a correlated Y = (101). How do we recover X, knowing Y (from now on called the side information) at the decoder and the index (3) from X?
Since the index is 3, we know that the value of X is either (100) or (011).
Measure the Hamming distance between the two possible values of X and the side information Y:
(100) ⊕ (101) = (001) → Hamming distance 1
(011) ⊕ (101) = (110) → Hamming distance 2
Hence X = (100).
Source Coding (Y known at the encoder)
Y = 101, X = 100, Code = (100) ⊕ (101) = 001 = 1
Decoding: Y ⊕ Code = (101) ⊕ (001) = 100 = X
If the decoder holds a different side information Y, this scheme fails (✗ marks a decoding error; the true X is 100):
Y    Code  X' = Y ⊕ Code
000  001   001 ✗
001  001   000 ✗
010  001   011 ✗
011  001   010 ✗
100  001   101 ✗
101  001   100
110  001   111 ✗
111  001   110 ✗
Side Information → Decoding → Output (index = 3, coset {(100),(011)}):

Correlated side information – no error in decoding:
Y = 000: d(011,000) = 2, d(100,000) = 1 → X = 100
Y = 110: d(011,110) = 2, d(100,110) = 1 → X = 100
Y = 101: d(011,101) = 2, d(100,101) = 1 → X = 100
Y = 100: d(011,100) = 3, d(100,100) = 0 → X = 100

Uncorrelated side information – erroneous decoding:
Y = 111: d(011,111) = 1, d(100,111) = 2 → X = 011
Distributed Source Coding
X =100
Code = 3
How do we partition the input sample space? Must we always find some trick? If the input sample space is large (even just a few hundred symbols), can we still find one?
The trick is matrix multiplication: we need a matrix that partitions the input space. For the toy example above the matrix is

H = [ 1 0 1
      0 1 1 ]

Index = X·Hᵀ over the GF(2) field. H is the parity-check matrix in error-correction terminology.
Coset Partition
Now we look at the partitions again:
{(000),(111)} → 0: this is the repetition code (in error-correction terminology)
{(001),(110)} → 1
{(010),(101)} → 2
{(100),(011)} → 3
The other sets are the cosets of the repetition code induced by the elements of the sample space of X.
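As a sketch of the "trick", the partition can be generated mechanically: the syndrome s = H·xᵀ over GF(2) labels the coset of each 3-bit word. (The syndrome labels come out in a different order than the slide's 0–3 indices, but the four groups are the same.)

```python
# Coset partition via the parity-check matrix: the syndrome s = H x^T
# over GF(2) labels the coset of the (3,1) repetition code containing x.

from itertools import product

H = [(1, 0, 1),
     (0, 1, 1)]          # a parity-check matrix for the code {000, 111}

def syndrome(x):
    return tuple(sum(h * b for h, b in zip(row, x)) % 2 for row in H)

cosets = {}
for x in product((0, 1), repeat=3):
    cosets.setdefault(syndrome(x), []).append(x)

for s, members in sorted(cosets.items()):
    print(s, members)
# Four cosets of two words each; syndrome (0, 0) is the repetition code itself.
```

Each coset again contains two words at Hamming distance 3, so the 2-bit syndrome plays exactly the role of the index in the toy example.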
Channel Coding
In channel coding, controlled redundancy is added to the information bits to protect them from channel noise.
We can classify channel coding (error-control coding) into two categories: error detection and error correction.
In error detection, the introduced redundancy is just enough to detect errors; in error correction, we need to introduce more redundancy.
[Figure: codebooks with minimum distance dmin = 1, dmin = 2, and dmin = 3.]
A code with minimum distance dmin can correct up to t errors, where t ≤ ⌊(dmin − 1)/2⌋.
Parity Check
X    Parity
000  0
001  1
010  1
011  0
100  1
101  0
110  0
111  1
Minimum Hamming distance = 2.
How do we make the minimum Hamming distance 3? By inspection it is not clear (or at least not easy).
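A brute-force check confirms the claim for the single-parity-check code in the table (an illustrative sketch; for a linear code this minimum also equals the smallest nonzero codeword weight):

```python
# Minimum Hamming distance of the (4,3) single-parity-check code,
# computed by exhaustive comparison of all codeword pairs.

from itertools import product

# append an even-parity bit to every 3-bit message
code = [bits + (sum(bits) % 2,) for bits in product((0, 1), repeat=3)]

dmin = min(sum(a != b for a, b in zip(u, v))
           for u in code for v in code if u != v)
print(dmin)  # 2 -> single errors are detectable but not correctable
```

With dmin = 2 and t ≤ ⌊(dmin − 1)/2⌋ = 0, the code detects single errors but corrects none, matching the slide.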
Slepian-Wolf theorem: correlated sources that do not communicate with each other can be coded at the same total rate as when they are coded jointly; no performance loss occurs provided they are decoded jointly.
R_X ≥ H(X|Y)
R_Y ≥ H(Y|X)
R_X + R_Y ≥ H(X,Y)
• When correlated sources are coded independently but decoded jointly, the minimum data rate for each source is lower bounded by its conditional entropy given the other source.
• The total data rate must be at least H(X,Y), and the individual data rates must be at least H(X|Y) and H(Y|X) respectively.
D. Slepian and J. K. Wolf, "Noiseless coding of correlated information sources," IEEE Trans. Inf. Theory, vol. 19, pp. 471–480, July 1973.
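These bounds can be verified numerically for the 3-bit toy example used earlier, under the assumption (not stated explicitly in the slides) that each of the four "differ in at most one bit" patterns is equally likely given X:

```python
# Numerical check of the Slepian-Wolf bound H(X|Y) for the 3-bit toy
# sources: X uniform on 8 values, Y differing from X in at most one bit.

from math import log2

pxy = {}                                          # joint distribution p(x, y)
for x in range(8):
    for e in (0b000, 0b001, 0b010, 0b100):        # flip at most one bit
        key = (x, x ^ e)
        pxy[key] = pxy.get(key, 0.0) + (1 / 8) * (1 / 4)

H_xy = -sum(p * log2(p) for p in pxy.values())    # H(X,Y)

py = {}                                           # marginal p(y)
for (x, y), p in pxy.items():
    py[y] = py.get(y, 0.0) + p
H_y = -sum(p * log2(p) for p in py.values())      # H(Y)

H_x_given_y = H_xy - H_y
print(H_x_given_y)  # 2.0 bits: X can be sent with 2 bits once Y is decoded
```

The computation reproduces the slide's figures: H(X,Y) = 5 bits, H(Y) = 3 bits, so H(X|Y) = 2 bits, exactly the 2-bit coset index of the toy example.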
DISCUS (DIstributed Source Coding Using Syndromes)
The first constructive realization of the Slepian-Wolf bound using practical channel codes, in which single-parity-check codes were used with the binning scheme.
Wyner first proposed using a capacity-achieving binary linear channel code to solve the SW compression problem for a class of joint distributions.
DISCUS extended Wyner's idea to the distributed rate-distortion (lossy compression) problem using channel codes.
S. Pradhan and K. Ramchandran, "Distributed source coding using syndromes (DISCUS)," in IEEE Data Compression Conference, DCC-1999, Snowbird, UT, 1999.
[Figure: X is encoded at rate R_X ≥ H(X|Y); Y is sent at rate R_Y ≥ H(Y); the joint decoder outputs X̂, Y. The sources are statistically dependent.]
Distributed Source Coding
(Compression with Side Information)
Side Information Available at the Decoder
[Figure: the achievable rate region for lossless Slepian-Wolf coding in the (R_X, R_Y) plane, with the R_X axis marked at H(X|Y), H(X), and H(X,Y), and the R_Y axis at H(Y|X), H(Y), and H(X,Y).]
Separate coding with no errors requires R_X > H(X) and R_Y > H(Y); joint encoding and decoding achieves R_X + R_Y = H(X,Y).
Achievable rates with Slepian-Wolf coding (vanishing error probability for long sequences):
Corner point A – code X with Y as side information: R_X > H(X|Y), R_Y > H(Y).
Corner point B – code Y with X as side information: R_Y > H(Y|X), R_X > H(X).
Points in between (C) – time-sharing / source splitting / code partitioning.
How Compression Works?
[Figure: compression removes the redundancy from redundant (correlated) data, leaving compressed (decorrelated) data; decompression feeds the decorrelated data through a redundant-data generator to restore the correlated data.]
How Channel Coding Works?
Duality between source coding and channel coding:
Source coding: compresses the data; de-correlates the data; complex encoder; simple decoder.
Channel coding: expands the data; correlates the data; simple encoder; complex decoder.
[Figure: channel with additive noise.]
Channel Coding or Error Correction Coding
[Figure: a codeword of length n consists of k information bits followed by n − k parity bits.]
Data expansion = n/k times.
n
Channel
Decoding Info
rmat
ion
Bits
Par
ity B
itsInf
orm
atio
n B
its
Info
rmat
ion
Bits
(X
)k
x
Decompression
+ Correlation Model ( = Noise) Y=
Par
ity B
its
Y= Channel
Decodingx
Par
ity B
its
n - k
Parity Bits
Compressed Data
Channel
Coding
Y
Channel Codes for DSC
Turbo Coder for Slepian-Wolf Encoding
[Figure: the input X (L bits) feeds one systematic convolutional encoder directly and a second one through an interleaver of length L. Both encoders are systematic with rate n/(n+1); the systematic outputs (L bits each) are discarded, and only the parity streams XP1 and XP2 (L/n bits each) are retained, giving rate R_X.]
Courtesy of Anne Aaron and Bernd Girod
Turbo Decoder for Slepian-Wolf Decoding
[Figure: channel probability calculations use the parity streams XP1 and XP2, the side information Y, and the correlation model P(x|y). Two SISO decoders exchange extrinsic and a priori probabilities through an interleaver and deinterleaver of length L; a decision on the a posteriori probabilities yields the decoded X̂.]
Courtesy of Anne Aaron and Bernd Girod
Wyner's Scheme
Use a linear block code and send the syndrome. For an (n, k) block code there are 2^(n−k) syndromes, each corresponding to a set of 2^k words of length n. Each set is a coset of the code. Compression ratio n : (n − k).
[Figure: a lossless encoder (syndrome former) sends the syndrome bits of X at rate R ≥ H(X|Y); the joint decoder uses the side information Y to output X̂.]
A. D. Wyner, "Recent results in the Shannon theory," IEEE Trans. Inf. Theory, vol. IT-20, no. 1, January 1974.
A. D. Wyner, "On source coding with side information at the decoder," IEEE Trans. Inf. Theory, vol. 21, no. 3, pp. 294–300, May 1975.
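Wyner's syndrome scheme can be sketched with a small concrete code. The (7,4) Hamming code and all names below are illustrative choices, not the capacity-achieving codes the theory calls for: 7 source bits compress to a 3-bit syndrome, and the decoder corrects up to one bit of "correlation noise" between X and Y.

```python
# Wyner-style syndrome compression with a (7,4) Hamming code:
# send s = H x^T (3 bits for 7 source bits, ratio 7:3); the decoder
# shifts Y into the coset of s by locating the single differing bit.

import numpy as np

H = np.array([[1, 0, 1, 0, 1, 0, 1],   # parity-check matrix whose column j
              [0, 1, 1, 0, 0, 1, 1],   # is the binary expansion of j+1
              [0, 0, 0, 1, 1, 1, 1]])

def syndrome(v):
    return H @ v % 2

def decode(s, y):
    d = (syndrome(y) + s) % 2          # = H (x XOR y): syndrome of the noise
    if not d.any():
        return y.copy()                # X and Y agree on this block
    pos = int(''.join(map(str, d[::-1])), 2) - 1   # error position from d
    x = y.copy()
    x[pos] ^= 1
    return x

x = np.array([1, 0, 1, 1, 0, 0, 1])
y = x.copy(); y[4] ^= 1                # side info differs in one position
assert (decode(syndrome(x), y) == x).all()
```

The decoder never sees x itself, only its syndrome, which is exactly the binning idea: the syndrome names the coset, and Y singles out the right member.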
Linear Block Codes for DSC
[Figure: the syndrome former compresses the n-bit input x to the n − k syndrome bits s = H·xᵀ (compression ratio n : (n − k)). The decoder treats the side information Y = x + correlation noise as a corrupted codeword and recovers x by syndrome decoding (decompression).]
[Figure: an LDPC encoder (syndrome-former generator) maps X to its syndrome s = H·xᵀ, which is entropy coded and sent as the compressed data. The LDPC decoder combines the syndrome with the side information Y through the correlation model to output the decompressed data X̂.]
The Wyner-Ziv theorem
Wyner and Ziv extended the work of Slepian and Wolf by studying the lossy case in the same scenario, where the signals X and Y are statistically dependent. Y is transmitted at a rate equal to its entropy (Y is then called the side information), and what must be found is the minimum transmission rate for X that introduces no more than a certain distortion D. The result is the Wyner-Ziv rate-distortion function, the lower bound for R_X.
For MSE distortion and Gaussian statistics, the rate-distortion functions with and without side information at the encoder are the same.
A. D. Wyner and J. Ziv, "The rate-distortion function for source coding with side information at the decoder," IEEE Trans. Inf. Theory, vol. 22, no. 1, pp. 1–10, January 1976.
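For the Gaussian/MSE case just mentioned, the rate-distortion function takes the standard closed form R(D) = max(0, ½·log₂(σ²_X|Y / D)), where σ²_X|Y is the conditional variance of X given Y. A small sketch (the function name is illustrative):

```python
# Wyner-Ziv rate for jointly Gaussian X, Y under MSE distortion:
# R(D) = max(0, 0.5 * log2(sigma2_x_given_y / D)). For this case there
# is no rate loss relative to having Y at the encoder as well.

from math import log2

def wz_rate(sigma2_x_given_y, D):
    return max(0.0, 0.5 * log2(sigma2_x_given_y / D))

print(wz_rate(1.0, 0.25))  # 1.0 bit/sample
print(wz_rate(1.0, 1.0))   # 0.0: the side information alone meets the target
```

Each extra bit per sample quarters the achievable MSE, and the rate hits zero once the target distortion exceeds the residual uncertainty σ²_X|Y.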
A codec that encodes signals X and Y separately while decoding them jointly, but does not aim at recovering them perfectly (it accepts some distortion D in the reconstruction), is called a Wyner-Ziv codec.
Wyner-Ziv Codec
[Figure: two configurations of lossy coding of X. In the first, the side information Y is available at both encoder and decoder; in the second (Wyner-Ziv coding), Y is available only at the decoder.]
Wyner-Ziv Coding: Lossy Compression with Side Information
For MSE distortion and Gaussian statistics, the rate-distortion functions of the two systems are the same. In general, the rate loss R*(D) − R_X|Y(D) is bounded.
[Figure: rate-distortion curves R_X|Y(D) and R*(D).]
The structure of Wyner-Ziv encoding and decoding
Encoding consists of quantization followed by a binning operation that encodes the quantized value U into a bin (coset) index. Decoding consists of "de-binning" followed by estimation.
Wyner-Ziv Coding (WZC) - A joint source-channel coding problem
Pixel-Domain Wyner-Ziv Residual Video Codec
[Figure: the WZ frames X (as residuals Xer against a reference from the frame memory) are scalar quantized and LDPC encoded; the encoder buffers the bits and sends them on request (Slepian-Wolf codec). Key frames I are coded and decoded with a conventional intraframe codec (I'). The decoder forms the side information Y by interpolation/extrapolation, runs the LDPC decoder and reconstruction (Q⁻¹), and adds the reference back to output X'.]
Distributed Video Coding
Distributed coding is a new paradigm for video compression, based on Slepian and Wolf's (lossless coding) and Wyner and Ziv's (lossy coding) information-theoretic results.
It enables low-complexity video encoding, where the bulk of the computation is shifted to the decoder.
A second architectural goal is far greater robustness to packet and frame drops.
It is useful for wireless video applications through transcoding architectures.
PRISM
PRISM (Power-efficient, Robust, hIgh-compression, Syndrome-based Multimedia coding) is a practical video coding framework built on distributed source coding principles. It offers flexible encoding/decoding complexity, high compression efficiency, superior robustness to packet/frame drops, and a light yet rich encoding syntax.
R. Puri, A. Majumdar, and K. Ramchandran, "PRISM: A video coding paradigm with motion estimation at the decoder," IEEE Trans. Image Processing, vol. 16, no. 10, pp. 2436–2448, October 2007.
DIStributed COding for Video sERvices (DISCOVER)
DISCOVER is a new video coding scheme with strong potential for new applications, targeting advances in coding efficiency, error resilience, and scalability.
At the encoder side the video is split into two parts: the first set of frames, called key frames, are encoded with a conventional H.264/AVC encoder; the remaining frames, known as Wyner-Ziv frames, are coded using distributed coding principles.
X. Artigas, J. Ascenso, M. Dalai, D. Kubasov, and M. Ouaret, "The DISCOVER codec: Architecture, techniques and evaluation," Picture Coding Symposium, 2007.
www.discoverdvc.org
A Typical Distributed Video Codec
[Figure: each WZ frame W is passed through the Wyner-Ziv residual encoder, with the previous key frame as the encoder reference, and only WZ parity bits are transmitted. Key frames I are coded conventionally. At the decoder, the side information Y comes from motion-compensated interpolation of the decoded frames; the Wyner-Ziv residual decoder outputs the decoded WZ frames W'.]
Wyner-Ziv DCT Video Codec
[Figure: intraframe encoder / interframe decoder. Each WZ frame W is transformed with the DCT; for each transform band k, the coefficients Xk are quantized with a 2^Mk-level quantizer, the quantized symbols qk are split into bit-planes (bit-plane 1 ... bit-plane Mk), and each bit-plane is turbo encoded; the parity bits are buffered and sent on request. Key frames K are coded with a conventional intraframe codec (K'). The decoder forms the side information Y by interpolation/extrapolation, computes the DCT bands Yk, turbo decodes the bit-planes (qk'), reconstructs Xk', and applies the IDCT to obtain the decoded WZ frames W'.]
Sample Frame (Foreman sequence)
[Figure: side information vs. the frame after Wyner-Ziv coding, 16-level quantization (~1 bpp).]
Carphone sequence:
H.263+ intraframe coding: 410 kbps
Wyner-Ziv codec: 384 kbps

Salesman sequence at 10 fps:
DCT-based intracoding: 247 kbps (PSNR_Y = 33.0 dB)
Wyner-Ziv DCT codec: 256 kbps (PSNR_Y = 39.1 dB, GOP = 16)
H.263+ I-P-P-P: 249 kbps (PSNR_Y = 43.4 dB, GOP = 16)
Wyner-Ziv DCT codec: 256 kbps (PSNR_Y = 39.1 dB, GOP = 16)

Hall Monitor sequence at 10 fps:
DCT-based intracoding: 231 kbps (PSNR_Y = 33.3 dB)
Wyner-Ziv DCT codec: 227 kbps (PSNR_Y = 39.1 dB, GOP = 16)
H.263+ I-P-P-P: 212 kbps (PSNR_Y = 43.0 dB, GOP = 16)
Wyner-Ziv DCT codec: 227 kbps (PSNR_Y = 39.1 dB, GOP = 16)
Facsimile Image Compression with DSC
[Figure: CCITT 8 test image; reconstructed image with 30 errors; Fax4 reconstructed image with 8 errors.]
Applications
• Very low complexity encoders
• Compression for networks of cameras
• Error-resilient transmission of signal waveforms
• Digitally enhanced analog transmission
• Unequal error protection without layered coding
• Image authentication
• Random access
• Compression of encrypted signals
Cosets
Let G = {000, 001, ..., 111} and let H = {000, 111} be a subgroup of G. Cosets are formed by XOR-ing an element of G with each element of H:
001 ⊕ 000 = 001, 001 ⊕ 111 = 110; hence {001, 110} is one coset.
010 ⊕ 000 = 010, 010 ⊕ 111 = 101; so {010, 101} is another coset, and so on.
Hamming Distance
The Hamming distance is the number of bit positions in which two binary sequences differ. Let X and Y be two binary sequences; the Hamming distance between X and Y is defined as

Hamming distance = Σᵢ (xᵢ ⊕ yᵢ)

Example: let X = {0 0 1 1 1 0 1 0 1 0} and Y = {0 1 0 1 1 1 0 0 1 1}. Then the Hamming distance is Sum({0 1 1 0 0 1 1 0 0 1}) = 5.
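The definition translates directly into code (a one-function sketch with an illustrative name):

```python
# Hamming distance: count the positions where two equal-length
# binary sequences differ.

def hamming(x, y):
    assert len(x) == len(y)
    return sum(a != b for a, b in zip(x, y))

X = [0, 0, 1, 1, 1, 0, 1, 0, 1, 0]
Y = [0, 1, 0, 1, 1, 1, 0, 0, 1, 1]
print(hamming(X, Y))  # 5, matching the worked example above
```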