Distributed Source Coding By Raghunadh K Bhattar, EE Dept, IISc Under the Guidance of...
Distributed Source Coding
By Raghunadh K Bhattar, EE Dept, IISc. Under the Guidance of Prof. K. R. Ramakrishnan
Outline of the Presentation
Introduction
Why Distributed Source Coding
Source Coding
How Source Coding Works
How Channel Coding Works
Distributed Source Coding
Slepian-Wolf Coding
Wyner-Ziv Coding
Applications of DSC
Conclusion
Why Distributed Source Coding?
Low complexity encoders
Error resilience – robust to transmission errors
These two attributes make DSC an enabling technology for wireless communications.
Low Complexity Wireless Handsets
Courtesy Nicolas Gehrig
Distributed Source Coding (DSC) Compression of Correlated Sources –
Separate Encoding & Joint Decoding
[Figure: X and Y are compressed by two separate encoders at rates R_X and R_Y; a joint decoder outputs the reconstructions X̂, Ŷ. The sources are statistically dependent but physically distinct.]
Source Coding (Data Compression)
Exploits the redundancy in the source to reduce the data required for storage or transmission.
Highly complex encoders are required for compression (MPEG, H.264, ...); the decoders, however, are simple.
Highly complex encoders mean bulky handsets, high power consumption, and short battery life.
How Source Coding Works
Types of redundancy:
Spatial redundancy – transform or predictive coding
Temporal redundancy – predictive coding
In predictive coding, the next value in the sequence is predicted from the past values, and the predicted value is subtracted from the actual value. Only the difference is sent to the decoder.
Let the past values be C and the predicted value y = f(C). If the actual value is x, then (x – y) is sent to the decoder.
The decoder, knowing the past values C, can form the same prediction y.
With the knowledge of (x – y), the decoder recovers x, which is the desired value.
[Figure: encoder and decoder each run the same predictor on the past values C to obtain y; the encoder transmits the residual x – y, and the decoder adds its own prediction y back to recover x.]
Compression – Toy Example
Suppose X and Y are two uniformly distributed i.i.d. 3-bit sources, each taking values in {000, 001, 010, 011, 100, 101, 110, 111}. If they are related (i.e., correlated), can we reduce the data rate?
Let the relation be: X and Y differ in at most one bit, i.e., the Hamming distance between X and Y is at most one:

Σᵢ (xᵢ ⊕ yᵢ) ≤ 1
H(X) = H(Y) = 3 bits
Let Y = 101. Then X ∈ {101 (code 0), 100 (code 1), 111 (code 2), 001 (code 3)}.
Code = X ⊕ Y
H(X|Y) = 2 bits
We need 2 bits to transmit X and 3 bits for Y: 5 bits in total instead of 6 bits.
Here the encoder must know the outcome of Y before it can code X with 2 bits.
Decoding: X = Y ⊕ Code, where Code 0 = 000, 1 = 001, 2 = 010, 3 = 100.
Now assume that we do not know the outcome of Y at the X encoder (Y is still sent to the decoder using 3 bits). Can we still transmit X using 2 bits?
The answer, surprisingly, is YES. How?
Partition
Group all eight symbols into four sets of two members each:
{(000),(111)} → 0
{(001),(110)} → 1
{(010),(101)} → 2
{(100),(011)} → 3
Trick: partition so that the two members of each set are at Hamming distance 3.
The encoding of X is done simply by sending the index of the set that contains X.
Let X = (100); then the index for X is 3. Suppose the decoder has already received a correlated Y = (101). How do we recover X, knowing Y (from now on called the side information) at the decoder and the index (3) from X?
Since the index is 3, we know that the value of X is either (100) or (011).
Measure the Hamming distance between the two possible values of X and the side information Y:
(100) ⊕ (101) = (001) → Hamming distance 1
(011) ⊕ (101) = (110) → Hamming distance 2
Hence X = (100).
Source Coding (Y known at the encoder)
Y = 101, X = 100, Code = (100) ⊕ (101) = 001 = 1
Decoding: Y ⊕ Code = (101) ⊕ (001) = 100 = X
If the decoder holds a different side information Y, this scheme fails (✗ marks a decoding error; the true X is 100):
Y    Code  X' = Y ⊕ Code
000  001   001 ✗
001  001   000 ✗
010  001   011 ✗
011  001   010 ✗
100  001   101 ✗
101  001   100
110  001   111 ✗
111  001   110 ✗
Side Information → Decoding → Output (index = 3, coset {(100),(011)}):

Correlated side information – no error in decoding:
Y = 000: d(011,000) = 2, d(100,000) = 1 → X = 100
Y = 110: d(011,110) = 2, d(100,110) = 1 → X = 100
Y = 101: d(011,101) = 2, d(100,101) = 1 → X = 100
Y = 100: d(011,100) = 3, d(100,100) = 0 → X = 100

Uncorrelated side information – erroneous decoding:
Y = 111: d(011,111) = 1, d(100,111) = 2 → X = 011
Distributed Source Coding
X =100
Code = 3
How do we partition the input sample space? Must we always find some trick? If the input sample space is large (even just a few hundred symbols), can we still find one?
The trick is matrix multiplication: we need a matrix that partitions the input space. For the toy example above the matrix is

H = [ 1 0 1
      0 1 1 ]

Index = X·Hᵀ over the GF(2) field. H is the parity-check matrix in error-correction terminology.
Coset Partition
Now we look at the partitions again:
{(000),(111)} → 0: this is the repetition code (in error-correction terminology)
{(001),(110)} → 1
{(010),(101)} → 2
{(100),(011)} → 3
The other sets are the cosets of the repetition code induced by the elements of the sample space of X.
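As a sketch of the "trick", the partition can be generated mechanically: the syndrome s = H·xᵀ over GF(2) labels the coset of each 3-bit word. (The syndrome labels come out in a different order than the slide's 0–3 indices, but the four groups are the same.)

```python
# Coset partition via the parity-check matrix: the syndrome s = H x^T
# over GF(2) labels the coset of the (3,1) repetition code containing x.

from itertools import product

H = [(1, 0, 1),
     (0, 1, 1)]          # a parity-check matrix for the code {000, 111}

def syndrome(x):
    return tuple(sum(h * b for h, b in zip(row, x)) % 2 for row in H)

cosets = {}
for x in product((0, 1), repeat=3):
    cosets.setdefault(syndrome(x), []).append(x)

for s, members in sorted(cosets.items()):
    print(s, members)
# Four cosets of two words each; syndrome (0, 0) is the repetition code itself.
```

Each coset again contains two words at Hamming distance 3, so the 2-bit syndrome plays exactly the role of the index in the toy example.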
Channel Coding
In channel coding, controlled redundancy is added to the information bits to protect them from channel noise.
We can classify channel coding (error-control coding) into two categories: error detection and error correction.
In error detection, the introduced redundancy is just enough to detect errors; in error correction, we need to introduce more redundancy.
[Figure: codebooks with minimum distance dmin = 1, dmin = 2, and dmin = 3.]
A code with minimum distance dmin can correct up to t errors, where t ≤ ⌊(dmin − 1)/2⌋.
Parity Check
X    Parity
000  0
001  1
010  1
011  0
100  1
101  0
110  0
111  1
Minimum Hamming distance = 2.
How do we make the minimum Hamming distance 3? By inspection it is not clear (or at least not easy).
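A brute-force check confirms the claim for the single-parity-check code in the table (an illustrative sketch; for a linear code this minimum also equals the smallest nonzero codeword weight):

```python
# Minimum Hamming distance of the (4,3) single-parity-check code,
# computed by exhaustive comparison of all codeword pairs.

from itertools import product

# append an even-parity bit to every 3-bit message
code = [bits + (sum(bits) % 2,) for bits in product((0, 1), repeat=3)]

dmin = min(sum(a != b for a, b in zip(u, v))
           for u in code for v in code if u != v)
print(dmin)  # 2 -> single errors are detectable but not correctable
```

With dmin = 2 and t ≤ ⌊(dmin − 1)/2⌋ = 0, the code detects single errors but corrects none, matching the slide.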
Slepian-Wolf theorem: correlated sources that do not communicate with each other can be coded at the same total rate as when they are coded jointly; no performance loss occurs provided they are decoded jointly.
R_X ≥ H(X|Y)
R_Y ≥ H(Y|X)
R_X + R_Y ≥ H(X,Y)
• When correlated sources are coded independently but decoded jointly, the minimum data rate for each source is lower bounded by its conditional entropy given the other source.
• The total data rate must be at least H(X,Y), and the individual data rates must be at least H(X|Y) and H(Y|X) respectively.
D. Slepian and J. K. Wolf, "Noiseless coding of correlated information sources," IEEE Trans. Inf. Theory, vol. 19, pp. 471–480, July 1973.
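These bounds can be verified numerically for the 3-bit toy example used earlier, under the assumption (not stated explicitly in the slides) that each of the four "differ in at most one bit" patterns is equally likely given X:

```python
# Numerical check of the Slepian-Wolf bound H(X|Y) for the 3-bit toy
# sources: X uniform on 8 values, Y differing from X in at most one bit.

from math import log2

pxy = {}                                          # joint distribution p(x, y)
for x in range(8):
    for e in (0b000, 0b001, 0b010, 0b100):        # flip at most one bit
        key = (x, x ^ e)
        pxy[key] = pxy.get(key, 0.0) + (1 / 8) * (1 / 4)

H_xy = -sum(p * log2(p) for p in pxy.values())    # H(X,Y)

py = {}                                           # marginal p(y)
for (x, y), p in pxy.items():
    py[y] = py.get(y, 0.0) + p
H_y = -sum(p * log2(p) for p in py.values())      # H(Y)

H_x_given_y = H_xy - H_y
print(H_x_given_y)  # 2.0 bits: X can be sent with 2 bits once Y is decoded
```

The computation reproduces the slide's figures: H(X,Y) = 5 bits, H(Y) = 3 bits, so H(X|Y) = 2 bits, exactly the 2-bit coset index of the toy example.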
DISCUS (DIstributed Source Coding Using Syndromes)
The first constructive realization of the Slepian-Wolf bound using practical channel codes, in which single-parity-check codes were used with the binning scheme.
Wyner first proposed using a capacity-achieving binary linear channel code to solve the SW compression problem for a class of joint distributions.
DISCUS extended Wyner's idea to the distributed rate-distortion (lossy compression) problem using channel codes.
S. Pradhan and K. Ramchandran, "Distributed source coding using syndromes (DISCUS)," in IEEE Data Compression Conference, DCC-1999, Snowbird, UT, 1999.
[Figure: X is encoded at rate R_X ≥ H(X|Y); Y is sent at rate R_Y ≥ H(Y); the joint decoder outputs X̂, Y. The sources are statistically dependent.]
Distributed Source Coding
(Compression with Side Information)
Side Information Available at the Decoder
[Figure: the achievable rate region for lossless Slepian-Wolf coding in the (R_X, R_Y) plane, with the R_X axis marked at H(X|Y), H(X), and H(X,Y), and the R_Y axis at H(Y|X), H(Y), and H(X,Y).]
Separate coding with no errors requires R_X > H(X) and R_Y > H(Y); joint encoding and decoding achieves R_X + R_Y = H(X,Y).
Achievable rates with Slepian-Wolf coding (vanishing error probability for long sequences):
Corner point A – code X with Y as side information: R_X > H(X|Y), R_Y > H(Y).
Corner point B – code Y with X as side information: R_Y > H(Y|X), R_X > H(X).
Points in between (C) – time-sharing / source splitting / code partitioning.
How Compression Works?
[Figure: compression removes the redundancy from redundant (correlated) data, leaving compressed (decorrelated) data; decompression feeds the decorrelated data through a redundant-data generator to restore the correlated data.]
How Channel Coding Works?
Duality between source coding and channel coding:
Source coding: compresses the data; de-correlates the data; complex encoder; simple decoder.
Channel coding: expands the data; correlates the data; simple encoder; complex decoder.
[Figure: channel with additive noise.]
Channel Coding or Error Correction Coding
[Figure: a codeword of length n consists of k information bits followed by n − k parity bits.]
Data expansion = n/k times.
n
Channel
Decoding Info
rmat
ion
Bits
Par
ity B
itsInf
orm
atio
n B
its
Info
rmat
ion
Bits
(X
)k
x
Decompression
+ Correlation Model ( = Noise) Y=
Par
ity B
its
Y= Channel
Decodingx
Par
ity B
its
n - k
Parity Bits
Compressed Data
Channel
Coding
Y
Channel Codes for DSC
Turbo Coder for Slepian-Wolf Encoding
[Figure: the input X (L bits) feeds one systematic convolutional encoder directly and a second one through an interleaver of length L. Both encoders are systematic with rate n/(n+1); the systematic outputs (L bits each) are discarded, and only the parity streams XP1 and XP2 (L/n bits each) are retained, giving rate R_X.]
Courtesy of Anne Aaron and Bernd Girod
Turbo Decoder for Slepian-Wolf Decoding
[Figure: channel probability calculations use the parity streams XP1 and XP2, the side information Y, and the correlation model P(x|y). Two SISO decoders exchange extrinsic and a priori probabilities through an interleaver and deinterleaver of length L; a decision on the a posteriori probabilities yields the decoded X̂.]
Courtesy of Anne Aaron and Bernd Girod
Wyner's Scheme
Use a linear block code and send the syndrome. For an (n, k) block code there are 2^(n−k) syndromes, each corresponding to a set of 2^k words of length n. Each set is a coset of the code. Compression ratio n : (n − k).
[Figure: a lossless encoder (syndrome former) sends the syndrome bits of X at rate R ≥ H(X|Y); the joint decoder uses the side information Y to output X̂.]
A. D. Wyner, "Recent results in the Shannon theory," IEEE Trans. Inf. Theory, vol. IT-20, no. 1, January 1974.
A. D. Wyner, "On source coding with side information at the decoder," IEEE Trans. Inf. Theory, vol. 21, no. 3, pp. 294–300, May 1975.
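Wyner's syndrome scheme can be sketched with a small concrete code. The (7,4) Hamming code and all names below are illustrative choices, not the capacity-achieving codes the theory calls for: 7 source bits compress to a 3-bit syndrome, and the decoder corrects up to one bit of "correlation noise" between X and Y.

```python
# Wyner-style syndrome compression with a (7,4) Hamming code:
# send s = H x^T (3 bits for 7 source bits, ratio 7:3); the decoder
# shifts Y into the coset of s by locating the single differing bit.

import numpy as np

H = np.array([[1, 0, 1, 0, 1, 0, 1],   # parity-check matrix whose column j
              [0, 1, 1, 0, 0, 1, 1],   # is the binary expansion of j+1
              [0, 0, 0, 1, 1, 1, 1]])

def syndrome(v):
    return H @ v % 2

def decode(s, y):
    d = (syndrome(y) + s) % 2          # = H (x XOR y): syndrome of the noise
    if not d.any():
        return y.copy()                # X and Y agree on this block
    pos = int(''.join(map(str, d[::-1])), 2) - 1   # error position from d
    x = y.copy()
    x[pos] ^= 1
    return x

x = np.array([1, 0, 1, 1, 0, 0, 1])
y = x.copy(); y[4] ^= 1                # side info differs in one position
assert (decode(syndrome(x), y) == x).all()
```

The decoder never sees x itself, only its syndrome, which is exactly the binning idea: the syndrome names the coset, and Y singles out the right member.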
Linear Block Codes for DSC
[Figure: the syndrome former compresses the n-bit input x to the n − k syndrome bits s = H·xᵀ (compression ratio n : (n − k)). The decoder treats the side information Y = x + correlation noise as a corrupted codeword and recovers x by syndrome decoding (decompression).]
[Figure: an LDPC encoder (syndrome-former generator) maps X to its syndrome s = H·xᵀ, which is entropy coded and sent as the compressed data. The LDPC decoder combines the syndrome with the side information Y through the correlation model to output the decompressed data X̂.]
The Wyner-Ziv theorem
Wyner and Ziv extended the work of Slepian and Wolf by studying the lossy case in the same scenario, where the signals X and Y are statistically dependent. Y is transmitted at a rate equal to its entropy (Y is then called the side information), and what must be found is the minimum transmission rate for X that introduces no more than a certain distortion D. The result is the Wyner-Ziv rate-distortion function, the lower bound for R_X.
For MSE distortion and Gaussian statistics, the rate-distortion functions with and without side information at the encoder are the same.
A. D. Wyner and J. Ziv, "The rate-distortion function for source coding with side information at the decoder," IEEE Trans. Inf. Theory, vol. 22, no. 1, pp. 1–10, January 1976.
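For the Gaussian/MSE case just mentioned, the rate-distortion function takes the standard closed form R(D) = max(0, ½·log₂(σ²_X|Y / D)), where σ²_X|Y is the conditional variance of X given Y. A small sketch (the function name is illustrative):

```python
# Wyner-Ziv rate for jointly Gaussian X, Y under MSE distortion:
# R(D) = max(0, 0.5 * log2(sigma2_x_given_y / D)). For this case there
# is no rate loss relative to having Y at the encoder as well.

from math import log2

def wz_rate(sigma2_x_given_y, D):
    return max(0.0, 0.5 * log2(sigma2_x_given_y / D))

print(wz_rate(1.0, 0.25))  # 1.0 bit/sample
print(wz_rate(1.0, 1.0))   # 0.0: the side information alone meets the target
```

Each extra bit per sample quarters the achievable MSE, and the rate hits zero once the target distortion exceeds the residual uncertainty σ²_X|Y.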
A codec that encodes signals X and Y separately while decoding them jointly, but does not aim at recovering them perfectly (it accepts some distortion D in the reconstruction), is called a Wyner-Ziv codec.
Wyner-Ziv Codec
[Figure: two configurations of lossy coding of X. In the first, the side information Y is available at both encoder and decoder; in the second (Wyner-Ziv coding), Y is available only at the decoder.]
Wyner-Ziv Coding: Lossy Compression with Side Information
For MSE distortion and Gaussian statistics, the rate-distortion functions of the two systems are the same. In general, the rate loss R*(D) − R_X|Y(D) is bounded.
[Figure: rate-distortion curves R_X|Y(D) and R*(D).]
The structure of Wyner-Ziv encoding and decoding
Encoding consists of quantization followed by a binning operation that encodes the quantized value U into a bin (coset) index. Decoding consists of "de-binning" followed by estimation.
Wyner-Ziv Coding (WZC) - A joint source-channel coding problem
Pixel-Domain Wyner-Ziv Residual Video Codec
[Figure: the WZ frames X (as residuals Xer against a reference from the frame memory) are scalar quantized and LDPC encoded; the encoder buffers the bits and sends them on request (Slepian-Wolf codec). Key frames I are coded and decoded with a conventional intraframe codec (I'). The decoder forms the side information Y by interpolation/extrapolation, runs the LDPC decoder and reconstruction (Q⁻¹), and adds the reference back to output X'.]
Distributed Video Coding
Distributed coding is a new paradigm for video compression, based on Slepian and Wolf's (lossless coding) and Wyner and Ziv's (lossy coding) information-theoretic results.
It enables low-complexity video encoding, where the bulk of the computation is shifted to the decoder.
A second architectural goal is far greater robustness to packet and frame drops.
It is useful for wireless video applications through transcoding architectures.
PRISM
PRISM (Power-efficient, Robust, hIgh-compression, Syndrome-based Multimedia coding) is a practical video coding framework built on distributed source coding principles. It offers flexible encoding/decoding complexity, high compression efficiency, superior robustness to packet/frame drops, and a light yet rich encoding syntax.
R. Puri, A. Majumdar, and K. Ramchandran, "PRISM: A video coding paradigm with motion estimation at the decoder," IEEE Trans. Image Processing, vol. 16, no. 10, pp. 2436–2448, October 2007.
DIStributed COding for Video sERvices (DISCOVER)
DISCOVER is a new video coding scheme with strong potential for new applications, targeting advances in coding efficiency, error resilience, and scalability.
At the encoder side the video is split into two parts: the first set of frames, called key frames, are encoded with a conventional H.264/AVC encoder; the remaining frames, known as Wyner-Ziv frames, are coded using distributed coding principles.
X. Artigas, J. Ascenso, M. Dalai, D. Kubasov, and M. Ouaret, "The DISCOVER codec: Architecture, techniques and evaluation," Picture Coding Symposium, 2007.
www.discoverdvc.org
A Typical Distributed Video Codec
[Figure: each WZ frame W is passed through the Wyner-Ziv residual encoder, with the previous key frame as the encoder reference, and only WZ parity bits are transmitted. Key frames I are coded conventionally. At the decoder, the side information Y comes from motion-compensated interpolation of the decoded frames; the Wyner-Ziv residual decoder outputs the decoded WZ frames W'.]
Wyner-Ziv DCT Video Codec
[Figure: intraframe encoder / interframe decoder. Each WZ frame W is transformed with the DCT; for each transform band k, the coefficients Xk are quantized with a 2^Mk-level quantizer, the quantized symbols qk are split into bit-planes (bit-plane 1 ... bit-plane Mk), and each bit-plane is turbo encoded; the parity bits are buffered and sent on request. Key frames K are coded with a conventional intraframe codec (K'). The decoder forms the side information Y by interpolation/extrapolation, computes the DCT bands Yk, turbo decodes the bit-planes (qk'), reconstructs Xk', and applies the IDCT to obtain the decoded WZ frames W'.]
Sample Frame (Foreman sequence)
[Figure: side information vs. the frame after Wyner-Ziv coding, 16-level quantization (~1 bpp).]
Carphone sequence:
H.263+ intraframe coding: 410 kbps
Wyner-Ziv codec: 384 kbps

Salesman sequence at 10 fps:
DCT-based intracoding: 247 kbps (PSNR_Y = 33.0 dB)
Wyner-Ziv DCT codec: 256 kbps (PSNR_Y = 39.1 dB, GOP = 16)
H.263+ I-P-P-P: 249 kbps (PSNR_Y = 43.4 dB, GOP = 16)
Wyner-Ziv DCT codec: 256 kbps (PSNR_Y = 39.1 dB, GOP = 16)

Hall Monitor sequence at 10 fps:
DCT-based intracoding: 231 kbps (PSNR_Y = 33.3 dB)
Wyner-Ziv DCT codec: 227 kbps (PSNR_Y = 39.1 dB, GOP = 16)
H.263+ I-P-P-P: 212 kbps (PSNR_Y = 43.0 dB, GOP = 16)
Wyner-Ziv DCT codec: 227 kbps (PSNR_Y = 39.1 dB, GOP = 16)
Facsimile Image Compression with DSC
[Figure: CCITT 8 test image; reconstructed image with 30 errors; Fax4 reconstructed image with 8 errors.]
Applications
• Very low complexity encoders
• Compression for networks of cameras
• Error-resilient transmission of signal waveforms
• Digitally enhanced analog transmission
• Unequal error protection without layered coding
• Image authentication
• Random access
• Compression of encrypted signals
Cosets
Let G = {000, 001, ..., 111} and let H = {000, 111} be a subgroup of G. Cosets are formed by XOR-ing an element of G with each element of H:
001 ⊕ 000 = 001, 001 ⊕ 111 = 110; hence {001, 110} is one coset.
010 ⊕ 000 = 010, 010 ⊕ 111 = 101; so {010, 101} is another coset, and so on.
Hamming Distance
The Hamming distance is the number of bit positions in which two binary sequences differ. Let X and Y be two binary sequences; the Hamming distance between X and Y is defined as

Hamming distance = Σᵢ (xᵢ ⊕ yᵢ)

Example: let X = {0 0 1 1 1 0 1 0 1 0} and Y = {0 1 0 1 1 1 0 0 1 1}. Then the Hamming distance is Sum({0 1 1 0 0 1 1 0 0 1}) = 5.
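The definition translates directly into code (a one-function sketch with an illustrative name):

```python
# Hamming distance: count the positions where two equal-length
# binary sequences differ.

def hamming(x, y):
    assert len(x) == len(y)
    return sum(a != b for a, b in zip(x, y))

X = [0, 0, 1, 1, 1, 0, 1, 0, 1, 0]
Y = [0, 1, 0, 1, 1, 1, 0, 0, 1, 1]
print(hamming(X, Y))  # 5, matching the worked example above
```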