DoCoMo USA Labs All Rights Reserved Sandeep Kanumuri, NML
Fast super-resolution of video sequences using sparse directional transforms*
Sandeep Kanumuri, Onur G. Guleryuz
DoCoMo USA Labs
*Presented at 2008 SIAM Conference on Imaging Science on 07/09/2008
(Animated slides, please use slide show mode)
Outline
• System Model
• Motivation
• Prior Work
• Our Solution: SWAT (Sparse Warped transform and Adaptive Thresholding)
– Algorithm Flowchart
– Over-complete Transform
– Warped (Directional) Transform
– Over-complete Inverse Transform
– Adaptive Thresholding
• Performance Comparison
• Conclusion
System Model
• Design goals
1. High Quality Rendering
2. Fast Algorithm (Lower Complexity) – Single Frame, Simple Transform
Motivation
Broadcast Video – TV application
Low-resolution video signal for mobile phones
• Path A (steps A.1 and A.2), through a docking station
– Low-resolution video is sent to the docking station
– Docking station uses the SWAT algorithm to convert low-resolution video to high-resolution video
– High-resolution video is sent to a TV or a large display
• Path B, directly from the phone
– Low-resolution video is converted to high-resolution video by the cell phone itself using the SWAT algorithm, and the high-resolution video is transmitted to the TV using local wireless technologies
• Only one path (Path A or Path B) is used
BENEFIT: Broadcast programming aimed at mobile phones can also be used in stationary environments
Broadcast Video – VGA phones
Low-resolution video signal for mobile phones
BENEFIT: SWAT capability allows this cell phone to convert low-resolution
video to high-resolution video
VGA phone with SWAT capability
VGA phone without SWAT capability
More Applications…
• Video Quality Enhancement Service
– SWAT algorithm can be deployed as a service to enhance the resolution and quality of videos
• Video Conferencing
– A SWAT-equipped terminal can show video at a higher zoom level and with improved quality
• High-quality Image Zooming
– SWAT algorithm enables the mobile phone to convert a low-quality, low-resolution image into a high-quality, high-resolution image
Prior Work
• Linear solutions
– Filter design
• Non-linear solutions
– Regularization (Projection onto the model space)
  • Signal Sparsity
    – Iterated Denoising / Shrinkage
    – Lp-Norm Minimization
  • Optical Flow
  • Adaptive filtering
  • Example-based approaches
– Data Consistency (Projection onto the input space)
SWAT Algorithm Flowchart
1. Input Image/Video (low-resolution, low quality)
2. Linear Interpolation Filter (result: high-resolution, low quality)
3. Regularization: Directional Over-complete Transform, then Adaptive Thresholding, then Directional Over-complete Inverse Transform
4. Enforce Data Consistency
5. More iterations? If yes, return to step 3; if no, continue to step 6
6. Output Image/Video (high-resolution, high quality)
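The control flow of the flowchart can be sketched as a short loop. This is only an illustration of the ordering of the stages: the function name `run_pipeline`, the fixed iteration count, and the toy stand-in stages (`repeat_samples`, `identity`, `keep_estimate`) are assumptions made for the example, not the authors' implementation.

```python
from typing import Callable, List

Signal = List[float]


def run_pipeline(low_res: Signal,
                 interpolate: Callable[[Signal], Signal],
                 regularize: Callable[[Signal], Signal],
                 enforce_consistency: Callable[[Signal, Signal], Signal],
                 iterations: int = 2) -> Signal:
    """Interpolate once, then alternate regularization and data consistency."""
    estimate = interpolate(low_res)            # high-resolution, low quality
    for _ in range(iterations):                # the "More iterations?" decision
        estimate = regularize(estimate)        # transform -> threshold -> inverse
        estimate = enforce_consistency(estimate, low_res)
    return estimate                            # high-resolution, high quality


# Toy stand-ins (hypothetical) so the skeleton can be run end to end.
def repeat_samples(x: Signal) -> Signal:
    return [v for v in x for _ in range(2)]


def identity(x: Signal) -> Signal:
    return list(x)


def keep_estimate(estimate: Signal, low_res: Signal) -> Signal:
    return list(estimate)


result = run_pipeline([1.0, 2.0], repeat_samples, identity, keep_estimate)
```

A fixed iteration count is used here because the performance slides report SWAT with 2 iterations; the flowchart itself does not say how the stopping decision is made.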
Linear Interpolation Filter
• A linear interpolation filter is used to form an initial estimate of the high-resolution image/video
– However, the quality of interpolation is relatively low
• Popular filter choices
– Low-pass filter of Daubechies 7/9 Inverse Wavelet
– H.264 Interpolation Filter
• A customized linear interpolation filter can be used if any of the following is known:
– Downsampling filter (if the input was obtained by downsampling a higher-resolution original)
– Filtering caused by the camera acquisition process
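Core idea – Exploit Signal Sparsity
[Figure: a signal $s(n)$, $n = 0, \dots, N-1$ (Signal Domain) and its coefficients $S(k)$, $k = 0, \dots, N-1$ (Sparse Decomposition Domain). The observed coefficients are $C(k) = S(k) + W(k)$, where $W(k)$ is "noise". Thresholding at $\pm T$ gives the denoised coefficients $\hat{C}(k)$.]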
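The sparsity idea can be illustrated on a 4-sample signal. The sketch below assumes hard thresholding (coefficients with magnitude at most T are set to zero) and uses an orthonormal 4-point Hadamard transform as the sparse decomposition; the slide does not specify either choice, and the sample values are made up.

```python
# Rows of the 4-point Hadamard matrix; scaling by 0.5 makes it orthonormal,
# and because it is symmetric the same function serves as its own inverse.
H4 = [[1, 1, 1, 1],
      [1, -1, 1, -1],
      [1, 1, -1, -1],
      [1, -1, -1, 1]]


def hadamard4(x):
    """Forward (and inverse) orthonormal 4-point Hadamard transform."""
    return [0.5 * sum(h * v for h, v in zip(row, x)) for row in H4]


def hard_threshold(coeffs, T):
    """Keep coefficients whose magnitude exceeds T, zero the rest (assumed rule)."""
    return [c if abs(c) > T else 0.0 for c in coeffs]


def denoise(x, T):
    """Transform, threshold, and transform back."""
    return hadamard4(hard_threshold(hadamard4(x), T))


# A constant signal (one large coefficient) plus small "noise".
noisy = [5.2, 4.9, 5.1, 4.8]
denoised = denoise(noisy, T=1.0)
```

Here the constant part of the signal concentrates in a single large coefficient, while the noise spreads over small coefficients that fall below T and are removed.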
Over-complete Transform
• Transform size: 4x4 (used for description), 3x3
• Transform used: DCT, Hadamard
• For an Over-complete Transform
– All possible 4x4 blocks in the image/frame are selected using a non-directional mask
– Each 4x4 block undergoes a transform to produce a set of transformed coefficients
– Each pixel is involved in multiple transforms (16, on average)
– Total number of transformed coefficients ~ 16 x number of pixels
• Directional Over-complete Transform
– Here, each of the 4x4 blocks is formed by applying a directional mask followed by a warping process (see next slide)
Blocks of an Over-complete Transform (H = Height of image; W = Width of image):
Block (1,1)  Block (1,2)  …  Block (1,W-3)
Block (2,1)  Block (2,2)  …  Block (2,W-3)
…
Block (H-3,1)  Block (H-3,2)  …  Block (H-3,W-3)
[Figure: non-directional mask used to select a 4x4 block]
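The block counts quoted on this slide can be checked with a small sketch. The helper names `sliding_blocks` and `coverage` are hypothetical, and the code only enumerates the non-directional (axis-aligned) 4x4 blocks; it does not apply a transform.

```python
def sliding_blocks(image, size=4):
    """Yield (row, col, block) for every size x size block at stride 1."""
    height, width = len(image), len(image[0])
    for r in range(height - size + 1):
        for c in range(width - size + 1):
            yield r, c, [row[c:c + size] for row in image[r:r + size]]


def coverage(height, width, size=4):
    """Count how many blocks contain each pixel."""
    counts = [[0] * width for _ in range(height)]
    for r in range(height - size + 1):
        for c in range(width - size + 1):
            for i in range(size):
                for j in range(size):
                    counts[r + i][c + j] += 1
    return counts


image = [[0] * 8 for _ in range(8)]           # an 8x8 dummy frame
num_blocks = len(list(sliding_blocks(image)))  # (H-3) * (W-3) blocks
counts = coverage(8, 8)
```

For an 8x8 frame this gives (8-3) x (8-3) = 25 blocks. Interior pixels are covered by 16 blocks and border pixels by fewer, so the figure of about 16 coefficients per pixel is approached as the frame grows.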
Warped (Directional) Transform
• Signal sparsity in the DCT domain holds for horizontal and vertical edges, but is violated on directional edges
• Transform domain: 4x4 DCT
• Let us consider 4 blocks along the edge
– First, using Non-directional masks
– Now, using Directional masks
– Directional masks lead to a sparse representation
• Transform support is warped
• For the Directional Over-complete Transform, Directional masks replace the Non-directional mask
[Figure labels: Non-directional mask; Directional masks]
How to choose a mask?
• Decision made for a block (4x4) of pixels
– At each pixel, a vote is cast for the mask that minimizes the signal variance along the mask direction.
– The mask with the most votes is chosen
• Reduces inconsistency in directions
[Figure: example masks]
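The voting rule can be sketched as follows. The set of candidate directions, the 3-sample window used to measure variance, the clamping at image borders, and the tie-breaking (first listed direction wins) are all assumptions made for this illustration; the slide does not define the actual masks.

```python
from collections import Counter

# Hypothetical candidate directions as (row step, column step).
DIRECTIONS = {
    "horizontal": (0, 1),
    "vertical": (1, 0),
    "diagonal": (1, 1),
    "anti-diagonal": (1, -1),
}


def variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def variance_along(image, r, c, step, reach=1):
    """Variance of the samples around (r, c) taken along one direction."""
    height, width = len(image), len(image[0])
    dr, dc = step
    samples = []
    for k in range(-reach, reach + 1):
        rr = min(max(r + k * dr, 0), height - 1)   # clamp to the image
        cc = min(max(c + k * dc, 0), width - 1)
        samples.append(image[rr][cc])
    return variance(samples)


def choose_mask(image, top, left, size=4):
    """Each pixel votes for its lowest-variance direction; majority wins."""
    votes = Counter()
    for r in range(top, top + size):
        for c in range(left, left + size):
            best = min(DIRECTIONS,
                       key=lambda name: variance_along(image, r, c, DIRECTIONS[name]))
            votes[best] += 1
    return votes.most_common(1)[0][0]


# An 8x8 ramp that is constant along the main diagonal.
ramp = [[c - r for c in range(8)] for r in range(8)]
chosen = choose_mask(ramp, top=2, left=2)
```

Taking one decision per block, instead of one per pixel, is what the slide describes as reducing inconsistency in directions.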
Over-complete Inverse Transform
• For an Over-complete Inverse Transform
– Each set of transformed coefficients is converted back to the pixel domain
– Each pixel has multiple estimates from different blocks, and a weighted combination is used to arrive at its final estimate
[Figure: block estimates combined with weights W1, W2, W3, and so on with all the blocks]
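The weighted combination can be sketched as a normalized weighted average per pixel. How the weights W1, W2, W3, … are chosen is not stated on the slide, so the sketch takes them as inputs; the function name `combine_estimates` and the sample values are hypothetical.

```python
def combine_estimates(shape, block_estimates):
    """Merge overlapping block estimates into one image.

    block_estimates is an iterable of (top, left, block, weight), where
    block is a list of rows already converted back to the pixel domain.
    """
    height, width = shape
    acc = [[0.0] * width for _ in range(height)]
    weight_sum = [[0.0] * width for _ in range(height)]
    for top, left, block, weight in block_estimates:
        for i, row in enumerate(block):
            for j, value in enumerate(row):
                acc[top + i][left + j] += weight * value
                weight_sum[top + i][left + j] += weight
    # Normalize; pixels that no block covers are left at 0.0.
    return [[acc[r][c] / weight_sum[r][c] if weight_sum[r][c] > 0 else 0.0
             for c in range(width)]
            for r in range(height)]


# Two overlapping 1x2 blocks on a 1x3 image, with weights 1 and 3.
merged = combine_estimates((1, 3), [
    (0, 0, [[2.0, 4.0]], 1.0),
    (0, 1, [[8.0, 6.0]], 3.0),
])
```

The middle pixel receives two estimates, 4 with weight 1 and 8 with weight 3, and ends up at (4 + 24) / 4 = 7.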
Adaptive Thresholding
• Transform coefficients are thresholded for denoising
• A master threshold ($T$) is used for an initial pass
• A local threshold ($\hat{T}$) is calculated and finally used: $\hat{T} = f(E_{lost}) \, T$
– $E_{lost}$: Energy lost due to thresholding when $T$ is used as the threshold
• Parameters f1 to fn and E1 to En are tuned to achieve a local optimum
[Figure: plot of $f(\cdot)$ versus $E_{lost}$ with origin (0,0); horizontal-axis marks E1, E2, …, En; vertical-axis marks 1, f1, f2, …, fn]
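A minimal sketch of the local threshold computation follows. It assumes that the energy lost is the sum of squares of the coefficients removed by the master threshold, and that f is piecewise linear through the tuned points (E_i, f_i) and held constant outside them. The exact shape of f is not recoverable from the slide, and the breakpoint values below are placeholders, not tuned parameters.

```python
def energy_lost(coeffs, T):
    """Energy of the coefficients that thresholding at T would remove."""
    return sum(c * c for c in coeffs if abs(c) <= T)


def scale_factor(e_lost, breakpoints):
    """Piecewise-linear f through sorted (E_i, f_i) points, clamped at the ends."""
    if e_lost <= breakpoints[0][0]:
        return breakpoints[0][1]
    for (e0, f0), (e1, f1) in zip(breakpoints, breakpoints[1:]):
        if e_lost <= e1:
            return f0 + (f1 - f0) * (e_lost - e0) / (e1 - e0)
    return breakpoints[-1][1]


def local_threshold(coeffs, T, breakpoints):
    """Local threshold: f(E_lost) times the master threshold T."""
    return scale_factor(energy_lost(coeffs, T), breakpoints) * T


# Placeholder breakpoints: f falls from 1.0 at E=0 to 0.5 at E=50.
breakpoints = [(0.0, 1.0), (50.0, 0.5)]
coeffs = [10.0, 3.0, -4.0]
t_hat = local_threshold(coeffs, T=5.0, breakpoints=breakpoints)
```

With these placeholder values the master threshold removes the coefficients 3 and -4, so the lost energy is 25, f evaluates to 0.75, and the local threshold becomes 3.75.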
Enforcing Data Consistency
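• Role of data consistency module
– Ensure that the high-resolution estimate, when downsampled, can produce the low-resolution input.
[Figure: Data Consistency module. The High-resolution Input passes through the Downsampling Filter and is subtracted from the Low-resolution Input; the difference passes through the Linear Interpolation Filter and is added to the High-resolution Input to give the High-resolution Output.]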
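The data consistency step can be sketched in one dimension for a factor of 2. The filters here are assumptions chosen for simplicity: downsampling averages each pair of samples, and interpolation repeats each sample twice. The actual downsampling and interpolation filters are not reproduced, and the sample values are made up.

```python
def downsample(x):
    """Average each pair of samples (assumed filter; even length expected)."""
    return [(x[i] + x[i + 1]) / 2.0 for i in range(0, len(x) - 1, 2)]


def upsample(y):
    """Repeat each sample twice (a stand-in for the interpolation filter)."""
    return [v for v in y for _ in range(2)]


def enforce_consistency(high_res, low_res):
    """Add back the interpolated mismatch between the input and the estimate."""
    residual = [y - d for y, d in zip(low_res, downsample(high_res))]
    correction = upsample(residual)
    return [h + c for h, c in zip(high_res, correction)]


estimate = [1.0, 3.0, 6.0, 8.0]    # current high-resolution estimate
observed = [3.0, 6.0]              # low-resolution input
corrected = enforce_consistency(estimate, observed)
```

With this particular filter pair, downsampling the corrected estimate reproduces the low-resolution input exactly. With other filter pairs a single correction may only reduce the mismatch.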
Performance Comparison
• Super-resolution of QCIF to CIF sequences
– Low pass filter from Daubechies 7/9 wavelet filter bank
– Compression is done using H.264/AVC codec (JM12.0)
• SWAT run with 2 iterations
• Compared with
– Bilinear interpolation
– H.264 interpolation
– Simple Inverse
– Iterated Denoising / Shrinkage (ID)
  • 2 iterations (similar complexity compared to SWAT)
  • 10 iterations
PSNR comparison (uncompressed)
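For reference, PSNR between an original frame and a reconstructed frame can be computed as below. The sketch assumes 8-bit samples (peak value 255) and frames given as lists of rows; it is a generic definition of the metric, not the evaluation code behind these slides.

```python
import math


def psnr(reference, test, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized frames."""
    squared_error = 0.0
    count = 0
    for ref_row, test_row in zip(reference, test):
        for a, b in zip(ref_row, test_row):
            squared_error += (a - b) ** 2
            count += 1
    mse = squared_error / count
    if mse == 0:
        return math.inf            # identical frames
    return 10.0 * math.log10(peak * peak / mse)


original = [[10, 20], [30, 40]]
off_by_one = [[11, 21], [31, 41]]
value = psnr(original, off_by_one)   # MSE of 1 gives 20*log10(255) dB
```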
PSNR comparison (uncompressed)
PSNR comparison (uncompressed)
Visual Comparison (uncompressed)
[Figure panels: H.264, ID (2 iterations), SWAT]
Visual Comparison (uncompressed)
[Figure panels: H.264, ID (2 iterations), SWAT]
PSNR comparison (compression at QP=20)
PSNR comparison (compression at QP=25)
Visual Comparison (compression at QP=25)
[Figure panels: H.264, SWAT]
Visual Comparison (compression at QP=25)
[Figure panels: H.264, SWAT]
Conclusion
• SWAT algorithm renders high quality output and yet remains fast
– Quality comparable to ID (10 iterations)
– Complexity comparable to ID (2 iterations)
• Enabling Features
– Over-complete transform representation
– Simple basic transform (Hadamard, Integer DCT)
– Sparse warped transform
– Adaptive thresholding
– Weighted inverse transform
• Reference
– S. Kanumuri, O. G. Guleryuz and M. R. Civanlar, "Fast super-resolution reconstructions of mobile video using warped transforms and adaptive thresholding", SPIE Applications of Digital Image Processing XXX, August 2007
• Flicker Reduction Application
– To appear in SPIE 2008 (Applications of Digital Image Processing XXXI)
• E-mail
– Sandeep Kanumuri ([email protected])
– Onur G. Guleryuz ([email protected])