
Summer Internship at STMicroelectronics,

Greater Noida, India

Thibault Lefeuvre, Ecole Polytechnique, France

August 2015

Techniques of frame compression in ultra low power encoder

Introduction

At first, my work consisted of reading articles and documents on H.264 and frame compression algorithms ([Ric10], [Zhe06], [Hoi06], [Qin10]), since my knowledge of the subject was very basic. I then studied the different metrics currently used to compare and assess the similarities between co-located macroblocks ([Zho02], [Zho04], [Chu09]), which I thought would be at the heart of the project. After discussion, I focused on the Intra-only profile along with PSkip mode decision and on early-skip prediction, as put forth in [Ref]. The idea was to build a mathematical foundation for this solution, which had proved efficient but lacked a theoretical basis. Following [Ref], I tried to design an algorithm that assesses the similarities between co-located macroblocks without explicitly referring to the previous frame, thus saving significant DDR costs.

Algorithm

The algorithm can be described as follows:

• Frame N is transformed, using an edge detection kernel, into a black and white image where white pixels are edges or corners and black pixels form the background. The resulting matrix is sparse.

• Macroblocks (MBs) of frame N are stored as follows: if the current transformed MB contains white pixels (edges or corners), it is stored using a sparse matrix compression algorithm; if not, the average luma of the original MB is stored.

• The current MB is compared to the co-located MB of frame N − 1: (i) if both contain white pixels, a SAD (sum of absolute differences) between the black and white macroblocks is computed; if it is below a parameter threshold1, the current MB is skipped, otherwise it is encoded using Intra prediction; (ii) if both are identified as “background” (only black pixels), the absolute difference between their average lumas is computed; if it is below a parameter threshold2, the current MB is skipped, otherwise it is encoded using Intra prediction; (iii) finally, if one MB contains white pixels but the co-located MB does not, the current MB is encoded using Intra prediction.

• Afterwards, the algorithm jumps to frame N + 1.
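The comparison step above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the MATLAB implementation used for the results: the names `summarize`, `decide`, `sad` and `has_edges` are hypothetical, macroblocks are plain nested lists, and the edge map is assumed to be already computed (1 for white pixels, 0 for black).

```python
from typing import List, Tuple, Union

Block = List[List[int]]                      # one macroblock as rows of pixel values
Summary = Tuple[str, Union[Block, float]]    # what is kept for a macroblock


def sad(a: Block, b: Block) -> int:
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def has_edges(edge_block: Block) -> bool:
    """True if the binary edge map contains at least one white pixel."""
    return any(any(row) for row in edge_block)


def summarize(edge_block: Block, luma_block: Block) -> Summary:
    """Keep the edge map if the MB has edges, otherwise only its average luma."""
    if has_edges(edge_block):
        return ("edges", edge_block)
    count = sum(len(row) for row in luma_block)
    return ("background", sum(map(sum, luma_block)) / count)


def decide(current: Summary, previous: Summary,
           threshold1: float, threshold2: float) -> str:
    """Return "skip" or "intra" for the current MB, given the co-located summary."""
    kind_c, data_c = current
    kind_p, data_p = previous
    if kind_c != kind_p:                     # one has edges, the other does not
        return "intra"
    if kind_c == "edges":                    # both contain white pixels: compare edge maps
        return "skip" if sad(data_c, data_p) < threshold1 else "intra"
    # both are background: compare average lumas
    return "skip" if abs(data_c - data_p) < threshold2 else "intra"
```

Only the summaries of frame N − 1 are needed by `decide`, which is the point of the method: the previous frame itself is never read back.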

For greyscale images, instead of storing N × M-pixel macroblocks, i.e. N × M bytes (assuming that one pixel is encoded on one byte), the algorithm stores either a single floating-point number or a sparse matrix, which on average compresses the macroblock by between 90% and 95%. Overall, the data required to assess the similarity between the macroblocks of two consecutive frames is much reduced.
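To make the two storage forms concrete, here is a minimal Python sketch. The report does not specify which sparse matrix compression algorithm is used, so the coordinate-list packing below (one byte per white pixel of a 16 × 16 block) is an assumption chosen for illustration, as are the function names; the average luma is stored as a 4-byte float.

```python
import struct

MB_SIZE = 16  # assumed macroblock side, so row and column each fit in 4 bits


def encode_edges(edge_block):
    """Pack each white pixel into one byte: row in the high nibble, column in the low nibble."""
    return bytes((r << 4) | c
                 for r, row in enumerate(edge_block)
                 for c, value in enumerate(row) if value)


def decode_edges(payload):
    """Rebuild the 16x16 binary edge map from the packed coordinates."""
    block = [[0] * MB_SIZE for _ in range(MB_SIZE)]
    for byte in payload:
        block[byte >> 4][byte & 0x0F] = 1
    return block


def encode_background(luma_block):
    """Store a background macroblock as its average luma in a 4-byte float."""
    mean = sum(map(sum, luma_block)) / (MB_SIZE * MB_SIZE)
    return struct.pack("<f", mean)
```

With this particular packing the cost of an “edges” macroblock equals its number of white pixels, so the saving depends directly on how sparse the edge map is; other sparse formats would give different figures.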


Figure 1: Algorithm flowchart

Results

The algorithm was implemented in MATLAB. The intra-prediction mode was not taken into account: the sequences were analyzed and the macroblocks which could be skipped were detected. The output frames are therefore made up of original macroblocks (non-skipped macroblocks) and macroblocks from the previous frame (skipped macroblocks).

To illustrate the algorithm, we show the process of reconstruction of two frames taken from two different sequences: the first sequence shows a hand moving over a plane surface, and the second shows the face of a man moving in the foreground. The pictures at the top are the initial pictures of the sequence (two consecutive frames). In the middle, the images have been transformed using a Canny edge detector: only the macroblocks identified as “background” remain in grayscale, whereas the macroblocks with edges or corners are turned into black and white. The pictures at the bottom show the results after computation. The parameters of the algorithm have a critical influence on the output, especially the thresholds used in the Canny edge detector and the thresholds applied after computing the SAD. A good choice of parameters allows both efficient compression (i.e. many macroblocks are skipped) and robust visual quality (distortion is minimal for the human visual system). A bad set of parameters, on the other hand, leads either to effective compression with poor video quality or to deficient compression with acceptable video quality.
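The way output frames are assembled in these experiments (skipped macroblocks copied from the previous frame, all others kept from the current frame) can be sketched as follows. This is an illustrative Python version, not the MATLAB code used for the results; the function name `reconstruct` and the representation of frames as nested lists and of skipped macroblocks as a set of (row, column) indices are assumptions.

```python
def reconstruct(current, previous, skipped, mb_size=16):
    """Return the output frame: skipped MBs come from `previous`, the rest from `current`.

    `skipped` holds (mb_row, mb_col) indices of the macroblocks marked as skipped.
    """
    height, width = len(current), len(current[0])
    output = [row[:] for row in current]          # start from a copy of the current frame
    for mb_row, mb_col in skipped:
        y_end = min((mb_row + 1) * mb_size, height)
        x_end = min((mb_col + 1) * mb_size, width)
        for y in range(mb_row * mb_size, y_end):
            for x in range(mb_col * mb_size, x_end):
                output[y][x] = previous[y][x]     # reuse the co-located pixels
    return output
```

Because no intra prediction is applied, any visible difference between the output and the original frame comes only from the skip decisions.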

We then display a sequence of six frames showing two cars moving in the foreground. The original frames are (1)-(6), and frames (a)-(f) are the result of applying the algorithm. The frames are made of 32 × 32 = 1024 macroblocks (blocks of 16 × 16 pixels where one pixel is coded on one byte). The table below shows the compression efficiency of the algorithm on the data required to assess similarities between co-located macroblocks. We have made a distinction between macroblocks compressed as “background” and macroblocks compressed as “edges”, since the storage process is different. For a MB stored as “background”, 4 bytes are required instead of 256 (16 × 16). For a MB stored as “edges”, the use of a sparse matrix compression algorithm can empirically save up to 95-99% of memory, but we will consider 90% in order to give a lower-bound estimate, so that a MB compressed in this way is considered to be stored on 26 bytes. The percentage of compression is calculated as one minus the ratio of the total bytes required to the total bytes encoding the frame (1024 × 256 = 262144 bytes).

Frame                                      1        2        3        4        5        6
Macroblocks identified as “edges”          179      185      208      220      220      218
Macroblocks identified as “background”     845      839      816      804      804      806
Total bytes required                       8034     8166     8672     8936     8936     8892
Percentage of compression                  96.93%   96.88%   96.69%   96.59%   96.59%   96.61%
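The arithmetic behind the table can be checked with a short Python sketch. The byte costs (26 per “edges” macroblock, 4 per “background” macroblock, 256 per raw macroblock) are the ones stated in the text; the function name `frame_summary` is hypothetical.

```python
MB_BYTES = 16 * 16        # raw macroblock: 256 bytes at one byte per pixel
EDGE_BYTES = 26           # lower-bound estimate: 90% saving on 256 bytes, rounded up
BACKGROUND_BYTES = 4      # average luma


def frame_summary(n_edges, n_background):
    """Return (total bytes required, percentage of compression) for one frame."""
    total = n_edges * EDGE_BYTES + n_background * BACKGROUND_BYTES
    raw = (n_edges + n_background) * MB_BYTES
    return total, 100.0 * (1.0 - total / raw)


# Macroblock counts for frames 1 to 6, taken from the table above.
counts = [(179, 845), (185, 839), (208, 816), (220, 804), (220, 804), (218, 806)]
for frame, (edges, background) in enumerate(counts, start=1):
    total, percent = frame_summary(edges, background)
    print(f"frame {frame}: {total} bytes, {percent:.2f}% compression")
```

The totals match the “Total bytes required” row exactly, and the percentages agree with the table to within 0.01 percentage point.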

Artifacts and improvement

Tests still need to be carried out on a larger database in order to assess the validity of the algorithm. Overall, simple shapes (faces, hands, cars, ...) prove to be better processed by the algorithm than detailed areas (landscapes). Artifacts appear in regions where the edge detection algorithm fails to bring out the structure of the image, because the region is too noisy, because the parameters were not well chosen, or because of a lack of contrast. A possible solution is to identify another kernel that processes these specific areas efficiently.

References

[Chu09] Chun-Ling Yang, Rong-Kun Leung, Lai-Man Po, Zhi-Yi Mai. An SSIM-Optimal H.264/AVC Inter Frame Encoder. IEEE, 2009.

[Hoi06] Hoi-Ming Wong, Oscar C. Au, Andy Chang, Shu-Kei Yip, Chi-Wang Ho. Fast Mode Decision and Motion Estimation for H.264 (FMDME). IEEE, 2006.

[Qin10] Qin Liu, Takeshi Ikenaga. Rate-Distortion Optimization Based Skip Mode Early Detection in H.264. International Symposium on Intelligent Signal Processing and Communication Systems, December 2010.

[Ref] A Novel PSkip determination Algorithm for WiGig encoder.

[Ric10] Iain E. Richardson. The H.264 Advanced Video Compression Standard, Second Edition. Wiley, 2010.

[Zhe06] Zhenyu Wei, King Ngi Ngan. A Fast Macroblock Mode Decision Algorithm for H.264. IEEE, 2006.

[Zho02] Zhou Wang, Alan C. Bovik. A Universal Image Quality Index. IEEE Signal Processing Letters, Vol. 9, No. 3, March 2002.

[Zho04] Zhou Wang, Alan C. Bovik, Hamid R. Sheikh, Eero P. Simoncelli. Image Quality Assessment: From Error Visibility to Structural Similarity. IEEE Transactions on Image Processing, Vol. 13, No. 4, April 2004.


Figure 2: (a) First original frame, (b) Second original frame, (c) First frame processed, (d) Second frame processed, (e) Second frame where: black MBs are identified as “background” and skipped, white MBs are identified as edges or corners and skipped, grayscale MBs are not skipped (encoded with intra mode), (f) Second frame reconstructed: black and white MBs in (e) (skipped MBs) have been replaced with MBs from the first frame. The parameters used with the algorithm are: N = 15, σ = 1, threshold1 = 6, threshold2 = 6, thresh_bw = 7, thresh_col = 10.


Figure 3: (a) First original frame, (b) Second original frame, (c) First frame processed, (d) Second frame processed, (e) Second frame where: black MBs are identified as “background” and skipped, white MBs are identified as edges or corners and skipped, grayscale MBs are not skipped (encoded with intra mode), (f) Second frame reconstructed: black and white MBs in (e) (skipped MBs) have been replaced with MBs from the first frame. The parameters used with the algorithm are: N = 15, σ = 1, threshold1 = 12, threshold2 = 8, thresh_bw = 50, thresh_col = 10.


Figure 4: Frames (1)-(6) are the originals; frames (a)-(f) are the result of applying the algorithm. The parameters used with the algorithm are: N = 15, σ = 1, threshold1 = 15, threshold2 = 8, thresh_bw = 40, thresh_col = 10.
