IMPLEMENTATION OF BLOCK MATCHING ALGORITHM BASED ...
Transcript of IMPLEMENTATION OF BLOCK MATCHING ALGORITHM BASED ...
IMPLEMENTATION OF BLOCK MATCHING ALGORITHM BASED ON
HEXAGON SHAPE SEARCH PATTERN FOR MOTION ESTIMATION IN
VIDEO CODING APPLICATION
MUHAMMAD GHIFFARI REDHA BIN RAMLI
UNIVERSITI TEKNIKAL MALAYSIA MELAKA
IMPLEMENTATION OF BLOCK MATCHING ALGORITHM BASED ON
HEXAGON SHAPE SEARCH PATTERN FOR MOTION ESTIMATION IN
VIDEO CODING APPLICATION
MUHAMMAD GHIFFARI REDHA BIN RAMLI
This Report Is Submitted In Partial Fulfillment Of Requirements For The
Bachelor Degree of Electronic Engineering (Telecommunication Electronics)
Faculty Of Electronic And Computer Engineering
Universiti Teknikal Malaysia Melaka
June 2012
UNIVERSTI TEKNIKAL MALAYSIA MELAKA FAKULTI KEJURUTERAAN ELEKTRONIK DAN KEJURUTERAAN KOMPUTER
BORANG PENGESAHAN STATUS LAPORAN
PROJEK SARJANA MUDA II
Tajuk Projek :
IMPLEMENTATION OF BLOCK MATCHING ALGORITHM BASED ON HEXAGON SHAPE
SEARCH PATTERN FOR MOTION ESTIMATION IN VIDEO CODING APPLICATION
Sesi Pengajian :
0 8 / 1 2
Saya MUHAMMAD GHIFFARI REDHA BIN RAMLI mengaku membenarkan Laporan Projek Sarjana Muda ini disimpan di Perpustakaan dengan syarat-syarat kegunaan seperti berikut:
1. Laporan adalah hakmilik Universiti Teknikal Malaysia Melaka.
2. Perpustakaan dibenarkan membuat salinan untuk tujuan pengajian sahaja.
3. Perpustakaan dibenarkan membuat salinan laporan ini sebagai bahan pertukaran antara institusi
pengajian tinggi.
4. Sila tandakan ( √ ) :
SULIT* *(Mengandungi maklumat yang berdarjah keselamatan atau kepentingan Malaysia seperti yang termaktub di dalam AKTA RAHSIA RASMI 1972)
TERHAD** **(Mengandungi maklumat terhad yang telah ditentukan oleh organisasi/badan di mana penyelidikan dijalankan)
TIDAK TERHAD
Disahkan oleh:
__________________________ ___________________________________ (TANDATANGAN PENULIS) (COP DAN TANDATANGAN PENYELIA)
Tarikh :………………………………
iii
“I hereby declare that this report is the result of my own work except for quotes as
cited in the reference”
Signature : ………………………………….
Author : MUHAMMAD GHIFFARI REDHA BIN RAMLI
Date : 15 JUNE 2012
iv
“I hereby declare that I have read this report and in my opinion this report is
sufficient in terms of the scope and quality for the award of Bachelor of Electronic
Engineering (Telecommunication Electronics) With Honours”
Signature : ……………………………….
Supervisor Name : MR REDZUAN BIN ABDUL MANAP
Date : 15 JUNE 2012
v
“For my lovely mom and dad”
vi
ACKNOWLEDGEMENT
Alhamdulillah, In the name of Allah, the Most Gracious and the Most Merciful,
All praises to Allah for the strengths and His blessing in completing this project. This
project would not have been possible without the guidance and the help of several
individuals who in one way or another way that contributed and extended their valuable
assistance for preparation and completion of this project.
First and foremost, Special appreciation goes to my supervisor, Mr. Redzuan Bin
Abdul Manap, for his supervision, encouragement, advice and ideas that have been
poured out to me throughout this project.
I also wish to express my appreciation to all my lecturers and my friends,
especially Mohd Helmi and Mohd Shahlan for sharing the knowledge and similar
research interest and also for their help and encouragement.
Last but not least, my deepest gratitude goes to my beloved parents Mr. Ramli
Bin Hamdan and Mrs. Siti Amnah Binti Hamzah for their unfailing encouragement and
financial support that they have given to me for over the years and also to all my sisters
and brothers for their endless love, prayers and encouragement. To those who indirectly
contributed in this project, your kindness means a lot to me. Thank you very much.
vii
ABSTRACT
Video commonly refers to transmission and storage formats for moving pictures.
Different types of video produced different image quality and video size. Those different
characteristic can be determined based on the compression technique. In compression
process, there is one important module called block motion estimation. Block matching
algorithm (BMA) is one way in motion estimation (ME) for finding minimum motion
vector (MV). BMA consists of Full Search (FS) algorithm and Fast search algorithm.
Conventionally FS algorithm has been used for BMA where it exhaustively evaluates all
possible blocks over the determined search window to find the best match. However,
this method has high computational cost which makes encoding process slower and it is
not practical for ‘power-limited’ device. In order to increase speed during encoding,
many fast search algorithm has been implemented. In this project hexagon based
algorithm which is one of the fast search algorithms is chosen to be implemented using
MATLAB software.
viii
ABSTRAK
Video lazimnya merujuk kepada format penghantaran dan penyimpanan untuk
imej bergerak. Jenis video yang berlainan akan menghasilkan kualiti imej yang berbeza
serta saiz video yang berbeza. Perbezaan ini dapat dikenal pasti melalui jenis
pemampatan yang digunakan. Dalam proses pemampatan, terdapat satu modul penting
yang dipanggil anggaran gerakan blok. Algoritma penyamaan blok (BMA) adalah satu
kaedah digunakan dalam anggaran gerakan (ME) bagi mencari vector gerakan (MV)
yang paling rendah. BMA terbahagi kepada algoritma pencarian penuh (FS) dan
algoritma pencarian pantas. Secara konvensionalnya algoritma FS digunakan dalam
BMA dimana ianya menilai secara menyeluruh semua blok yang berada didalam
tetingkap carian yang ditentukan bagi mencari padanan yang terbaik. Namun begitu
kaedah ini mempunyai kos pengiraan yang tinggi seterusnya menjadikan proses
pengekodan lebih perlahan dan ia tidak praktikal untuk digunakan dengan peranti 'kuasa-
terhad'. Dalam usaha untuk meningkatkan kelajuan semasa pengekodan, banyak
algoritma carian pantas telah dibangunkan. Dalam projek ini algoritma heksagon yang
merupakan salah satu algoritma carian pantas telah dipilih untuk dilaksanakan dengan
menggunakan perisian MATLAB.
ix
CONTENTS
CHAPTER TITLE PAGE
PROJECT TITLE i
REPORT STATUS VERIFICATION FORM ii
STUDENT’S DECLARATION Iii
SUPERVISOR’S DECLARATION iv
DEDICATION v
ACKNOWLEDGEMENT vi
ABSTRACT………………………………………………… vii
ABSTRAK…………………………………………………... viii
CONTENT………………………………………………….. ix
LIST OF TABLES…………………………………………. xii
LIST OF FIGURES .……………………………………….. xiii
LIST OF ABBREVIATION……………………………….. xv
1.0 INTRODUCTION
1.1 Project backgrounds..................................................... 1
1.2 Problem statement…………………………………… 2
1.3 Objective…………………………………………….. 2
1.4 Scope of work……………………………………….. 3
1.5 Report structure……………………………………… 3
x
2.0 LITERATURE REVIEW
2.1 Video Compression………………………………….. 5
2.1.1 The Need of Compression………………... 6
2.1.2 Compression Standard…………………… 7
2.1.3 Video Compression Technique…………... 10
2.2 Motion Estimation…………………………………… 12
2.3 Block Matching Algorithm………………………….. 13
2.4 Full search BMA……………………………………. 15
2.5 Diamond search algorithm…………………………... 16
2.6 Cross Diamond Search Algorithm…………………… 18
2.7 Audio Video Interleave……………………………… 19
2.8 MATLAB Software…………………………………. 20
3.0 PROJECT METHODOLOGY
3.1 Literature review…………………………………….. 22
3.2 Uploading video sequence…………………………… 23
3.3 Extraction of frame………………………………….. 23
3.4 Block construction…………………………………… 23
3.5 Implementation of HEXBS Algorithm into the block.. 24
3.6 Construction of predicted frame…………………….. 24
3.7 Performance analysis………………………………… 24
4.0 HEXAGON BASED SEARCH ALGORITHM
4.1 Introduction…………………………………………. 25
4.2 HEXBS pattern………………………………………. 26
4.3 HEXBS algorithm development……………………... 26
5.0 RESULTS AND DISCUSSIONS
5.1 Video sequences.…………………………………….. 29
xi
5.2 PSNR comparison …………..………………………. 29
5.3 Search point and processing time comparison.……… 36
5.2 Visual comparison…………………………...………. 42
6.0 CONCLUSION…………………………………………….. 46
REFERENCES…………………………………………….. 48
APPENDIX…………………………………………………. 50
xii
LIST OF TABLE
TABLE TITLE PAGE
5.1 PSNR comparison (dB) 31
5.2 Average number search point comparison. 36
5.3 Processing time (s) comparison. 37
xiii
LIST OF FIGURE
FIGURE TITLE PAGE
2.1 Video sequence 10
2.2 Comparison between current and reference frame 11
2.3 Prediction error 11
2.4 Frame in search window 14
2.5 Diamond search pattern 17
2.6 LDSP and SDSP pattern 17
2.7 Cross and diamond search pattern 18
4.1 HEXBS pattern 26
4.2 HEXBS development 27
5.1 PSNR (dB) vs frame number for Akiyo. 32
5.2 PSNR (dB) vs frame number for News. 32
5.3 PSNR (dB) vs frame number for Salesman. 33
5.4 PSNR (dB) vs frame number for Coastguard. 34
5.5 PSNR (dB) vs frame number for Tennis. 35
5.6 PSNR (dB) vs frame number for Foreman. 35
5.7 Frame number vs Search point for Akiyo. 38
5.8 Frame number vs Search point for News. 38
5.9 Frame number vs Search point for Salesman. 39
5.10 Frame number vs Search point for Coastguard. 40
5.11 Frame number vs Search point for Tennis. 41
xiv
5.12 Frame number vs Search point for Foreman. 41
5.13 Original frame for “Akiyo” 42
5.14 Predicted frame for “Akiyo” 42
5.15 Original frame for “News” 43
5.16 Predicted frame for “News” 43
5.17 Original frame for “Salesman” 44
5.18 Predicted frame for “Salesman” 44
5.19 Original frame for “Coastguard” 44
5.20 Predicted frame for “Coastguard” 44
5.21 Original frame for “Tennis” 45
5.22 Predicted frame for “Tennis” 45
5.23 Original frame for “ Foreman ” 45
5.24 Predicted frame for “ Foreman ” 45
xv
LIST OF ABBREVIATIONS
ATM - Asynchronous Time Multiplexing
AVI - Audio Video Interleave
BMA - Block Matching Algorithm
CCB Cross-Center Biased
CDS - Cross-Diamond Search
CIF - Common Intermediate Format
CSP - Cross-shaped Pattern
DCT - Discrete Cosine Transform
DS - Diamond Search
DVD - Digital Versatile Disk
FS - Full Search
HDTV - High-definition Television
HEXBS - Hexagon based search algorithm
ISDN - Integrated Services Digital Network
ISO International Organization Standardization
ITU - International Telecommunication Union
JPEG - Joint Photographic Experts Group
LAN - Local Area Network
LDSP - Large Diamond Search Pattern
MAD - Mean Absolute Difference
MBD - Minimum Block Distortion
xvi
ME - Motion Estimation
MPEG - Moving Picture Experts Group
MSE - Mean Squared Error
MV - Motion Vector
NCCF - Normalized Cross-Correlation Function
PSNR - Peak-to-Noise-Ratio
QCIF - Quarter Common Intermediate Format
SAD - Sum of Absolute Difference
SDSP - Small Diamond Search Pattern
WAN - Wide Area Network
1
CHAPTER 1
INTRODUCTION
1.1 Project background
To achieve high compression ratio in video coding, a technique known as Block
Matching Motion Estimation has been widely adopted in various coding standards. This
technique is implemented conventionally by exhaustively testing all the candidate blocks
within the search window. This type of implementation, called Full Search (FS)
Algorithm, gives the optimum solution. However, substantial amount of computational
workload is required in this algorithm. To overcome this drawback, many fast Block
Matching Algorithms (BMA‘s) have been proposed and developed. Different search
patterns and strategies are exploited in these algorithms in order to find the optimum
Motion Vector (MV) with minimal number of required search points.
One of these fast BMA‘s, which is proposed to be implemented in this project, is
called Hexagon Based Search (HEXBS) Algorithm. The algorithm is to be stimulated in
MATLAB and its corresponding performance results to be compared to FS algorithm as
2
well as to other fast BMA‘s in terms of the peak signal-to-noise ratio (PSNR) produced,
number of search points required and computational complexity.
1.2 Problem statement
Implementation of FS algorithm in motion estimation (ME) process required
substantial amount of computational workload. However this drawback can be overcome
by many types of fast BMA‘s which have been proposed and developed recently.
Different search patterns and strategies are exploited in these fast BMA algorithms in
order to find the optimum MV with minimal number of required search point.
1.3 Objective
Apart from solving FS algorithm weaknesses in ME process, the other objectives of this
project are:
i) To develop, implement and analyze HEXBS algorithm by using MATLAB.
ii) To compare the performance of HEXBS algorithm to FS algorithm as well as
other common fast BMA’s.
iii) To produce a working and functional MATLAB coding.
3
1.4 Scope of work
The project has 3 distinct scopes which are:
I. Background Study
Further reading on video or image compression, ME, BMA and HEXBS
Algorithm to gain understanding and sound knowledge on field interest.
II. Algorithm Development
The HEXBS algorithm is implemented and simulated using MATLAB.
III. Perfomance Analysis
The perfomance of HEXBS algorithm is compared with existing fast block
matching motion estimation algorithms to see the suitability of the alforithm for
ME process in video coding techniques.
1.5 Report structure
The first chapter in this report is introduction. I will have a few sub-topics, which
are project background, problem statement, objectives and scope of work. In
project background brief discussion on video compression, ME, BMA and HEXBS
algorithm is presented. Next, problems of FS algorithm which give the reason for work
in this project are explained. Consequently, it leads to the purpose of this. In order to
achieve the objectives of this project, scope of work has been determined and
described too.
4
Second chapter in this report is literature review. This chapter is divided into a
few of sub chapter that contain fundamental information related to the video
compression, ME, BMA and several type of fast BMA available as well as video format
and software tool used in this project.
Next chapter is project methodology. This part explains about guidelines and
how for this project where it mentioned in detail steps that should be taken in order to
complete the project. Project starts with literature review before implementation of
HEXBS algorithm in MATLAB and conclude with performance analysis.
The fourth chapter focusing on HEXBS algorithm. This chapter will explain
introduction of HEXBS, search pattern and development steps of HEXBS algorithm in
much more details.
The fifth chapter is the result and discussion. The simulation results of HEXBS
algorithm obtained in this project is presented and analyzed in comparison to FS
algorithm and other fast BMA’s.
The final chapter in this report is conclusion. The author concludes the overall
project work and put recommendations for future work.
5
CHAPTER 2
LITERATURE REVIEW
2.1 Video Compression
The volume of information required to represent a high quality Digital Video is
usually very large, which becomes a challenge for Internet distribution of high volumes
of content. This is basically why the data must be compressed. Uncompressed footage
from a camcorder takes about 17MB per second of video. Because it takes so much
space, video must be compressed before it is put on the web [1]. Compressions mean
that the size of data has been reducing in order to save space or transmission time.
Compression can be either lossy compression or lossless compression.
6
Lossy compression means that the compressed file has less data in it than the
original file. In some cases this translates to lower quality files, because information has
been lost. However, at the time the difference started to be noticed a relatively large
amount of data have been lost. Lossy compression makes up for the loss in quality by
producing comparatively small files. For example, digital versatile disk (DVDs) are
compressed using the MPEG-2 format, which can make files 15 to 30 times smaller, but
people still tend to perceive DVDs as having high-quality picture.
Lossless compression is exactly what it sounds like, compression where none of
the information is lost. This is not nearly as useful because files often end up being the
same size as they were before compression. This may seem pointless, as reducing the
file size is the primary goal of compression. However, if file size is not an issue, using
lossless compression will result in a perfect-quality picture. For example, a video editor
transferring files from one computer to another using a hard drive might choose to use
lossless compression to preserve quality while he or she is working.
2.1.1 The Need of Compression
A single digital television signal in CCIR 601 format requires a transmission rate
of 216 Mbps. This bit rate is too high for most existing practical communication
networks. For example, most local area networks (LANs) offer data transmission at rates
on the order of 10 Mbps, and most wide area networks (WANs) support much lower
data rates than this. The emerging Asynchronous Time Multiplexing (ATM) networks
are capable of transmitting higher bit rates. However, distributing an uncompressed
CCIR 601 bit stream over these networks is still prohibitively expensive [2].
7
This shows that the digital video information should be compressed or encoded
prior to transmission to accommodate for different transmission media‘s capabilities. At
the receiver‘s end, the compressed bit stream received is first decompressed or decoded
and then displayed. A number of video coding techniques and standards have been
developed within the last few years that exploit the inherent redundancy in still images
and moving video sequences to provide significant data compression.
2.1.2 Compression Standard
The Moving Picture Experts Group (MPEG) is a working group of experts that
was formed by ISO and IEC to set standards for audio and video compression and
transmission. It was established in 1988 by the initiative of Hiroshi Yasuda (Nippon
Telegraph and Telephone) and Leonardo Chiariglione. The MPEG standards consist of
different Parts. They include [3]:
i. MPEG-1
Designed for up to 1.5 Mbit/sec Standard for the compression of moving pictures
and audio. This was based on CD-ROM video applications, and is a popular
standard for video on the Internet, transmitted as .mpg files. In addition, level 3
of MPEG-1 is the most popular standard for digital compression of audio known
as MP3. MPEG-1 is the standard of compression for VideoCD, the most popular
video distribution format thoughout much of Asia.
ii. MPEG-2
Designed for between 1.5 and 15 Mbit/sec Standard on which Digital Television
set top boxes and DVD compression is based. It is based on MPEG-1, but
designed for the compression and transmission of digital broadcast television.