8/8/2019 Final Project 112
1/32
Project Report
ON
VIDEO COMPRESSION
Submitted in partial fulfillment of the requirements for the degree of
B.Tech. in Computer Science & Engineering
Under the Guidance of: Mr. Roshan Singh (Assistant Professor)
MANAV RACHNA COLLEGE OF ENGINEERING
FARIDABAD
BATCH (2007-2011)
Submitted By: Amit Saini (Roll No.)
Raj Kamal Sharma (Roll No.)
Sandeep Yadav (Roll No.)
Table of Contents

Chapter 1. Introduction
  1.1 Objective(s) of the System/Tool
  1.2 Scope of the System/Tool
  1.3 Problem Definition of the System/Tool
  1.4 Hardware and Software Requirements

Chapter 2. Problem Analysis
  2.1 Literature Survey
    2.1.1 Introduction
    2.1.2 Video Compression Technology
    2.1.3 Compression Standards
    2.1.4 MPEG-1
    2.1.5 MPEG-2
    2.1.6 MPEG-4
    2.1.7 MPEG-7
    2.1.8 H.261
  2.2 Methodology Adopted

Chapter 3. Project Estimation and Implementation Plan
  3.1 Cost and Benefit Analysis
  3.2 Schedule Estimate
  3.3 PERT Chart / Gantt Chart

References
Chapter 1. Introduction
1.1. OBJECTIVE
1.1.1. NEED OF THE SYSTEM: Uncompressed video (and audio) data are huge. In
HDTV, the bit rate easily exceeds 1 Gbps, which poses big problems for storage and network
communications. For example: One of the formats defined for HDTV broadcasting
within the United States is 1920 pixels horizontally by 1080 lines vertically, at 30
frames per second. If these numbers are all multiplied together, along with 8 bits for
each of the three primary colors, the total data rate required would be approximately 1.5
Gb/sec. Because of the 6 MHz channel bandwidth allocated, each channel will only
support a data rate of 19.2 Mb/sec, which is further reduced to 18 Mb/sec by the fact
that the channel must also support audio, transport, and ancillary data information. As
can be seen, this restriction in data rate means that the original signal must be
compressed by a figure of approximately 83:1. This number seems all the more
impressive when it is realized that the intent is to deliver very high quality video to the
end user, with as few visible artifacts as possible.
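The 83:1 figure can be checked with a few lines of arithmetic (a sketch using only the numbers quoted above; 18 Mb/s is the payload left after audio, transport, and ancillary data):

```python
# Back-of-envelope check of the HDTV numbers quoted above.
width, height, fps, bits_per_pixel = 1920, 1080, 30, 24  # 8 bits x 3 primary colors

raw_bps = width * height * fps * bits_per_pixel  # uncompressed bit rate
channel_bps = 18e6                               # usable payload of the 6 MHz channel

print(f"raw rate: {raw_bps / 1e9:.2f} Gb/s")             # ~1.49 Gb/s
print(f"required ratio: {raw_bps / channel_bps:.0f}:1")  # ~83:1
```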
1.1.2. OUTCOME OF THE SYSTEM:
Video Compressor is multifunctional video compression software that helps you compress
video files to a smaller size. With comprehensive video format support and plentiful profiles
and handy tools provided, this Video Compressor is an ideal video file compressor and video
size compressor.
A digital video compression system, and its methods, compress digitized video signals in real
time. The system compressor receives digitized video frames divided into subframes, performs
in a single pass a two-dimensional spatial-domain-to-transform-domain transformation of the
picture elements of each subframe, and normalizes the resultant coefficients by a
normalization factor having a predetermined compression-ratio component and an adaptive
rate-buffer-capacity control feedback component, to provide compression. It then encodes the
coefficients and stores them in a first rate buffer memory asynchronously at a high data transfer
rate, from which they are put out at a slower, synchronous rate. The compressor adaptively
determines the rate-buffer-capacity control feedback component in relation to the instantaneous
data content of the rate buffer memory relative to its capacity, and it controls the absolute
quantity of data resulting from the normalization step so that the buffer memory is never
completely emptied and never completely filled. In expansion (decompression), the system
essentially mirrors the steps performed during compression. An efficient, high-speed decoder
forms an important aspect of the invention, and the compression system forms an important
element of a disclosed color broadcast compression system.
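The buffer-feedback idea described above can be illustrated with a toy simulation (every constant here is invented for illustration): a fuller buffer forces a coarser quantization step so fewer bits are produced, and the occupancy settles away from both empty and full.

```python
def feedback_scale(fullness, capacity, base=16):
    """Coarsen the quantisation step as the buffer fills: a fuller buffer
    forces fewer bits to be produced, an emptier one allows more."""
    return base * (0.5 + fullness / capacity)

capacity, fullness, out_rate = 10_000, 5_000, 400
for _ in range(200):
    q = feedback_scale(fullness, capacity)
    produced = int(8_000 / q)                    # coarser q -> fewer bits produced
    fullness += produced - out_rate              # buffer drains at a constant rate
    fullness = max(0, min(capacity, fullness))   # clamp, for the toy model

print(0 < fullness < capacity)  # True: never empty, never full
```

The negative feedback drives the occupancy toward an equilibrium where production matches the constant output rate, which is exactly the "never completely emptied and never completely filled" behaviour the text describes.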
1.2 Scope of the System/Tool
What will the tool be able to do? The system tool (MATLAB) covers the following areas:
Desktop Tools and Development Environment: startup and shutdown, arranging the desktop, and using tools to become more productive with MATLAB
Data Import and Export: retrieving and storing data, memory-mapping, and accessing Internet files
Mathematics: mathematical operations
Data Analysis: data analysis, including data fitting, Fourier analysis, and time-series tools
Programming Fundamentals: the MATLAB language and how to develop MATLAB applications
Object-Oriented Programming: designing and implementing MATLAB classes
Graphics: tools and techniques for plotting, graph annotation, printing, and programming with Handle Graphics objects
3-D Visualization: visualizing surface and volume data, transparency, and viewing and lighting techniques
Creating Graphical User Interfaces: GUI-building tools and how to write callback functions
External Interfaces: MEX-files, the MATLAB engine, and interfacing to Sun Microsystems Java software, Microsoft .NET Framework, COM, Web services, and the serial port

There is reference documentation for all MATLAB functions:

Function Reference: lists all MATLAB functions, by category or alphabetically
Handle Graphics Property Browser: provides easy access to descriptions of graphics object properties
C/C++ and Fortran API Reference: covers functions used by the MATLAB external interfaces, providing information on syntax in the calling language, description, arguments, return values, and examples.

The MATLAB application can read data in various file formats, discussed in the following sections: recommended methods for importing data, importing MAT-files, importing text data files, importing XML documents, importing Excel spreadsheets, importing scientific data files, importing images, importing audio and video, and importing binary data with low-level I/O.
1.4 Hardware and Software Requirements
1.4.1 HARDWARE REQUIREMENTS:
512 MB RAM
10 GB HARD DISK
1.4.2 SOFTWARE REQUIREMENTS:
1. OPERATING SYSTEM: WINDOWS or LINUX
2. MATLAB
Chapter 2. Problem Analysis
2.1 Literature Survey
2.1.1 Introduction (Background Work):
Video compression typically operates on square-shaped groups of neighboring pixels, often
called macroblocks. These blocks of pixels are compared from one frame to
the next and the video compression codec (encode/decode scheme) sends only the difference
within those blocks. This works extremely well if the video has no motion. A still frame of
text, for example, can be repeated with very little transmitted data. In areas of video with more
motion, more pixels change from one frame to the next. When more pixels change, the video
compression scheme must send more data to keep up with the larger number of pixels that are
changing. If the video content includes an explosion, flames, a flock of thousands of birds, or
any other image with a great deal of high-frequency detail, the quality will decrease, or the
variable bit rate must be increased to render this added information with the same level of
detail.
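The block-by-block differencing described above can be sketched as follows (a simplified illustration, not any particular codec's algorithm):

```python
import numpy as np

def block_diff(prev, cur, block=8, thresh=0):
    """Return coordinates of the blocks that changed between two frames.
    Only these blocks would need to be (re)transmitted."""
    changed = []
    h, w = cur.shape
    for y in range(0, h, block):
        for x in range(0, w, block):
            a = prev[y:y+block, x:x+block]
            b = cur[y:y+block, x:x+block]
            if np.abs(b.astype(int) - a.astype(int)).max() > thresh:
                changed.append((y, x))
    return changed

prev = np.zeros((16, 16), dtype=np.uint8)
cur = prev.copy()
cur[0:8, 8:16] = 200          # motion confined to a single block
print(block_diff(prev, cur))  # [(0, 8)]: 1 of 4 blocks changed
```

A still frame (no changed blocks) costs almost nothing to send, while fast-moving scenes flag many blocks, which is why the bit rate rises with motion.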
Video is basically a three-dimensional array of color pixels. Two dimensions serve as spatial
(horizontal and vertical) directions of the moving pictures, and one dimension represents the
time domain. A data frame is a set of all pixels that correspond to a single moment in time. Basically, a frame is the same as a still picture.
Some forms of data compression are lossless. This means that when the data is decompressed,
the result is a bit-for-bit perfect match with the original. While lossless compression of video is
possible, it is rarely used, as lossy compression results in far higher compression ratios at an
acceptable level of quality.
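The distinction can be demonstrated in a few lines (using zlib as a stand-in lossless coder and a crude quantizer as a stand-in lossy one):

```python
import zlib
import numpy as np

data = np.arange(256, dtype=np.uint8).tobytes()

# Lossless: decompression recovers a bit-for-bit copy of the input.
print(zlib.decompress(zlib.compress(data)) == data)   # True

# Lossy (toy): quantising to 16 levels discards the low bits permanently.
samples = np.frombuffer(data, dtype=np.uint8)
restored = (samples // 16) * 16
print(bool(np.array_equal(samples, restored)))        # False: information was lost
```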
One of the most powerful techniques for compressing video is interframe compression.
Interframe compression uses one or more earlier or later frames in a sequence to compress the
current frame, while intraframe compression uses only the current frame, which is effectively
image compression.
The most commonly used method works by comparing each frame in the video with the
previous one. If the frame contains areas where nothing has moved, the system simply issues a
other methods (DCT) that work on smaller pieces of the desired data. The result is a
hierarchical representation of an image, where each layer represents a frequency band.
2.1.3 Compression Standards (Techniques for solving the problem):
MPEG stands for the Moving Picture Experts Group. MPEG is an ISO/IEC working group,
established in 1988 to develop standards for digital audio and video formats. There are four
MPEG standards being used or in development. Each compression standard was designed with
a specific application and bit rate in mind, although MPEG compression scales well with
increased bit rates. They include:
2.1.3.1 MPEG-1
Designed for bit rates up to 1.5 Mbit/s, this is the standard for the compression of moving
pictures and audio. It was based on CD-ROM video applications, and is a popular standard for
video on the Internet, transmitted as .mpg files. In addition, layer 3 of MPEG-1 is the most
popular standard for digital compression of audio, known as MP3. MPEG-1 is the compression
standard for VideoCD, the most popular video distribution format throughout much of Asia.
2.1.3.2 MPEG-2
Designed for between 1.5 and 15 Mbit/s, this is the standard on which digital television set-top
boxes and DVD compression are based. It builds on MPEG-1, but is designed for the compression and
transmission of digital broadcast television. The most significant enhancement from MPEG-1
is its ability to efficiently compress interlaced video. MPEG-2 scales well to HDTV resolution
and bit rates, obviating the need for an MPEG-3.
2.1.3.4 MPEG-4
Standard for multimedia and Web compression. MPEG-4 is based on object-based
compression, similar in nature to the Virtual Reality Modeling Language. Individual objects
within a scene are tracked separately and compressed together to create an MPEG-4 file. This
results in very efficient compression that is very scalable, from low bit rates to very high. It
also allows developers to control objects independently in a scene, and therefore introduce
interactivity.
2.1.3.5 MPEG-7: This standard, currently under development, is also called the Multimedia
Content Description Interface. When released, the group hopes the standard will provide a
framework for multimedia content that will include information on content manipulation,
filtering and personalization, as well as the integrity and security of the content. Contrary to the
previous MPEG standards, which described actual content, MPEG-7 will represent information
about the content.
2.1.3.6 H.261: H.261 is an ITU standard designed for two-way communication over ISDN
lines (video conferencing) and supports data rates which are multiples of 64Kbit/s. The
algorithm is based on DCT and can be implemented in hardware or software and uses
intraframe and interframe compression. H.261 supports CIF and QCIF resolutions.
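As a quick illustration of the H.261 operating points (the standard's rates are p x 64 kbit/s with p from 1 to 30; CIF is 352x288 and QCIF is 176x144):

```python
# H.261 operates at p x 64 kbit/s (p = 1..30) on CIF or QCIF frames.
CIF, QCIF = (352, 288), (176, 144)

rates_kbps = [p * 64 for p in range(1, 31)]
print(rates_kbps[0], rates_kbps[-1])  # 64 1920: from 64 kbit/s up to ~2 Mbit/s

# A QCIF frame carries a quarter of the pixels of a CIF frame.
print((CIF[0] * CIF[1]) // (QCIF[0] * QCIF[1]))  # 4
```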
MPEG-4 advantages include high compression, low bit rate and motion compensation support.
Disadvantages are latency and blocking artifacts. JPEG, JPEG2000, and MPEG-4 have all been
used in video surveillance systems, with the choice depending on what is most important in
that particular application. H.264 is an advanced compression scheme which is also starting to
find its way into video surveillance systems. H.264 offers higher compression at the expense of
additional hardware complexity; it is not examined in this paper, although FPGA-based
solutions for H.264 exist.
2.1.3.7 MPEG-21:
The MPEG-21 standard, from the Moving Picture Experts Group, aims at defining an open
framework for multimedia applications. MPEG-21 is ratified in the standard ISO/IEC 21000,
Multimedia Framework (MPEG-21).
MPEG-21 is based on two essential concepts:
definition of a Digital Item (a fundamental unit of distribution and transaction)
users interacting with Digital Items
Digital Items can be considered the kernel of the Multimedia Framework, and users can be
considered as those who interact with them inside the Multimedia Framework. At its most basic
level, MPEG-21 provides a framework in which one user interacts with another, and the
object of that interaction is a Digital Item. We could therefore say that the main objective of
MPEG-21 is to define the technology needed to support users in exchanging, accessing,
consuming, trading or otherwise manipulating Digital Items in an efficient and transparent way.
MPEG-21 also specifies the storage of a Digital Item in a file format based on the ISO base
media file format, with some or all of the Digital Item's ancillary data (such as movies, images
or other non-XML data) within the same file.
2.1.3.8 H.263:
H.263 is a video compression standard originally designed as a low-bit-rate compressed format
for video conferencing. It was developed by the ITU-T Video Coding Experts Group (VCEG).
H.263 has since found many applications on the Internet: much Flash video content (as used
on sites such as YouTube, Google Video, MySpace, etc.) used to be encoded in Sorenson Spark
format, though many sites now use VP6 or H.264 encoding.
H.263 was developed as an evolutionary improvement based on experience from H.261, the
previous ITU-T standard for video compression, and the MPEG-1 and MPEG-2 standards. Its
first version was completed in 1995 and provided a suitable replacement for H.261 at all bit
rates. It was further enhanced in projects known as H.263v2; MPEG-4 Part 2 is H.263
compatible in the sense that a basic H.263 bit stream is correctly decoded by an MPEG-4
Video decoder.
2.1.3.9 H.264:
The next enhanced codec developed by ITU-T VCEG (in partnership with MPEG) after H.263
is the H.264 standard, also known as AVC and MPEG-4 part 10. As H.264 provides a
significant improvement in capability beyond H.263, the H.263 standard is now considered a
legacy design. Most new videoconferencing products now include H.264 as well as H.263 and
H.261 capabilities. H.264 is used in such applications as players for Blu-ray Discs, videos from
YouTube and the iTunes Store, web software such as the Adobe Flash Player and Microsoft
Silverlight, broadcast services for DVB and SBTVD, direct-broadcast satellite television
services, cable television services, and real-time videoconferencing.
2.1.4 MPEG-1
The Coding of Moving Pictures and Associated Audio for Digital Storage Media at up to
about 1.5 Mbit/s (ISO/IEC 11172), or MPEG-1 as it is more commonly known, standardizes
the storage and retrieval of moving pictures and audio on storage media, and forms the basis
for the Video CD and MP3 formats.
This part of the specification describes the coded representation for the compression of video
sequences.
The basic idea of MPEG video compression is to discard any unnecessary information. An
MPEG-1 encoder analyses:
how much movement there is in the current frame compared to the previous frame;
what changes of color have taken place since the last frame;
what changes in light or contrast have taken place since the last frame;
what elements of the picture have remained static since the last frame.
The encoder then looks at each individual pixel to see if movement has taken place. If there has
been no movement, the encoder stores an instruction to repeat the same frame, or to repeat the
same frame but move it to a different position.
1. I: intra frames
2. B: bidirectional frames
3. P: predicted frames
Audio, video and time codes are converted into one single stream.
625- and 525-line video
bit rates from 1 to 1.5 Mbit/s
24-30 frames per second
MPEG-1 compression treats video as a sequence of separate images. Picture elements, often
referred to as pixels, are the elements in the image. Each pixel consists of three components:
one for luminance (Y) and two for chrominance (Cb and Cr). MPEG-1 encodes Y pixels at
full resolution, as the Human Visual System (HVS) is most sensitive to luminance.
Quantization.
Predictive coding: the difference between the predicted pixel value and the real value is coded.
Motion compensation (MC): predicts the value of a neighboring block of pixels (1 block =
8x8 pixels) in an image from those of a known block of pixels. A vector describes the two-
dimensional movement; if no movement takes place, the vector is 0.
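A minimal sketch of the block matching behind motion compensation (an exhaustive search by sum of absolute differences; the search range and test data here are invented for illustration):

```python
import numpy as np

def best_motion_vector(ref, block, y, x, search=4):
    """Exhaustive block matching: find the displacement in `ref` that best
    matches an 8x8 `block` anchored at (y, x), by sum of absolute differences."""
    best, best_mv = None, (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            yy, xx = y + dy, x + dx
            if yy < 0 or xx < 0 or yy + 8 > ref.shape[0] or xx + 8 > ref.shape[1]:
                continue
            sad = np.abs(ref[yy:yy+8, xx:xx+8].astype(int) - block.astype(int)).sum()
            if best is None or sad < best:
                best, best_mv = sad, (dy, dx)
    return best_mv

ref = np.zeros((24, 24), dtype=np.uint8)
ref[10:18, 12:20] = 255                      # a bright patch in the reference frame
cur_block = np.full((8, 8), 255, np.uint8)   # the same patch in the current frame
print(best_motion_vector(ref, cur_block, 8, 10))  # (2, 2): the patch moved by (2, 2)
```

Only the winning vector (and the usually small prediction error) needs to be transmitted, rather than the block's pixels.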
Other techniques used include interframe coding, sequential coding, variable length coding
(VLC), and image interpolation.
Intra (I frames) are coded independently of other images.
MPEG codes images progressively: interlaced images need to be converted into a de-interlaced
format before encoding; the video is then encoded, and the encoded video is converted back
into an interlaced form.
To achieve a high compression ratio, an appropriate spatial resolution for the signal is
chosen and the image is broken down into blocks of pixels; block-based motion compensation
is then used to reduce the temporal redundancy.
Motion compensation is used for causal prediction of the current picture from a
previous picture, for non-causal prediction of the current picture from a future picture, or for
interpolative prediction from past and future pictures.
The difference signal, the prediction error, is further compressed using the discrete cosine
transform (DCT) to remove spatial correlation and is then quantized. Finally, the motion
vectors are combined with the DCT information, and coded using variable length codes.
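The transform-quantize-reconstruct chain can be sketched for a single 8x8 block (a hand-rolled orthonormal DCT-II for illustration; a real encoder adds zig-zag scanning and the variable length coding on top of this):

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II basis as an n x n matrix (rows = frequencies)."""
    k, i = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
    m = np.cos(np.pi * (2 * i + 1) * k / (2 * n)) * np.sqrt(2 / n)
    m[0, :] /= np.sqrt(2)
    return m

D = dct_matrix()
block = np.outer(np.arange(8.0), np.ones(8))  # a smooth vertical-ramp "prediction error"

coeffs = D @ block @ D.T             # 2-D DCT: energy compacts into few coefficients
q = np.round(coeffs / 4) * 4         # uniform quantisation with step 4 (the lossy stage)
recon = D.T @ q @ D                  # inverse transform, as the decoder would do

print(int(np.count_nonzero(q)))                # only 2 of 64 coefficients survive
print(float(np.abs(recon - block).max()) < 4)  # True: error bounded by the step size
```

The DCT does not itself discard anything; the savings come from quantization zeroing the small high-frequency coefficients, which the VLC then codes very cheaply.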
MPEG-1 is a standard in 4 parts:
Part 1 addresses the problem of combining one or more data streams from the video and audio
parts of the MPEG-1 standard with timing information to form a single stream. This is an
important function because, once combined into a single stream, the data are in a form well
suited to digital storage or transmission.
Part 2 specifies a coded representation that can be used for compressing video sequences -
both 625-line and 525-line - to bit rates around 1.5 Mbit/s. Part 2 was developed to operate
principally from storage media offering a continuous transfer rate of about 1.5 Mbit/s.
Nevertheless it can be used more widely than this because the approach taken is generic. A
number of techniques are used to achieve a high compression ratio. The first is to select an
appropriate spatial resolution for the signal. The algorithm then uses block-based motion
compensation to reduce the temporal redundancy. Motion compensation is used for causal
prediction of the current picture from a previous picture, for non-causal prediction of the
current picture from a future picture, or for interpolative prediction from past and future
pictures. The difference signal, the prediction error, is further compressed using the discrete
cosine transform (DCT) to remove spatial correlation and is then quantized. Finally, the motion
vectors are combined with the DCT information, and coded using variable length codes.
Part 3 specifies a coded representation that can be used for compressing audio sequences -
both mono and stereo. Input audio samples are fed into the encoder. The mapping creates a
filtered and subsampled representation of the input audio stream. A psychoacoustic model
creates a set of data to control the quantiser and coding. The quantiser and coding block creates
a set of coding symbols from the mapped input samples. The block 'frame packing' assembles
the actual bit stream from the output data of the other blocks, and adds other information (e.g.
error correction) if necessary.
Part 4 specifies how tests can be designed to verify whether bitstreams and decoders meet the
requirements as specified in parts 1, 2 and 3 of the MPEG-1 standard. These tests can be used
by:
manufacturers of encoders, and their customers, to verify whether the encoder produces
valid bitstreams;
manufacturers of decoders, and their customers, to verify whether the decoder meets the
requirements specified in parts 1, 2 and 3 of the standard for the claimed decoder
capabilities;
applications, to verify whether the characteristics of a given bitstream meet the
application requirements, for example whether the size of the coded picture does not
exceed the maximum value allowed for the application.
2.1.5 MPEG-2
MPEG-2 is an extension of the MPEG-1 international standard for digital compression of audio
and video signals. MPEG-2 is directed at broadcast formats at higher data rates; it provides
extra algorithmic 'tools' for efficiently coding interlaced video, supports a wide range of bit
rates and provides for multichannel surround sound coding.
1. INTRODUCTION:
The MPEG-2 standard [2] is capable of coding standard-definition television at bit rates from
about 3-15 Mbit/s and high-definition television at 15-30 Mbit/s. MPEG-2 extends the stereo
audio capabilities of MPEG-1 to multi-channel surround sound coding. MPEG-2 decoders will
also decode MPEG-1 bit streams.
2. VIDEO FUNDAMENTALS
Television services in Europe currently broadcast video at a frame rate of 25 Hz. Each frame
consists of two interlaced fields, giving a field rate of 50 Hz. The first field of each frame
contains only the odd numbered lines of the frame (numbering the top frame line as line 1).
The second field contains only the even numbered lines of the frame and is sampled in the
video camera 20 ms after the first field. It is important to note that one interlaced frame
contains fields from two instants in time. American television is similarly interlaced but with a
frame rate of just less than 30 Hz.
The red, green and blue (RGB) signals coming from a color television camera can be
equivalently expressed as luminance (Y) and chrominance (UV) components. The
chrominance bandwidth may be reduced relative to the luminance without significantly
affecting the picture quality. For standard definition video, CCIR Recommendation 601 [3]
defines how the component (YUV) video signals can be sampled and digitized to form discrete
pixels. The terms 4:2:2 and 4:2:0 are often used to describe the sampling structure of the
digital picture. 4:2:2 means the chrominance is horizontally subsampled by a factor of two
relative to the luminance; 4:2:0 means the chrominance is horizontally and vertically
subsampled by a factor of two relative to the luminance.
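A sketch of 4:2:0 subsampling (averaging each 2x2 chroma neighbourhood is one common way to do the downsampling; the 576x720 frame size is illustrative):

```python
import numpy as np

def subsample_420(chroma):
    """4:2:0 sampling: halve the chroma plane horizontally and vertically
    by averaging each 2x2 neighbourhood; luminance stays at full resolution."""
    h, w = chroma.shape
    c = chroma.astype(np.float64).reshape(h // 2, 2, w // 2, 2)
    return c.mean(axis=(1, 3))

luma = np.zeros((576, 720))  # luminance plane, kept in full resolution
cb = np.arange(576 * 720, dtype=np.float64).reshape(576, 720)

cb_420 = subsample_420(cb)
print(luma.shape, cb_420.shape)  # (576, 720) (288, 360)
```

Each chroma plane keeps only a quarter of its samples, halving the total raw data before any transform coding even begins.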
3. BIT RATE REDUCTION PRINCIPLES: A bit rate reduction system operates by
removing redundant information from the signal at the coder prior to transmission and
re-inserting it at the decoder. A coder and decoder pair is referred to as a 'codec'. In video
signals, two distinct kinds of redundancy can be identified.
Spatial and temporal redundancy: Pixel values are not independent, but are correlated with
their neighbors both within the same frame and across frames. So, to some extent, the value of
a pixel is predictable given the values of neighboring pixels.
Psycho visual redundancy:
The human eye has a limited response to fine spatial detail [4], and is less sensitive to detail
near object edges or around shot-changes. Consequently, controlled impairments introduced
into the decoded picture by the bit rate reduction process should not be visible to a human
observer.
Two key techniques employed in an MPEG codec are intra-frame Discrete Cosine Transform
(DCT) coding and motion-compensated inter-frame prediction. These techniques have been
successfully applied to video bit rate reduction prior to MPEG, notably for 625-line video
contribution standards at 34 Mbit/s and video conference systems at bit rates below 2 Mbit/s.
4. MPEG-2 DETAILS
In an MPEG-2 system, the DCT and motion-compensated interframe prediction are combined.
The coder subtracts the motion-compensated prediction from the source picture to form a
'prediction error' picture. The prediction error is transformed with the DCT, the coefficients are
quantized and these quantized values coded using a VLC. The coded luminance and
chrominance prediction error is combined with 'side information' required by the decoder, such
as motion vectors and synchronizing information, and formed into a bit stream for
transmission. In the decoder, the quantized DCT coefficients are reconstructed and inverse
transformed to produce the prediction error. This is added to the motion-compensated
prediction generated from previously decoded pictures to produce the decoded output.
Picture types
In MPEG-2, three 'picture types' are defined. The picture type defines which prediction modes
may be used to code each block.
'Intra' pictures (I-pictures) are coded without reference to other pictures. Moderate compression
is achieved by reducing spatial redundancy, but not temporal redundancy. They can be used
periodically to provide access points in the bitstream where decoding can begin.
'Predictive' pictures (P-pictures) can use the previous I- or P-picture for motion compensation
and may be used as a reference for further prediction. Each block in a P-picture can either be
predicted or intra-coded. By reducing spatial and temporal redundancy, P-pictures offer
increased compression compared to I-pictures.
'Bidirectionally-predictive' pictures (B-pictures) can use the previous and next I- or P-pictures
for motion-compensation, and offer the highest degree of compression. Each block in a B-
picture can be forward, backward or bidirectionally predicted or intra-coded. To enable
backward prediction from a future frame, the coder reorders the pictures from natural 'display'
order to 'bitstream' order so that the B-picture is transmitted after the previous and next pictures
it references. This introduces a reordering delay dependent on the number of consecutive B-
pictures.
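The display-to-bitstream reordering can be sketched as follows (a simplified model in which each B-picture is emitted after the next reference picture it depends on):

```python
def to_bitstream_order(display):
    """Reorder pictures so each B arrives after both reference pictures
    it predicts from (the previous and the next I/P picture)."""
    out, pending_b = [], []
    for pic in display:
        if pic.startswith("B"):
            pending_b.append(pic)       # hold B until its future reference is sent
        else:
            out.append(pic)             # send the I/P reference first...
            out.extend(pending_b)       # ...then the Bs that depended on it
            pending_b = []
    return out + pending_b

print(to_bitstream_order(["I0", "B1", "B2", "P3", "B4", "B5", "P6"]))
# ['I0', 'P3', 'B1', 'B2', 'P6', 'B4', 'B5']
```

The two consecutive B-pictures in this example are held back until P3 (and later P6) has been sent, which is the reordering delay the text refers to.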
Buffer control : By removing much of the redundancy from the source images, the coder
outputs a variable bit rate. The bit rate depends on the complexity and predictability of the
source picture and the effectiveness of the motion-compensated prediction.
MPEG-2 is a standard currently in six parts:
Part 1 of MPEG-2 addresses the combining of one or more elementary streams of video and
audio, as well as other data, into single or multiple streams which are suitable for storage or
transmission. This is specified in two forms: the Program Stream and the Transport Stream.
Each is optimized for a different set of applications. The Program Stream is similar to MPEG-1
Systems Multiplex. It results from combining one or more Packetized Elementary Streams
(PES), which have a common time base, into a single stream. The Program Stream is designed
for use in relatively error-free environments and is suitable for applications which may involve
software processing. Program stream packets may be of variable and relatively great length.
The Transport Stream combines one or more Packetized Elementary Streams (PES) with one
or more independent time bases into a single stream. Elementary streams sharing a common
time base form a program. The Transport Stream is designed for use in environments where
errors are likely, such as storage or transmission in lossy or noisy media. Transport stream
packets are 188 bytes long.
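The fixed packet size can be illustrated with a toy packetizer (the 4-byte header here is heavily simplified; a real TS header also carries continuity counters, flags, and optional adaptation fields):

```python
def packetize(pes, pid=0x100):
    """Split a PES payload into fixed 188-byte transport packets:
    a 4-byte header (sync byte 0x47 plus a simplified PID field) and a
    184-byte payload, padding the last packet with 0xFF stuffing bytes."""
    packets = []
    for i in range(0, len(pes), 184):
        chunk = pes[i:i+184]
        header = bytes([0x47, (pid >> 8) & 0x1F, pid & 0xFF, 0x10])
        packets.append(header + chunk + b"\xff" * (184 - len(chunk)))
    return packets

pkts = packetize(b"\x00" * 400)
print(len(pkts), all(len(p) == 188 for p in pkts))  # 3 True
```

The short fixed size is what makes the Transport Stream robust in error-prone channels: a corrupted packet loses at most 188 bytes and the decoder can resynchronize on the next 0x47 sync byte.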
Part 2 of MPEG-2 builds on the powerful video compression capabilities of the MPEG-1
standard to offer a wide range of coding tools. These have been grouped in profiles to offer
different functionalities. Since the final approval of MPEG-2 Video in November 1994, one
additional profile has been developed. This uses existing coding tools of MPEG-2 Video but is
capable of dealing with pictures having a color resolution of 4:2:2 and a higher bit rate. Even
though MPEG-2 Video was not developed with studio applications in mind, a set of
comparison tests carried out by MPEG confirmed that MPEG-2 Video was at least as good as,
and in many cases better than, standards or specifications developed for high-bit-rate or studio
applications.
The Multiview Profile (MVP) is an additional profile currently being developed. By using
existing MPEG-2 Video coding tools it is possible to encode, in an efficient way, two video
sequences issued from two cameras shooting the same scene with a small angle between them.
Part 3 of MPEG-2 - Digital Storage Media Command and Control (DSM-CC) is the
specification of a set of protocols which provides the control functions and operations specific
to managing MPEG-1 and MPEG-2 bitstreams. These protocols may be used to support
applications in both stand-alone and heterogeneous network environments. In the DSM-CC model, a stream is sourced by a Server and delivered to a Client. Both the Server and the Client
are considered to be Users of the DSM-CC network. DSM-CC defines a logical entity called
the Session and Resource Manager (SRM) which provides a (logically) centralized
management of the DSM-CC Sessions and Resources.
Part 4 of MPEG-2 is the specification of a multichannel audio coding algorithm not
constrained to be backwards-compatible with MPEG-1 Audio. The standard was approved
in April 1997.
Part 5 of MPEG-2 was originally planned to be coding of video when input samples are 10
bits. Work on this part was discontinued when it became apparent that there was insufficient
interest from industry for such a standard.
Part 6 of MPEG-2 is the specification of the Real-time Interface (RTI) to Transport Stream
decoders which may be utilised for adaptation to all appropriate networks carrying Transport
Streams.
2.1.6 MPEG-4
Introduction
The creation of the MPEG-4 specification arose as experts wanted higher compression than
MPEG-2, in a form which also worked well at low bit rates. Discussions began at the end of
1992 and work on the standard started in July 1993.
MPEG-4 provides a standardized method of:
1. Audio-visual coding at very low bit rates
2. Describing audio-visual objects in a scene
3. Multiplexing and synchronizing the information associated with the objects
4. Interacting with the audio-visual scene that is received by the end user
Elementary Streams:
Each encoded media object has its own Elementary Stream (ES), which is sent to the decoder
and decoded individually, before composition. The following streams are created in
MPEG-4:
1. Scene Description Stream
2. Object Description Stream
3. Visual Stream
4. Audio Stream
Once the data has been encoded, the streams can be transmitted or stored separately and must be composed at the receiving end. Media objects are organized in a hierarchical manner to form audio-visual scenes. Because of this organization, each media object can be described and encoded independently of the other objects in the scene, e.g. the background.
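This hierarchical organization can be sketched as a simple object tree. The sketch below is illustrative only; the class and object names are invented for the example and are not part of the MPEG-4 specification:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class MediaObject:
    """One node in a hypothetical MPEG-4-style scene tree."""
    name: str
    kind: str                      # e.g. "still image", "video object", "audio"
    children: List["MediaObject"] = field(default_factory=list)

def flatten(obj: MediaObject, depth: int = 0):
    """Walk the scene tree, yielding (depth, name) pairs."""
    yield depth, obj.name
    for child in obj.children:
        yield from flatten(child, depth + 1)

# A toy scene: the background and the presenter are independent objects,
# so each one could be encoded (and replaced) without touching the other.
scene = MediaObject("scene", "scene", [
    MediaObject("background", "still image"),
    MediaObject("presenter", "video object", [
        MediaObject("voice", "audio"),
    ]),
])
```

Because each node is self-contained, swapping the background object would leave the presenter's streams untouched, which is exactly the independence the text describes.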
MPEG-4/BiFS:
- Allows users to change their viewpoint in a 3D scene or to interact with media objects.
- Allows different objects in the same scene to be coded at different levels of quality.
MPEG-4 Systems also addresses:
1. A standard file format to enable the exchange and authoring of MPEG-4 content
2. Interactivity (both client-side and server-side)
3. MPEG-J (MPEG-4 & Java)
4. The FlexMux tool, which allows the interleaving of multiple streams into a single stream
Profiles have been developed to create conformance points for MPEG-4 tools and toolsets, so that interoperability of MPEG-4 products using the same Profiles and Levels can be assured.
A Profile is a subset of the MPEG-4 Systems, Visual or Audio tool set intended for specific applications. It limits the tool set a decoder has to implement, since many applications need only a portion of the MPEG-4 toolset. Profiles specified in the MPEG-4 standard include:
a. Visual Profile
b. Natural Profile
c. Synthetic & Natural/Synthetic Hybrid Profiles
d. Audio Profile
e. Graphic Profile
f. Scene Graph Profile
The Systems part of MPEG-4 addresses the description of the relationship between the audio-visual components that constitute a scene. The relationship is described at two main levels.
The Binary Format for Scenes (BIFS) describes the spatio-temporal arrangement of the objects in the scene. Viewers may have the possibility of interacting with the objects, e.g. by rearranging them on the scene or by changing their own point of view in a 3D virtual environment. The scene description provides a rich set of nodes for 2-D and 3-D composition operators and graphics primitives.
At a lower level, Object Descriptors (ODs) define the relationship between the Elementary Streams pertinent to each object (e.g. the audio and the video stream of a participant in a videoconference). ODs also provide additional information.
2.1.7 MPEG-7
Introduction
The MPEG standards are an evolving set of standards for video and audio compression. MPEG-7 covers the most recent developments in multimedia search and retrieval, and is designed to standardize the description of multimedia content, supporting a wide range of applications including DVD, CD and HDTV.
MPEG-7 is a seven-part specification, formally entitled Multimedia Content Description
Interface. It provides standardized tools for describing multimedia content, which will enable
searching, filtering and browsing of multimedia content.
ISO 15938-1 Systems
MPEG-7 descriptions exist in two formats:
Textual: XML, which allows editing, searching and filtering of a multimedia description. The description can be located anywhere, not necessarily with the content.
Binary: a format suitable for storing, transmitting and streaming delivery of the multimedia description.
The MPEG-7 Systems part provides the tools for:
a. Preparation of a binary coded representation of MPEG-7 descriptions, for efficient storage and transmission
b. Transmission techniques (both textual and binary formats)
c. Multiplexing of descriptions
d. Synchronization of descriptions with content
e. Intellectual property management and protection
f. Terminal architecture
g. Normative interface
Descriptions may be represented in two forms:
- Textual (XML)
- Binary (BiM, the Binary format for Metadata). The binary coded representation is useful for efficient storage and transmission of content.
MPEG-7 data is obtained from transport or storage and handed to the delivery layer. This layer extracts the elementary streams (consisting of individually accessible chunks called access units) by undoing the transport/storage-specific framing and multiplexing, and retains the timing information needed for synchronization.

The elementary streams are then forwarded to the compression layer, where the schema streams (schemas describing the structure of MPEG-7 data) and the partial or full description streams (streams describing the content) are decoded.
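The delivery-layer step of undoing the framing to recover timed access units can be sketched as follows. The framing format here (a 4-byte timestamp plus a 2-byte length before each payload) is purely hypothetical, invented for the example; MPEG-7 leaves the actual framing to the transport/storage layer:

```python
import struct
from dataclasses import dataclass
from typing import List

@dataclass
class AccessUnit:
    """An individually accessible chunk of an elementary stream."""
    timestamp: int
    payload: bytes

def extract_access_units(stream: bytes) -> List[AccessUnit]:
    """Undo the (hypothetical) framing: 4-byte timestamp, 2-byte length, payload."""
    units, pos = [], 0
    while pos + 6 <= len(stream):
        ts, length = struct.unpack_from(">IH", stream, pos)
        pos += 6
        units.append(AccessUnit(ts, stream[pos:pos + length]))
        pos += length
    return units
```

The retained timestamps are what the terminal later uses to synchronize the decoded descriptions with the content.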
MPEG-7 tools
MPEG-7 uses the following tools:
- Descriptor (D): a representation of a feature, defined syntactically and semantically. A single object may be described by several descriptors.
- Description Scheme (DS): specifies the structure and semantics of the relations between its components; these components can be descriptors (D) or other description schemes (DS).
- Description Definition Language (DDL): an XML-based language used to define the structural relations between descriptors. It allows the creation and modification of description schemes and also the creation of new descriptors (D).
- System tools: deal with the binarization, synchronization, transport and storage of descriptors.
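As a rough illustration of how these tools nest, the sketch below builds a tiny textual (XML) description in which a description scheme contains a descriptor. The element and attribute names are simplified stand-ins invented for the example, not the normative MPEG-7 schema:

```python
import xml.etree.ElementTree as ET

def build_description() -> str:
    """Build a toy MPEG-7-style textual description: a DS wrapping a D."""
    root = ET.Element("Mpeg7")
    # Description Scheme (DS): structures the relations between components.
    ds = ET.SubElement(root, "DescriptionScheme", name="VideoSegment")
    # Descriptor (D): a syntactic/semantic representation of one feature.
    d = ET.SubElement(ds, "Descriptor", name="DominantColor")
    d.text = "128 64 32"
    return ET.tostring(root, encoding="unicode")
```

The same tree could equally be serialized in a binary form (BiM) for storage or streaming; only the representation changes, not the description.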
2.1.8 H.261
H.261 is an ITU-T video coding standard, ratified in November 1988. It was originally designed for transmission over ISDN lines, on which data rates are multiples of 64 kbit/s. It is one member of the H.26x family of video coding standards developed by the ITU-T Video Coding Experts Group (VCEG). The coding algorithm was designed to operate at video bit rates between 40 kbit/s and 2 Mbit/s. The standard supports two video frame sizes, CIF (352x288 luma with 176x144 chroma) and QCIF (176x144 with 88x72 chroma), using a 4:2:0 sampling scheme. It also has a backward-compatible trick for sending still picture graphics with 704x576 luma resolution and 352x288 chroma resolution (added in a later revision in 1993).
The main steps followed by this standard are:
1. Loop filter
The prediction process may be modified by a two-dimensional spatial filter (FIL) which operates on pixels within a predicted 8 by 8 block. The filter is separable into one-dimensional horizontal and vertical functions. Both are non-recursive, with coefficients of 1/4, 1/2, 1/4 except at block edges, where one of the taps would fall outside the block; in such cases the 1-D filter is changed to have coefficients of 0, 1, 0. Full arithmetic precision is retained, with rounding to 8-bit integer values at the 2-D filter output. Values whose fractional part is one half are rounded up. The filter is switched on/off for all six blocks in a macro block according to the macro block type.
2. Transformer
Transmitted blocks are first processed by a separable two-dimensional discrete cosine transform of size 8 by 8. The output from the inverse transform ranges from -256 to +255 after clipping, so that it can be represented with 9 bits. The transfer function of the inverse transform is given by:

f(x, y) = (1/4) * sum over u, v = 0..7 of C(u) C(v) F(u, v) cos(pi(2x+1)u/16) cos(pi(2y+1)v/16),
where C(u) = 1/sqrt(2) for u = 0 and C(u) = 1 otherwise (and likewise for C(v)).

NOTE - Within the block being transformed, x = 0 and y = 0 refer to the pel nearest the left and top edges of the picture, respectively.

The arithmetic procedures for computing the transforms are not defined, but the inverse transform should meet the specified error tolerance.
3. Quantization
The number of quantizers is 1 for the INTRA dc coefficient and 31 for all other coefficients.
Within a macro block the same quantizer is used for all coefficients except the INTRA dc one.
The decision levels are not defined. The INTRA dc coefficient is nominally the transform
value linearly quantized with a step size of 8 and no dead-zone. Each of the other 31 quantizers
is also nominally linear but with a central dead-zone around zero and with a step size of an
even value in the range 2 to 62.
Clipping of reconstructed picture
To prevent quantization distortion of transform coefficient amplitudes causing arithmetic
overflow in the encoder and decoder loops, clipping functions are inserted. The clipping
function is applied to the reconstructed picture which is formed by summing the prediction and
the prediction error as modified by the coding process. This clipper operates on resulting pel
values less than 0 or greater than 255, changing them to 0 and 255, respectively.
A number of video compression tools have been developed already:
- Video Compressor
- AVI Compressor
- MP4 Compressor
- MPEG Compressor
- 3GP Compressor
- YouTube Compressor
- iPod Compressor
- Flash Video Compressor
- QuickTime Compressor
- WMV Compressor
- MKV Compressor
- VOB Compressor
- DVD Compressor
2.2 Methodology Adopted
H.261
H.261 is an ITU-T video coding standard, ratified in November 1988. It was originally designed for transmission over ISDN lines, on which data rates are multiples of 64 kbit/s. It is one member of the H.26x family of video coding standards developed by the ITU-T Video Coding Experts Group (VCEG). The coding algorithm was designed to operate at video bit rates between 40 kbit/s and 2 Mbit/s. The standard supports two video frame sizes, CIF (352x288 luma with 176x144 chroma) and QCIF (176x144 with 88x72 chroma), using a 4:2:0 sampling scheme. It also has a backward-compatible trick for sending still picture graphics with 704x576 luma resolution and 352x288 chroma resolution (added in a later revision in 1993).
The main steps followed by this standard are:

1. Loop filter
The prediction process may be modified by a two-dimensional spatial filter (FIL) which operates on pixels within a predicted 8 by 8 block. The filter is separable into one-dimensional horizontal and vertical functions. Both are non-recursive, with coefficients of 1/4, 1/2, 1/4 except at block edges, where one of the taps would fall outside the block; in such cases the 1-D filter is changed to have coefficients of 0, 1, 0. Full arithmetic precision is retained, with rounding to 8-bit integer values at the 2-D filter output. Values whose fractional part is one half are rounded up. The filter is switched on/off for all six blocks in a macro block according to the macro block type.
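The loop filter can be sketched directly from this description: taps of 1/4, 1/2, 1/4 inside the block, 0, 1, 0 at the edges, full precision kept until a final rounding in which halves round up. This is an illustrative reimplementation of the text, not the normative code:

```python
import math

def filter_1d(samples):
    """Apply the non-recursive (1/4, 1/2, 1/4) filter; (0, 1, 0) at the edges."""
    n = len(samples)
    out = []
    for i in range(n):
        if i == 0 or i == n - 1:
            out.append(float(samples[i]))   # edge taps 0, 1, 0: pass through
        else:
            out.append((samples[i - 1] + 2 * samples[i] + samples[i + 1]) / 4)
    return out

def loop_filter(block):
    """2-D separable filter over an 8x8 block, rounding halves up at the output."""
    rows = [filter_1d(row) for row in block]                              # horizontal pass
    cols = [filter_1d([rows[r][c] for r in range(8)]) for c in range(8)]  # vertical pass
    return [[int(math.floor(cols[c][r] + 0.5))                            # round half up
             for c in range(8)] for r in range(8)]
```

A flat block passes through unchanged, while an isolated spike is spread over its neighbours, which is the smoothing effect the filter is there to provide.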
2. Transformer
Transmitted blocks are first processed by a separable two-dimensional discrete cosine transform of size 8 by 8. The output from the inverse transform ranges from -256 to +255 after clipping, so that it can be represented with 9 bits. The transfer function of the inverse transform is given by:

f(x, y) = (1/4) * sum over u, v = 0..7 of C(u) C(v) F(u, v) cos(pi(2x+1)u/16) cos(pi(2y+1)v/16),
where C(u) = 1/sqrt(2) for u = 0 and C(u) = 1 otherwise (and likewise for C(v)).

NOTE - Within the block being transformed, x = 0 and y = 0 refer to the pel nearest the left and top edges of the picture, respectively.

The arithmetic procedures for computing the transforms are not defined, but the inverse transform should meet the specified error tolerance.
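The inverse-transform definition can be turned into a direct, unoptimized sketch. Real codecs use fast factorizations; this brute-force version exists only to make the formula concrete:

```python
import math

def c(k: int) -> float:
    """Normalization factor: 1/sqrt(2) for the DC index, 1 otherwise."""
    return 1.0 / math.sqrt(2.0) if k == 0 else 1.0

def dct8x8(f):
    """Forward 8x8 DCT: coefficients F(u, v) from pel values f(x, y)."""
    return [[0.25 * c(u) * c(v) * sum(
                f[x][y]
                * math.cos((2 * x + 1) * u * math.pi / 16)
                * math.cos((2 * y + 1) * v * math.pi / 16)
                for x in range(8) for y in range(8))
             for v in range(8)] for u in range(8)]

def idct8x8(F):
    """Inverse 8x8 DCT, matching the transfer function quoted in the text."""
    return [[0.25 * sum(
                c(u) * c(v) * F[u][v]
                * math.cos((2 * x + 1) * u * math.pi / 16)
                * math.cos((2 * y + 1) * v * math.pi / 16)
                for u in range(8) for v in range(8))
             for y in range(8)] for x in range(8)]
```

Running `idct8x8(dct8x8(block))` reproduces the input up to floating-point error, which is also how an implementation can be checked against the standard's error-tolerance requirement on the inverse transform.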
3. Quantization
The number of quantizers is 1 for the INTRA dc coefficient and 31 for all other coefficients.
Within a macro block the same quantizer is used for all coefficients except the INTRA dc one.
The decision levels are not defined. The INTRA dc coefficient is nominally the transform
value linearly quantized with a step size of 8 and no dead-zone. Each of the other 31 quantizers
is also nominally linear but with a central dead-zone around zero and with a step size of an
even value in the range 2 to 62.
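A minimal sketch of the two kinds of quantizer follows. Since the standard deliberately leaves the decision levels undefined, the dead-zone rule used here (zeroing everything with magnitude below one step) is just one plausible encoder choice, not the normative behaviour:

```python
def quantize_intra_dc(coef: int) -> int:
    """INTRA dc coefficient: linear quantizer, step size 8, no dead zone."""
    return round(coef / 8.0)

def quantize_ac(coef: int, step: int) -> int:
    """Other coefficients: nominally linear, even step size in 2..62, with a
    central dead zone around zero (the exact decision levels are a
    hypothetical encoder choice, not fixed by the standard)."""
    assert step % 2 == 0 and 2 <= step <= 62
    if abs(coef) < step:          # central dead zone: small values drop to 0
        return 0
    sign = 1 if coef > 0 else -1
    return sign * (abs(coef) // step)
```

The dead zone is what suppresses small (mostly noise-like) transform coefficients and gives the coder most of its bit savings.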
Clipping of reconstructed picture
To prevent quantization distortion of transform coefficient amplitudes causing arithmetic
overflow in the encoder and decoder loops, clipping functions are inserted. The clipping
function is applied to the reconstructed picture which is formed by summing the prediction and
the prediction error as modified by the coding process. This clipper operates on resulting pel
values less than 0 or greater than 255, changing them to 0 and 255, respectively.
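The clipping stage is simple enough to state exactly from the text; a short sketch (with invented helper names) is:

```python
def clip_pel(value: int) -> int:
    """Force a reconstructed pel into the representable range 0..255."""
    return 0 if value < 0 else 255 if value > 255 else value

def reconstruct_block(prediction, error):
    """Sum the prediction and the prediction error, clipping each resulting
    pel so that quantization distortion cannot cause arithmetic overflow
    in the encoder and decoder loops."""
    return [[clip_pel(p + e) for p, e in zip(prow, erow)]
            for prow, erow in zip(prediction, error)]
```

Because both encoder and decoder apply the same clip, their reconstruction loops stay in step even when the quantized error would otherwise push a pel out of range.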
Chapter -3 Project Estimation and Implementation Plan

3.1 Cost and Benefit Analysis
3.1.1 ECONOMICAL
Economic analysis is the most frequently used method for evaluating a candidate system. More commonly known as cost-benefit analysis, the procedure is to determine the benefits and savings that are expected from the candidate system and compare them with the costs. If the benefits outweigh the costs, the decision is made to design and implement the system; otherwise, further justification or alterations are made to the proposed system.
This project has few hardware requirements, so the overall cost of installing the software is low.

From the point of view of economy, manual handling of hardware components is cheaper than a computerized system, and this approach normally works well in an ordinary organization. The major problems start when the number of hardware components grows over time. A manual system needs various registers/books to record the daily complaint entries and hardware entries. In case of any misplacement of a hardware component, the concerned registers have to be searched to verify the status of that component. Maintaining all of this manually is a very cumbersome job, whereas it is easy to maintain in the proposed system.
3.1.2 COST ANALYSIS
The cost of conducting the investigation was negligible, as the center manager and teachers of the center provided most of the information. The cost of the essential hardware and software requirements is not very high.
Moreover, hardware such as a Pentium-class PC and software such as MATLAB are easily available in the market.
3.1.3 BENEFITS AND SAVINGS
- The cost of maintaining the proposed system is negligible.
- Money is saved as paperwork is minimized.
- Records are easily entered and retrieved.
- Time is saved as all the work can be done with a simple mouse click.
- The proposed system is fully automated and hence easy to use.
Since the benefits outweigh the costs, the project is economically feasible.
3.2 Schedule Estimate
This table lists each activity used to accomplish the project, with its estimated completion date, duration and effort.

Activity                                     Completion Date   Duration (Days)   Effort (Man-hours)
A) Introduction                              20 AUG 2010       20                250
B) Problem Analysis                          15 SEP 2010       25                370
C) Project Estimation & Implementation Plan  20 OCT 2010       35                520
D) Research Design                           10 NOV 2010       20                300
E) System Interface Design                   10 DEC 2010       30                450
F) Coding                                    20 FEB 2011       30                600
G) Experiments Specification                 10 MAR 2011       20                300
H) Conclusions                               25 MAR 2011       15                20
I) User Manual                               10 APR 2011       15                20
3.3 Gantt Chart
A Gantt chart is a horizontal bar chart developed as a production control tool in 1917 by Henry L. Gantt, an American engineer and social scientist. Frequently used in project management, a Gantt chart provides a graphical illustration of a schedule that helps to plan, coordinate, and track the specific tasks in a project.

Gantt charts may be simple versions created on graph paper or more complex automated versions created using project management applications such as Microsoft Project or Excel.

A Gantt chart is constructed with a horizontal axis representing the total time span of the project, broken down into increments (for example, days, weeks, or months), and a vertical axis representing the tasks that make up the project (for example, if the project is outfitting your computer with new software, the major tasks involved might be: conduct research, choose software, install software). Horizontal bars of varying lengths represent the sequences, timing, and time span of each task. Using the same example, you would put "conduct research" at the top of the vertical axis and draw a bar on the graph representing the amount of time you expect to spend on the research, then enter the other tasks below the first one with representative bars at the points in time when you expect to undertake them. The bar spans may overlap, as, for example, you may conduct research and choose software during the same time span. As the project progresses, secondary bars, arrowheads, or darkened bars may be added to indicate completed tasks or the portions of tasks that have been completed. A vertical line is used to represent the report date.
Scheduling of the SDLC (Gantt chart): the phases Analysis, Documentation, Design, Coding and Testing are plotted against a time axis running from 0 to 35 weeks.
References
[1] HUFFMAN, D. A. (1951). A method for the construction of minimum redundancy codes. Proceedings of the Institute of Radio Engineers, 40, pp. 1098-1101.
[2] CAPON, J. (1959). A probabilistic model for run-length coding of pictures. IRE Transactions on Information Theory, IT-5(4), pp. 157-163.
[3] APOSTOLOPOULOS, J. G. (2004). Video Compression. Streaming Media Systems Group.
[4] The Moving Picture Experts Group home page. (3 Feb. 2006)
[5] CLARKE, R. J. (1995). Digital compression of still images and video. London: Academic Press.
[6] http://www.irf.uka.de/seminare/redundanz/vortrag15/ (3 Feb. 2006)
[7] PEREIRA, F. The MPEG-4 Standard: Evolution or Revolution?
[8] MANNING, C. The digital video site.
[9] SEFERIDIS, V. E. and GHANBARI, M. (1993). General approach to block-matching motion estimation. Optical Engineering, (32), pp. 1464-1474.
[10] GHARAVI, H. and MILLS, M. (1990). Block matching motion estimation algorithms: new results. IEEE Transactions on Circuits and Systems, (37), pp. 649-651.
[11] CHOI, W. Y. and PARK, R. H. (1989). Motion vector coding with conditional transmission. Signal Processing, (18), pp. 259-267.
[12] Institut für Informatik, Universität Karlsruhe.