
Project Report

ON

VIDEO COMPRESSION

Submitted in partial fulfillment of the requirement for the degree of
B. Tech. in Computer Science & Engineering

Under the Guidance of: Mr. Roshan Singh (Assistant Professor)

Submitted By: Amit Saini (Roll No.)
Raj Kamal Sharma (Roll No.)
Sandeep Yadav (Roll No.)

MANAV RACHNA COLLEGE OF ENGINEERING
FARIDABAD
BATCH (2007-2011)


Table of Contents

Chapter 1. Introduction
1.1 Objective(s) of the System/Tool
1.2 Scope of the System/Tool
1.3 Problem Definition of the System/Tool
1.4 Hardware and Software Requirements

Chapter 2. Problem Analysis
2.1 Literature Survey
2.1.1 Introduction
2.1.2 Video Compression Technology
2.1.3 Compression Standards
2.1.4 MPEG-1
2.1.5 MPEG-2
2.1.6 MPEG-4
2.1.7 MPEG-7
2.1.8 H.261
2.2 Methodology Adopted

Chapter 3. Project Estimation and Implementation Plan
3.1 Cost and Benefit Analysis
3.2 Schedule Estimate
3.3 PERT Chart / Gantt Chart

References


Chapter 1. Introduction

1.1 OBJECTIVE

1.1.1 NEED OF THE SYSTEM: Uncompressed video (and audio) data are huge. In HDTV, the bit rate easily exceeds 1 Gbps, which creates big problems for storage and network communications. For example, one of the formats defined for HDTV broadcasting within the United States is 1920 pixels horizontally by 1080 lines vertically, at 30 frames per second. If these numbers are all multiplied together, along with 8 bits for each of the three primary colors, the total data rate required would be approximately 1.5 Gb/sec. Because of the 6 MHz channel bandwidth allocated, each channel will only support a data rate of 19.2 Mb/sec, which is further reduced to 18 Mb/sec by the fact that the channel must also support audio, transport, and ancillary data information. As can be seen, this restriction in data rate means that the original signal must be compressed by a factor of approximately 83:1. This number seems all the more impressive when it is realized that the intent is to deliver very high quality video to the end user, with as few visible artifacts as possible.
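As a quick sanity check, this arithmetic can be reproduced in MATLAB; the 18 Mb/sec payload figure is taken from the text above:

```matlab
% Back-of-the-envelope check of the HDTV figures quoted above.
width = 1920; height = 1080; fps = 30;
bitsPerPixel = 3 * 8;                          % 8 bits for each primary color
rawRate = width * height * fps * bitsPerPixel; % uncompressed bits per second
fprintf('Uncompressed rate: %.2f Gb/sec\n', rawRate / 1e9);  % ~1.49 Gb/sec
channelRate = 18e6;                            % usable payload from the text
fprintf('Required compression: %.0f:1\n', rawRate / channelRate); % ~83:1
```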

1.1.2 OUTCOME OF THE SYSTEM:

Video Compressor is multifunctional video compression software that helps you compress video files to a smaller file size. With comprehensive video format support, plentiful profiles and handy tools provided, this Video Compressor is an ideal video file compressor and video size compressor.

The system described is a digital video compression system, with methods for compressing digitized video signals in real time. The compressor receives digitized video frames divided into subframes and performs, in a single pass, a spatial-domain-to-transform-domain transformation in two dimensions of the picture elements of each subframe. It normalizes the resultant coefficients by a normalization factor having a predetermined compression ratio component and an adaptive rate buffer capacity control feedback component, to provide compression; it then encodes the coefficients and stores them in a first rate buffer memory asynchronously at a high data transfer rate, from which they are put out at a slower, synchronous rate. The compressor adaptively determines the rate buffer capacity control feedback component from the instantaneous data content of the rate buffer memory in relation to its capacity, and it controls the absolute quantity of data resulting from the normalization step so that the buffer memory is never completely emptied and never completely filled. In expansion, the system essentially mirrors the steps performed during compression. An efficient, high-speed decoder forms an important aspect of the system, and the compression system forms an important element of a color broadcast compression system.
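A minimal MATLAB sketch of the adaptive rate-buffer feedback idea described above; every constant here (buffer capacity, base normalization factor, channel drain) is an illustrative assumption, not a value from the system itself:

```matlab
% Toy rate-buffer feedback loop: the normalization factor grows as the buffer
% fills, so fewer coefficient bits are produced and the buffer never runs
% completely full or completely empty. dct2 is from the Image Processing Toolbox.
subframes = arrayfun(@(k) rand(8) * 255, 1:4, 'UniformOutput', false);
bufCap = 5000; buf = bufCap / 2;     % assumed buffer capacity and initial fill (bits)
baseN = 8; drain = 400;              % fixed ratio component; channel drain per subframe
for k = 1:numel(subframes)
    N = baseN * (0.5 + buf / bufCap);   % adaptive normalization factor
    q = round(dct2(subframes{k}) / N);  % transform and normalize the subframe
    buf = buf + nnz(q) * 8 - drain;     % crude coded-size estimate vs. channel drain
end
```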

1.2 Scope of the System/Tool

What will the tool be able to do? The system tool covers these MATLAB areas:

Desktop Tools and Development Environment - Startup and shutdown, arranging the desktop, and using tools to become more productive with MATLAB
Data Import and Export - Retrieving and storing data, memory-mapping, and accessing Internet files
Mathematics - Mathematical operations
Data Analysis - Data analysis, including data fitting, Fourier analysis, and time-series tools
Programming Fundamentals - The MATLAB language and how to develop MATLAB applications
Object-Oriented Programming - Designing and implementing MATLAB classes
Graphics - Tools and techniques for plotting, graph annotation, printing, and programming with Handle Graphics objects
3-D Visualization - Visualizing surface and volume data, transparency, and viewing and lighting techniques
Creating Graphical User Interfaces - GUI-building tools and how to write callback functions
External Interfaces - MEX-files, the MATLAB engine, and interfacing to Sun Microsystems Java software, Microsoft .NET Framework, COM, Web services, and the serial port

There is reference documentation for all MATLAB functions:

Function Reference - Lists all MATLAB functions, by category or alphabetically
Handle Graphics Property Browser - Provides easy access to descriptions of graphics object properties


C/C++ and Fortran API Reference - Covers functions used by the MATLAB external interfaces, providing information on syntax in the calling language, description, arguments, return values, and examples

The MATLAB application can read data in various file formats, discussed in the following sections:

Recommended Methods for Importing Data
Importing MAT-Files
Importing Text Data Files
Importing XML Documents
Importing Excel Spreadsheets
Importing Scientific Data Files
Importing Images
Importing Audio and Video
Importing Binary Data with Low-Level I/O
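Since this project compresses video in MATLAB, the import step might look like the following sketch, using the current-MATLAB VideoReader API ('input.avi' is a placeholder file name):

```matlab
% Minimal sketch: reading video frames into MATLAB for compression experiments.
v = VideoReader('input.avi');    % hypothetical input file
frames = {};
while hasFrame(v)
    frames{end+1} = rgb2gray(readFrame(v));  % grayscale frames for block coding
end
fprintf('Read %d frames of %dx%d\n', numel(frames), v.Height, v.Width);
```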

1.4 Hardware and Software Requirements

1.4.1 HARDWARE REQUIREMENTS:

512 MB RAM
10 GB hard disk

1.4.2 SOFTWARE REQUIREMENTS:

1. Operating system: Windows or Linux
2. MATLAB


Chapter 2. Problem Analysis

2.1 Literature Survey

2.1.1 Introduction (Background Work):

Video compression typically operates on square-shaped groups of neighboring pixels, often called macroblocks. These blocks of pixels are compared from one frame to the next, and the video compression codec (encode/decode scheme) sends only the differences within those blocks. This works extremely well if the video has no motion. A still frame of

    text, for example, can be repeated with very little transmitted data. In areas of video with more

    motion, more pixels change from one frame to the next. When more pixels change, the video

    compression scheme must send more data to keep up with the larger number of pixels that are

    changing. If the video content includes an explosion, flames, a flock of thousands of birds, or

    any other image with a great deal of high-frequency detail, the quality will decrease, or the

    variable bit rate must be increased to render this added information with the same level of

    detail.

    Video is basically a three-dimensional array of color pixels. Two dimensions serve as spatial

    (horizontal and vertical) directions of the moving pictures, and one dimension represents the

time domain. A data frame is the set of all pixels that correspond to a single moment in time. Basically, a frame is the same as a still picture.

    Some forms of data compression are lossless. This means that when the data is decompressed,

    the result is a bit-for-bit perfect match with the original. While lossless compression of video is

    possible, it is rarely used, as lossy compression results in far higher compression ratios at an

    acceptable level of quality.

    One of the most powerful techniques for compressing video is interframe compression.

    Interframe compression uses one or more earlier or later frames in a sequence to compress the

    current frame, while intraframe compression uses only the current frame, which is effectively

    image compression.

    The most commonly used method works by comparing each frame in the video with the

previous one. If the frame contains areas where nothing has moved, the system simply issues a short command that copies that part of the previous frame into the next one.
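A minimal sketch of this block-differencing idea, assuming two grayscale frames on disk; the file names, 8x8 block size and threshold are all illustrative:

```matlab
% Interframe coding by block differencing: only blocks that changed beyond a
% threshold would be encoded and sent; the rest are copied from the last frame.
prev = im2double(imread('frame1.png'));   % hypothetical grayscale frames
curr = im2double(imread('frame2.png'));
B = 8; thresh = 0.01; changed = 0;
for r = 1:B:size(curr,1) - B + 1
    for c = 1:B:size(curr,2) - B + 1
        d = curr(r:r+B-1, c:c+B-1) - prev(r:r+B-1, c:c+B-1);
        if mean(abs(d(:))) > thresh
            changed = changed + 1;   % this block would be transmitted
        end                          % otherwise: reuse previous frame's block
    end
end
fprintf('%d blocks need transmission\n', changed);
```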

Wavelet compression, another approach, applies the transform to the entire image, in contrast to other methods (DCT) that work on smaller pieces of the desired data. The result is a hierarchical representation of an image, where each layer represents a frequency band.

2.1.3 Compression Standards (Techniques for solving the problem):

MPEG stands for the Moving Picture Experts Group. MPEG is an ISO/IEC working group, established in 1988 to develop standards for digital audio and video formats. There are four MPEG standards being used or in development. Each compression standard was designed with a specific application and bit rate in mind, although MPEG compression scales well with increased bit rates. They include:

2.1.3.1 MPEG-1

Designed for bit rates up to 1.5 Mbit/s, this is the standard for the compression of moving pictures and audio. It was based on CD-ROM video applications and is a popular standard for video on the Internet, transmitted as .mpg files. In addition, Layer 3 of MPEG-1 Audio is the most popular standard for digital compression of audio, known as MP3. MPEG-1 is the compression standard for VideoCD, the most popular video distribution format throughout much of Asia.

2.1.3.2 MPEG-2

Designed for bit rates between 1.5 and 15 Mbit/s, this is the standard on which digital television set-top boxes and DVD compression are based. It is based on MPEG-1, but designed for the compression and transmission of digital broadcast television. The most significant enhancement over MPEG-1 is its ability to efficiently compress interlaced video. MPEG-2 scales well to HDTV resolution and bit rates, obviating the need for an MPEG-3.

2.1.3.3 MPEG-4

The standard for multimedia and Web compression. MPEG-4 is based on object-based compression, similar in nature to the Virtual Reality Modeling Language. Individual objects within a scene are tracked separately and compressed together to create an MPEG-4 file. This results in very efficient compression that is very scalable, from low bit rates to very high. It also allows developers to control objects independently in a scene, and therefore introduce interactivity.

2.1.3.4 MPEG-7

This standard, currently under development, is also called the Multimedia Content Description Interface. When released, the group hopes the standard will provide a framework for multimedia content that will include information on content manipulation, filtering and personalization, as well as the integrity and security of the content. Contrary to the previous MPEG standards, which described actual content, MPEG-7 will represent information about the content.

2.1.3.5 H.261

H.261 is an ITU standard designed for two-way communication over ISDN lines (video conferencing) and supports data rates which are multiples of 64 kbit/s. The algorithm is based on the DCT, can be implemented in hardware or software, and uses intraframe and interframe compression. H.261 supports CIF and QCIF resolutions.

MPEG-4 advantages include high compression, low bit rate and motion compensation support. Disadvantages are latency and blocking artifacts. JPEG, JPEG2000, and MPEG-4 have all been used in video surveillance systems, with the choice depending on what is most important in that particular application. H.264 is an advanced compression scheme which is also starting to find its way into video surveillance systems. H.264 offers greater compression at the expense of additional hardware complexity. It is not examined in this paper, although FPGA-based solutions for H.264 are available.

2.1.3.6 MPEG-21

The MPEG-21 standard, from the Moving Picture Experts Group, aims at defining an open framework for multimedia applications. MPEG-21 is ratified in the standard ISO/IEC 21000 - Multimedia Framework (MPEG-21).

MPEG-21 is based on two essential concepts:

the definition of a Digital Item (a fundamental unit of distribution and transaction)
users interacting with Digital Items


Digital Items can be considered the kernel of the Multimedia Framework, and users can be considered those who interact with them inside the Multimedia Framework. At its most basic level, MPEG-21 provides a framework in which one user interacts with another, and the object of that interaction is a Digital Item. Accordingly, the main objective of MPEG-21 is to define the technology needed to support users in exchanging, accessing, consuming, trading and manipulating Digital Items in an efficient and transparent way.

The MPEG-21 file format specifies the storage of an MPEG-21 Digital Item in a file format based on the ISO base media file format, with some or all of the Digital Item's ancillary data (such as movies, images or other non-XML data) within the same file.

2.1.3.7 H.263

H.263 is a video compression standard originally designed as a low-bit-rate compressed format for video conferencing. It was developed by the ITU-T Video Coding Experts Group (VCEG).

H.263 has since found many applications on the Internet: much Flash video content (as used on sites such as YouTube, Google Video, MySpace, etc.) used to be encoded in Sorenson Spark format, though many sites now use VP6 or H.264 encoding.

H.263 was developed as an evolutionary improvement based on experience from H.261, the previous ITU-T standard for video compression, and the MPEG-1 and MPEG-2 standards. Its first version was completed in 1995 and provided a suitable replacement for H.261 at all bit rates. It was further enhanced in projects known as H.263v2. MPEG-4 Part 2 is H.263-compatible in the sense that a basic H.263 bit stream is correctly decoded by an MPEG-4 Video decoder.

2.1.3.8 H.264

The next enhanced codec developed by ITU-T VCEG (in partnership with MPEG) after H.263 is the H.264 standard, also known as AVC and MPEG-4 Part 10. As H.264 provides a significant improvement in capability beyond H.263, the H.263 standard is now considered a legacy design. Most new videoconferencing products now include H.264 as well as H.263 and H.261 capabilities. H.264 is used in such applications as players for Blu-ray Discs, videos from YouTube and the iTunes Store, web software such as the Adobe Flash Player and Microsoft Silverlight, broadcast services for DVB and SBTVD, direct-broadcast satellite television services, cable television services, and real-time videoconferencing.

2.1.4 MPEG-1

The Coding of Moving Pictures and Associated Audio for Digital Storage Media at up to about 1.5 Mbit/s (ISO/IEC 11172), or MPEG-1 as it is more commonly known, standardizes the storage and retrieval of moving pictures and audio on storage media, and forms the basis for the Video CD and MP3 formats.

This part of the specification describes the coded representation for the compression of video sequences.

The basic idea of MPEG video compression is to discard any unnecessary information. An MPEG-1 encoder analyses:

how much movement there is in the current frame compared to the previous frame
what changes of color have taken place since the last frame
what changes in light or contrast have taken place since the last frame
what elements of the picture have remained static since the last frame

The encoder then looks at each individual pixel to see if movement has taken place. If there has been no movement, the encoder stores an instruction to repeat the same frame, or to repeat the same frame but move it to a different position.

MPEG-1 uses three frame types:

1. I: intra frames
2. B: bidirectional frames
3. P: predicted frames


Audio, video and time codes are converted into one single stream. MPEG-1 handles:

625- and 525-line video
bit rates from 1 to 1.5 Mbit/s
24-30 frames per second

MPEG-1 compression treats video as a sequence of separate images. Picture elements, often referred to as pixels, are the elements of the image. Each pixel consists of three components: one for luminance (Y) and two for chrominance (Cb and Cr). MPEG-1 encodes the Y component at full resolution, as the Human Visual System (HVS) is most sensitive to luminance.

MPEG-1 draws on several coding techniques:

Quantization
Predictive coding: the difference between the predicted pixel value and the real value is coded.
Motion compensation (MC): predicts the values of a block of pixels (1 block = 8x8 pixels) in an image from those of a known neighboring block. A vector describes the two-dimensional movement; if no movement takes place, the value is 0. A block-matching sketch follows this list.
Interframe coding
Sequential coding
VLC (Variable Length Coding)
Image interpolation
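A minimal sketch of exhaustive block-matching motion estimation, in the spirit of the block-matching literature [9][10]; the frame file names, block position, block size and search range are all illustrative assumptions:

```matlab
% Exhaustive-search block matching for one 8x8 block: find the displacement
% in the previous frame that minimizes the sum of absolute differences (SAD).
ref = double(imread('prev.png'));   % hypothetical grayscale frames,
cur = double(imread('curr.png'));   % assumed larger than ~90x90 pixels
B = 8; S = 7;                       % block size and +/- search range
r0 = 65; c0 = 65;                   % top-left corner of the current block
blk = cur(r0:r0+B-1, c0:c0+B-1);
best = inf; mv = [0 0];
for dr = -S:S
    for dc = -S:S
        cand = ref(r0+dr:r0+dr+B-1, c0+dc:c0+dc+B-1);
        sad = sum(abs(blk(:) - cand(:)));
        if sad < best, best = sad; mv = [dr dc]; end
    end
end
fprintf('Motion vector: (%d, %d), SAD = %g\n', mv(1), mv(2), best);
```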

Intra frames (I-frames) are coded independently of other images.

MPEG-1 codes images progressively: interlaced images need to be converted into a de-interlaced format before encoding; the video is then encoded, and the encoded video is converted back into an interlaced form.

To achieve a high compression ratio, an appropriate spatial resolution for the signal is chosen and the image is broken down into blocks of pixels; block-based motion compensation is then used to reduce the temporal redundancy.


Motion compensation is used for causal prediction of the current picture from a previous picture, for non-causal prediction of the current picture from a future picture, or for interpolative prediction from past and future pictures. The difference signal, the prediction error, is further compressed using the discrete cosine transform (DCT) to remove spatial correlation and is then quantized. Finally, the motion vectors are combined with the DCT information and coded using variable length codes.
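A minimal sketch of this transform-and-quantize step on one 8x8 block, using dct2/idct2 from the Image Processing Toolbox; the step size of 16 is an illustrative assumption:

```matlab
% DCT coding of one 8x8 prediction-error block: transform, quantize,
% then reconstruct as a decoder would, to see the quantization loss.
blk  = rand(8) * 255;            % stand-in for an 8x8 block
coef = dct2(blk - 128);          % 2-D DCT of the level-shifted block
q = 16;                          % assumed uniform quantizer step
qcoef = round(coef / q);         % quantization discards fine detail
recon = idct2(qcoef * q) + 128;  % decoder-side reconstruction
fprintf('Max reconstruction error: %.2f\n', max(abs(recon(:) - blk(:))));
```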

MPEG-1 is a standard in four parts:

Part 1 addresses the problem of combining one or more data streams from the video and audio parts of the MPEG-1 standard with timing information to form a single stream. This is an important function because, once combined into a single stream, the data are in a form well suited to digital storage or transmission.

Part 2 specifies a coded representation that can be used for compressing video sequences, both 625-line and 525-line, to bit rates around 1.5 Mbit/s. Part 2 was developed to operate principally from storage media offering a continuous transfer rate of about 1.5 Mbit/s.

Nevertheless, it can be used more widely than this because the approach taken is generic. The techniques used to achieve a high compression ratio are those described above: selection of an appropriate spatial resolution, block-based motion compensation for causal, non-causal and interpolative prediction, DCT coding and quantization of the prediction error, and variable length coding of the motion vectors and DCT information.

Part 3 specifies a coded representation that can be used for compressing audio sequences, both mono and stereo. Input audio samples are fed into the encoder. The mapping creates a filtered and subsampled representation of the input audio stream. A psychoacoustic model creates a set of data to control the quantizer and coding. The quantizer and coding block creates


    a set of coding symbols from the mapped input samples. The block 'frame packing' assembles

    the actual bit stream from the output data of the other blocks, and adds other information (e.g.

    error correction) if necessary.

Part 4 specifies how tests can be designed to verify whether bitstreams and decoders meet the requirements specified in Parts 1, 2 and 3 of the MPEG-1 standard. These tests can be used by:

manufacturers of encoders, and their customers, to verify whether the encoder produces valid bitstreams
manufacturers of decoders, and their customers, to verify whether the decoder meets the requirements specified in Parts 1, 2 and 3 of the standard for the claimed decoder capabilities
applications, to verify whether the characteristics of a given bitstream meet the application requirements, for example whether the size of the coded picture does not exceed the maximum value allowed for the application

    2.1.5 MPEG-2

    MPEG-2 is an extension of the MPEG-1 international standard for digital compression of audio

    and video signals. MPEG-2 is directed at broadcast formats at higher data rates; it provides

    extra algorithmic 'tools' for efficiently coding interlaced video, supports a wide range of bit

    rates and provides for multichannel surround sound coding.

    1. INTRODUCTION:

    The MPEG-2 standard [2] is capable of coding standard-definition television at bit rates from

    about 3-15 Mbit/s and high-definition television at 15-30 Mbit/s. MPEG-2 extends the stereo

    audio capabilities of MPEG-1 to multi-channel surround sound coding. MPEG-2 decoders will

    also decode MPEG-1 bit streams.

    2. VIDEO FUNDAMENTALS


    Television services in Europe currently broadcast video at a frame rate of 25 Hz. Each frame

    consists of two interlaced fields, giving a field rate of 50 Hz. The first field of each frame

    contains only the odd numbered lines of the frame (numbering the top frame line as line 1).

    The second field contains only the even numbered lines of the frame and is sampled in the

    video camera 20 ms after the first field. It is important to note that one interlaced frame

    contains fields from two instants in time. American television is similarly interlaced but with a

    frame rate of just less than 30 Hz.

The red, green and blue (RGB) signals coming from a color television camera can be equivalently expressed as luminance (Y) and chrominance (UV) components. The chrominance bandwidth may be reduced relative to the luminance without significantly affecting the picture quality. For standard-definition video, CCIR Recommendation 601 [3] defines how the component (YUV) video signals can be sampled and digitized to form discrete pixels. The terms 4:2:2 and 4:2:0 are often used to describe the sampling structure of the digital picture. 4:2:2 means the chrominance is horizontally subsampled by a factor of two relative to the luminance; 4:2:0 means the chrominance is horizontally and vertically subsampled by a factor of two relative to the luminance.
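As a sketch, 4:2:0-style subsampling can be mimicked in MATLAB (rgb2ycbcr is in the Image Processing Toolbox; 'frame.png' is a placeholder input):

```matlab
% Convert an RGB frame to Y'CbCr and keep chrominance at quarter resolution.
rgb = imread('frame.png');        % hypothetical RGB input frame
ycc = rgb2ycbcr(rgb);
Y  = ycc(:, :, 1);                % luminance kept at full resolution
Cb = ycc(1:2:end, 1:2:end, 2);    % chrominance subsampled by two horizontally
Cr = ycc(1:2:end, 1:2:end, 3);    % and by two vertically (4:2:0)
fprintf('Y: %dx%d, Cb/Cr: %dx%d\n', size(Y,1), size(Y,2), size(Cb,1), size(Cb,2));
```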

3. BIT RATE REDUCTION PRINCIPLES: A bit rate reduction system operates by removing redundant information from the signal at the coder prior to transmission and re-inserting it at the decoder. A coder and decoder pair is referred to as a 'codec'. In video signals,

    two distinct kinds of redundancy can be identified.

    Spatial and temporal redundancy: Pixel values are not independent, but are correlated with

    their neighbors both within the same frame and across frames. So, to some extent, the value of

    a pixel is predictable given the values of neighboring pixels.

Psychovisual redundancy:

    The human eye has a limited response to fine spatial detail [4], and is less sensitive to detail

    near object edges or around shot-changes. Consequently, controlled impairments introduced

    into the decoded picture by the bit rate reduction process should not be visible to a human

    observer.


    Two key techniques employed in an MPEG codec are intra-frame Discrete Cosine Transform

    (DCT) coding and motion-compensated inter-frame prediction. These techniques have been

    successfully applied to video bit rate reduction prior to MPEG, notably for 625-line video

    contribution standards at 34 Mbit/s and video conference systems at bit rates below 2 Mbit/s.

    4. MPEG-2 DETAILS

    In an MPEG-2 system, the DCT and motion-compensated interframe prediction are combined.

    The coder subtracts the motion-compensated prediction from the source picture to form a

    'prediction error' picture. The prediction error is transformed with the DCT, the coefficients are

    quantized and these quantized values coded using a VLC. The coded luminance and

    chrominance prediction error is combined with 'side information' required by the decoder, such

    as motion vectors and synchronizing information, and formed into a bit stream for

    transmission. In the decoder, the quantized DCT coefficients are reconstructed and inverse

    transformed to produce the prediction error. This is added to the motion-compensated

    prediction generated from previously decoded pictures to produce the decoded output.

    Picture types

    In MPEG-2, three 'picture types' are defined. The picture type defines which prediction modes

may be used to code each block.

    'Intra' pictures (I-pictures) are coded without reference to other pictures. Moderate compression

    is achieved by reducing spatial redundancy, but not temporal redundancy. They can be used

    periodically to provide access points in the bitstream where decoding can begin.

    'Predictive' pictures (P-pictures) can use the previous I- or P-picture for motion compensation

    and may be used as a reference for further prediction. Each block in a P-picture can either be

    predicted or intra-coded. By reducing spatial and temporal redundancy, P-pictures offer

    increased compression compared to I-pictures.


    'Bidirectionally-predictive' pictures (B-pictures) can use the previous and next I- or P-pictures

    for motion-compensation, and offer the highest degree of compression. Each block in a B-

    picture can be forward, backward or bidirectionally predicted or intra-coded. To enable

    backward prediction from a future frame, the coder reorders the pictures from natural 'display'

    order to 'bitstream' order so that the B-picture is transmitted after the previous and next pictures

    it references. This introduces a reordering delay dependent on the number of consecutive B-

    pictures.

Buffer control: By removing much of the redundancy from the source images, the coder outputs a variable bit rate. The bit rate depends on the complexity and predictability of the source picture and the effectiveness of the motion-compensated prediction.

MPEG-2 is a standard currently in six parts:

Part 1 of MPEG-2 addresses the combining of one or more elementary streams of video and audio, as well as other data, into single or multiple streams suitable for storage or transmission. This is specified in two forms: the Program Stream and the Transport Stream. Each is optimized for a different set of applications. The Program Stream is similar to the MPEG-1 Systems Multiplex. It results from combining one or more Packetized Elementary Streams (PES), which have a common time base, into a single stream. The Program Stream is designed for use in relatively error-free environments and is suitable for applications which may involve software processing. Program Stream packets may be of variable and relatively great length.

The Transport Stream combines one or more Packetized Elementary Streams (PES) with one or more independent time bases into a single stream. Elementary streams sharing a common time base form a program. The Transport Stream is designed for use in environments where errors are likely, such as storage or transmission in lossy or noisy media. Transport Stream packets are 188 bytes long.


Part 2 of MPEG-2 builds on the powerful video compression capabilities of the MPEG-1 standard to offer a wide range of coding tools. These have been grouped in profiles to offer different functionalities. Since the final approval of MPEG-2 Video in November 1994, one additional profile has been developed. It uses existing coding tools of MPEG-2 Video but can deal with pictures having a colour resolution of 4:2:2 and a higher bit rate. Even though MPEG-2 Video was not developed with studio applications in mind, a set of comparison tests carried out by MPEG confirmed that MPEG-2 Video was at least as good as, and in many cases better than, standards or specifications developed for high-bit-rate or studio applications.

The Multiview Profile (MVP) is an additional profile currently being developed. By using existing MPEG-2 Video coding tools it is possible to encode in an efficient way two video sequences issued from two cameras shooting the same scene with a small angle between them.

Part 3 of MPEG-2, Digital Storage Media Command and Control (DSM-CC), is the specification of a set of protocols which provides the control functions and operations specific to managing MPEG-1 and MPEG-2 bitstreams. These protocols may be used to support applications in both stand-alone and heterogeneous network environments. In the DSM-CC model, a stream is sourced by a Server and delivered to a Client. Both the Server and the Client are considered to be Users of the DSM-CC network. DSM-CC defines a logical entity called the Session and Resource Manager (SRM) which provides a (logically) centralized management of DSM-CC Sessions and Resources.

Part 4 of MPEG-2 is the specification of a multichannel audio coding algorithm not constrained to be backwards-compatible with MPEG-1 Audio. The standard was approved in April 1997.


Part 5 of MPEG-2 was originally planned to be the coding of video when input samples are 10 bits. Work on this part was discontinued when it became apparent that there was insufficient interest from industry in such a standard.

Part 6 of MPEG-2 is the specification of the Real-time Interface (RTI) to Transport Stream decoders, which may be utilised for adaptation to all appropriate networks carrying Transport Streams.

2.1.6 MPEG-4

Introduction

The creation of the MPEG-4 specification arose as experts wanted a higher compression rate than MPEG-2, but one which also worked well at low bit rates. Discussions began at the end of 1992 and work on the standards started in July 1993.

MPEG-4 provides a standardized method of:

1. Audio-visual coding at very low bit rates
2. Describing audio-visual objects in a scene
3. Multiplexing and synchronizing the information associated with the objects
4. Interacting with the audio-visual scene that is received by the end user

Elementary Streams:

Each encoded media object has its own Elementary Stream (ES), which is sent to the decoder and decoded individually, before composition. The following streams are created in MPEG-4:

1. Scene Description Stream
2. Object Description Stream
3. Visual Stream
4. Audio Stream

When the data has been encoded, the data streams can be transmitted or stored separately and need to be composed at the receiving end. Media objects are organized in a hierarchical manner to form audio-visual scenes. Due to this organization, each media object can be described or encoded independently of other objects in the scene, e.g. the background.

MPEG-4/BIFS:

Allows users to change their viewpoint in a 3D scene or to interact with media objects.
Allows different objects in the same scene to be coded at different levels of quality.

MPEG-4 Systems also addresses:

1. A standard file format to enable the exchange and authoring of MPEG-4 content
2. Interactivity (both client-side and server-side)
3. MPEG-J (MPEG-4 & Java)
4. The FlexMux tool, which allows for the interleaving of multiple streams into a single stream

Profiles have been developed to create conformance points for MPEG-4 tools and toolsets, so that interoperability of MPEG-4 products with the same Profiles and Levels can be assured.

A Profile is a subset of the MPEG-4 Systems, Visual or Audio tool set and is used for specific applications. It limits the tool set a decoder has to implement, since many applications only need a portion of the MPEG-4 toolset. Profiles specified in the MPEG-4 standard include:

a. Visual Profile
b. Natural Profile
c. Synthetic & Natural/Synthetic Hybrid Profiles
d. Audio Profile
e. Graphic Profile
f. Scene Graph Profile


The Systems part of MPEG-4 addresses the description of the relationship between the audio-visual components that constitute a scene. The relationship is described at two main levels.

The Binary Format for Scenes (BIFS) describes the spatio-temporal arrangements of the objects in the scene. Viewers may have the possibility of interacting with the objects, e.g. by rearranging them in the scene or by changing their own point of view in a 3D virtual environment. The scene description provides a rich set of nodes for 2-D and 3-D composition operators and graphics primitives.

At a lower level, Object Descriptors (ODs) define the relationship between the Elementary Streams pertinent to each object (e.g. the audio and the video stream of a participant in a videoconference). ODs also provide additional information.

2.1.7 MPEG-7

Introduction

The MPEG standards are an evolving set of standards for video and audio compression. MPEG-7 technology covers the most recent developments in multimedia search and retrieval, designed to standardize the description of multimedia content supporting a wide range of applications including DVD, CD and HDTV.

MPEG-7 is a seven-part specification, formally entitled Multimedia Content Description Interface. It provides standardized tools for describing multimedia content, which will enable searching, filtering and browsing of multimedia content.

    ISO 15938-1 Systems

    MPEG-7 descriptions exist in two formats:


Textual: XML, which allows editing, searching and filtering of a multimedia description. The description can be located anywhere, not necessarily with the content.

Binary format: suitable for storing, transmitting and streaming delivery of the multimedia description.

MPEG-7 Systems provides the tools for:

a. The preparation of binary coded representations of MPEG-7 descriptions
b. Efficient storage and transmission
c. Transmission techniques (both textual and binary formats)
d. Multiplexing of descriptions
e. Synchronization of descriptions with content
f. Intellectual property management and protection
g. Terminal architecture
h. Normative interfaces

Descriptions may be represented in two forms:

Textual (XML)
Binary (BiM, Binary format for Metadata). A binary coded representation is useful for efficient storage and transmission of content.

MPEG-7 data is obtained from transport or storage and handed to the delivery layer. This allows extraction of elementary streams (consisting of individually accessible chunks called access units) by undoing the transport/storage-specific framing and multiplexing, and retains the timing information needed for synchronisation.

Elementary streams are forwarded to the compression layer, where the schema streams (schemas describing the structure of MPEG-7 data) and the partial or full description streams (streams describing the content) are decoded.


MPEG-7 tools

MPEG-7 uses the following tools:

Descriptor (D): A representation of a feature, defined syntactically and semantically. A single object may be described by several descriptors.

Description Schemes (DS): Specify the structure and semantics of the relations between their components; these components can be descriptors (D) or description schemes (DS).

Description Definition Language (DDL): Based on the XML language, it is used to define the structural relations between descriptors. It allows the creation and modification of description schemes and also the creation of new descriptors (D).

System tools: These tools deal with binarization, synchronization, transport and storage of descriptors.

2.1.8 H.261

H.261 is an ITU-T video coding standard, ratified in November 1988. It was originally designed for transmission over ISDN lines, on which data rates are multiples of 64 kbit/s. It is one member of the H.26x family of video coding standards in the domain of the ITU-T Video Coding Experts Group (VCEG). The coding algorithm was designed to be able to operate at video bit rates between 40 kbit/s and 2 Mbit/s. The standard supports two video frame sizes: CIF (352x288 luma with 176x144 chroma) and QCIF (176x144 with 88x72 chroma) using a 4:2:0 sampling scheme. It also has a backward-compatible trick for sending still picture graphics with 704x576 luma resolution and 352x288 chroma resolution (added in a later revision in 1993).

The main steps followed by this standard are:


1. Loop filter

The prediction process may be modified by a two-dimensional spatial filter (FIL) which operates on pixels within a predicted 8 by 8 block. The filter is separable into one-dimensional horizontal and vertical functions. Both are non-recursive with coefficients of 1/4, 1/2, 1/4, except at block edges where one of the taps would fall outside the block. In such cases the 1-D filter is changed to have coefficients of 0, 1, 0. Full arithmetic precision is retained, with rounding to 8-bit integer values at the 2-D filter output. Values whose fractional part is one half are rounded up. The filter is switched on or off for all six blocks in a macroblock according to the macroblock type.
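A minimal MATLAB sketch of this separable filter on one 8x8 block (the random block is a stand-in for a predicted block):

```matlab
% H.261-style loop filter: [1 2 1]/4 taps inside the block, identity (0,1,0)
% at block edges, applied horizontally then vertically.
blk = rand(8) * 255;                                            % stand-in block
tmp = blk;
tmp(:, 2:7) = (blk(:, 1:6) + 2*blk(:, 2:7) + blk(:, 3:8)) / 4;  % horizontal pass
out = tmp;
out(2:7, :) = (tmp(1:6, :) + 2*tmp(2:7, :) + tmp(3:8, :)) / 4;  % vertical pass
out = round(out);  % for positive values round() rounds halves up, as required
```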

2. Transformer

Transmitted blocks are first processed by a separable two-dimensional discrete cosine transform of size 8 by 8. The output from the inverse transform ranges from -256 to +255 after clipping, to be represented with 9 bits. The transfer function of the inverse transform is given by

f(x, y) = (1/4) * sum_{u=0..7} sum_{v=0..7} C(u) C(v) F(u, v) cos((2x+1)u*pi/16) cos((2y+1)v*pi/16),

with C(0) = 1/sqrt(2) and C(u) = 1 for u > 0 (and likewise for C(v)).

NOTE: Within the block being transformed, x = 0 and y = 0 refer to the pel nearest the left and top edges of the picture, respectively.

The arithmetic procedures for computing the transforms are not defined, but the inverse transform should meet the specified error tolerance.

3. Quantization

The number of quantizers is 1 for the INTRA dc coefficient and 31 for all other coefficients. Within a macroblock the same quantizer is used for all coefficients except the INTRA dc one. The decision levels are not defined. The INTRA dc coefficient is nominally the transform value linearly quantized with a step size of 8 and no dead-zone. Each of the other 31 quantizers


is also nominally linear, but with a central dead-zone around zero and with a step size of an even value in the range 2 to 62.
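A hedged MATLAB sketch of this two-mode quantizer; the step size of 16 and the input block are illustrative, and the dead-zone is modeled simply by truncation since the decision levels are not normative:

```matlab
% H.261-style quantization of one 8x8 coefficient block: the INTRA dc
% coefficient uses step 8 with no dead-zone; the others use an even step
% with a central dead-zone around zero.
coef = dct2(rand(8) * 255 - 128);   % stand-in transform coefficients
step = 16;                          % assumed even step in [2, 62]
q = fix(coef / step);               % fix() truncates toward zero -> dead-zone
q(1, 1) = round(coef(1, 1) / 8);    % INTRA dc: step 8, no dead-zone
```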

4. Clipping of the reconstructed picture

To prevent quantization distortion of transform coefficient amplitudes causing arithmetic overflow in the encoder and decoder loops, clipping functions are inserted. The clipping function is applied to the reconstructed picture, which is formed by summing the prediction and the prediction error as modified by the coding process. This clipper operates on resulting pel values less than 0 or greater than 255, changing them to 0 and 255, respectively.
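In MATLAB this clip is essentially a one-liner (the prediction and prediction-error values here are stand-in data):

```matlab
% Clip reconstructed pel values to the 8-bit range [0, 255].
prediction = rand(8) * 255; predError = randn(8) * 20;  % stand-in data
recon = min(max(prediction + predError, 0), 255);
```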

Different video compressor tools have already been developed:

AVI Compressor
MP4 Compressor
MPEG Compressor
3GP Compressor
YouTube Compressor
iPod Compressor
Flash Video Compressor
QuickTime Compressor
WMV Compressor
MKV Compressor
VOB Compressor
DVD Compressor


2.2 Methodology Adopted

The methodology adopted for this project follows the H.261 coding algorithm described in Section 2.1.8: motion-compensated prediction modified by the loop filter, an 8 by 8 discrete cosine transform of the transmitted blocks, quantization with a step size of 8 (no dead-zone) for the INTRA dc coefficient and even step sizes from 2 to 62 with a central dead-zone for all other coefficients, and clipping of the reconstructed picture to the range 0 to 255.


Chapter 3. Project Estimation and Implementation Plan

3.1 Cost and Benefit Analysis

3.1.1 ECONOMICAL

Economic analysis is the most frequently used method for evaluating a candidate system. More commonly known as cost-benefit analysis, the procedure is to determine the benefits and savings that are expected from the candidate system and compare them with the costs. If the benefits outweigh the costs, the decision is made to proceed with design and implementation; otherwise further justification or alterations are made to the proposed system.

This project does not have many hardware requirements, so installing the software costs little on the whole.

From the point of view of economy, manual handling of hardware components is cheaper than computerized systems, and this approach normally works well in an ordinary organization. The major problem starts when the number of hardware components grows with time. A manual system needs various registers/books to maintain the daily complaint and hardware entries. In case of any misplacement of a hardware component, the concerned registers have to be searched to verify the status of that component. It is a very cumbersome job to maintain all of this manually, whereas it is very easy to maintain in the proposed system.

3.1.2 COST ANALYSIS

The cost to conduct the investigation was negligible, as the center manager and teachers of the center provided most of the information.

The cost of the essential hardware and software requirements is not very high. Moreover, hardware like a Pentium Core PC and software like MATLAB are easily available in the market.

3.1.3 BENEFITS AND SAVINGS

The cost of maintenance of the proposed system is negligible.
Money is saved as paperwork is minimized.
Records are easily entered and retrieved.
Time is saved as all the work can be done with a simple mouse click.
The proposed system is fully automated and hence easy to use.

Since the benefits outweigh the costs, the project is economically feasible.


3.2 Schedule Estimate

This is the table of activities and their estimated time durations used to accomplish the project.

Activity                                      Completion Date   Duration (days)   Effort (man-hours)
A) Introduction                               20 AUG 2010       20                250
B) Problem Analysis                           15 SEP 2010       25                370
C) Project Estimation & Implementation Plan   20 OCT 2010       35                520
D) Research Design                            10 NOV 2010       20                300
E) System Interface Design                    10 DEC 2010       30                450
F) Coding                                     20 FEB 2011       30                600
G) Experiments Specification                  10 MAR 2011       20                300
H) Conclusions                                25 MAR 2011       15                20
I) User Manual                                10 APR 2011       15                20

3.3 Gantt Chart

A Gantt chart is a horizontal bar chart developed as a production control tool in 1917 by Henry L. Gantt, an American engineer and social scientist. Frequently used in project management, a Gantt chart provides a graphical illustration of a schedule that helps to plan, coordinate, and track specific tasks in a project.

Gantt charts may be simple versions created on graph paper or more complex automated versions created using project management applications such as Microsoft Project or Excel.

A Gantt chart is constructed with a horizontal axis representing the total time span of the project, broken down into increments (for example, days, weeks, or months), and a vertical axis representing the tasks that make up the project (for example, if the project is outfitting your computer with new software, the major tasks involved might be: conduct research, choose software, install software). Horizontal bars of varying lengths represent the sequences, timing, and time span for each task. Using the same example, you would put "conduct research" at the top of the vertical axis and draw a bar on the graph that represents the amount of time you expect to spend on the research, and then enter the other tasks below the first one with representative bars at the points in time when you expect to undertake them. The bar spans may overlap, as, for example, you may conduct research and choose software during the same time span. As the project progresses, secondary bars, arrowheads, or darkened bars may be added to indicate completed tasks, or the portions of tasks that have been completed. A vertical line is used to represent the report date.
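A small MATLAB sketch of how the SDLC chart below could be drawn with stacked horizontal bars; the start weeks and durations are illustrative guesses read off the chart's 0-35 week axis, not figures from the report:

```matlab
% Gantt chart via stacked barh: the first (invisible) segment offsets each bar.
tasks = {'Analysis', 'Documentation', 'Design', 'Coding', 'Testing'};
startWk = [0 5 10 18 26];            % assumed start weeks
durWk   = [6 8 10 10 9];             % assumed durations in weeks
h = barh([startWk(:) durWk(:)], 'stacked');
set(h(1), 'FaceColor', 'none', 'EdgeColor', 'none');  % hide the offset segments
set(gca, 'YTick', 1:numel(tasks), 'YTickLabel', tasks, 'YDir', 'reverse');
xlabel('Weeks'); title('Scheduling of SDLC');
```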

Scheduling of SDLC (GANTT CHART)

[Gantt chart: Analysis, Documentation, Design, Coding and Testing plotted over a 0-35 week scale.]

    References


[1] HUFFMAN, D. A. (1952). A method for the construction of minimum-redundancy codes. Proceedings of the Institute of Radio Engineers, 40, pp. 1098-1101.

[2] CAPON, J. (1959). A probabilistic model for run-length coding of pictures. IRE Trans. on Information Theory, IT-5(4), pp. 157-163.

[3] APOSTOLOPOULOS, J. G. (2004). Video Compression. Streaming Media Systems Group.

[4] The Moving Picture Experts Group home page. (3 Feb. 2006)

[5] CLARKE, R. J. (1995). Digital compression of still images and video. London: Academic Press.

[6] http://www.irf.uka.de/seminare/redundanz/vortrag15/ (3 Feb. 2006)

[7] PEREIRA, F. The MPEG-4 Standard: Evolution or Revolution?

[8] MANNING, C. The digital video site.

[9] SEFERIDIS, V. E., GHANBARI, M. (1993). General approach to block-matching motion estimation. Optical Engineering, 32, pp. 1464-1474.

[10] GHARAVI, H., MILLS, M. (1990). Block matching motion estimation algorithms - new results. IEEE Transactions on Circuits and Systems, 37, pp. 649-651.

[11] CHOI, W. Y., PARK, R. H. (1989). Motion vector coding with conditional transmission. Signal Processing, 18, pp. 259-267.

[12] Institut für Informatik, Universität Karlsruhe.