Venkata Lakshmi 08011012170 Sep Audio Compression

8/3/2019 Venkata Lakshmi 08011012170 Sep Audio Compression

1/8

NAAC ACCREDITED

DEPARTMENT OF COMPUTER SCIENCE ANDENGINEERING

XCS703 GRAPHICS AND MULTIMEDIA

SEP REPORT

AUDIO COMPRESSION

Submitted by

VENKATA LAKSHMI.K

08011012170

IV CSE C

1

DEPARTMENT OF COMPUTERSCIENCE & ENGINEERING

Periyar Nagar, Vallam Thanjavur - 613 403, Tamil Nadu, IndiaPhone: +91 - 4362 - 264600 Fax: +91- 4362 - 264660Email: [email protected] Web: www. pmu.edu


2/8

AUDIOCOMPRESSION

INTRODUCTION

What is Compression?

o It is a process of deriving more compact (i.e., smaller) representations of data

Goal of Compression

o Significant reduction in the data size to reduce the storage/bandwidth

requirements

Constraints on Compression

o Perfect or near-perfect reconstruction (lossless/lossy)

Strategies for Compression

o Reducing redundancies

o Exploiting the characteristics of human vision

MOTIVATION FOR COMPRESSION

Massive Amounts of Data Involved in Storage/Transmission of Text, Sound, Images,

and Videos in Many Applications

Applications

o Medical imaging

o Teleradiology

o Space/Satellite imaging

o Multimedia

o Digital Video: entertainment, home use

o Digital photography

Concrete Figures

o A typical hospital generates close to 1 terabits per year

o NASA's EOS will generate 1 terabytes per day

o One 2-hour video = 1.3 terabits

o Video transmission speed = 180Mb/sec

o With MPEG1 (1.5Mb/s), need compression ratio of 120

o With MPEG2 (4-10Mb/s), need comp. ratio of 18-45

BASIC DEFINITIONS IN COMPRESSION/CODINGCoding: Compression

2


3/8

Codeword: A binary string representing either the whole coded data or one coded data

symbol

Coded Bitstream: the binary string representing the whole coded data.

Lossless Compression: 100% accurate reconstruction of the original data

Lossy Compression: The reconstruction involves errors which may or may not be

tolerable Bit Rate: Average number of bits per original data element after compression

BASIC SOUND DEFINITIONS

A sound/audio (digital) signal is a sequence of values, called samples, where every

sample is the intensity of the recorded sound at the corresponding moment in time.

The sampling rate of a sound is the number of samples per second.

The CD quality sampling rate is 44.1K samples per second (or 44.1 KHz), usually at

16 bits per sample, though 24 bits per sample is now common.

In digital sound used for miniDV, digital TV, and DVD, the sampling rate is 48 KHz.

In DVD-Audio and in Blu-ray audio tracks, the sampling rate is 96 KHz or 192 KHz.

When the sampling rate is infinity, the signal becomes what is called "analog signal".

When signal is captured as an analog signal, it can be converted to a digital signal by

an analog-to-digital converter (also called A/D converter or digitizer)

If a digital signal is fed into a digital-to-analog converter (also called D/A converter),

the output is obviously an analog signal.

Modulators are D/A converters, and demodulators are A/D converters. So, a modem is

both an A/D and D/A converter.

STRATEGIES FOR COMPRESSION: REDUNDANCY REDUCTION

Symbol-Level Representation Redundancy

Different symbols occur with different frequencies

Variable-length codes vs. fixed-length codes

Frequent symbols are better coded with short codes

Infrequent symbols are coded with long codes

Example Techniques: Huffman Coding

3


4/8

Block-Level Representation Redundancy

Different blocks of data occur with varying frequencies

Better then to code blocks than individual symbols

The block size can be fixed or variable

The block-code size can be fixed or variable

Frequent blocks are better coded with short codes

Example techniques: Block-oriented Huffman, Run-Length Encoding (RLE),

Arithmetic Coding, Lempil-Ziv (LZ)

Inter-Pixel Spatial Redundancy

Neighboring pixels tend to have similar values

Neighboring pixels tend to exhibit high correlations

Techniques: Decorrelation and/or processing in the frequency domain

Spatial decorrelation converts correlations into symbol- or block-redundancy

Frequency domain processing addresses visual redundancy (see the next slide)

Inter-Pixel Temporal Redundancy (in Video)

Often, the majority of corresponding pixels in successive video-frames are identical

over long spans of frames

Due to motion, blocks of pixels change in position but not in values between

successive frames

Thus, block-oriented motion-compensated redundancy reduction techniques are used

for video compression.

Visual Redundancy

The human visual system (HVS) has certain limitations that make many image

contents invisible. Those contents, termed visually redundant, are the target of

removal in lossy compression.

In fact, the HVS can see within a small range of spatial frequencies: 1-60 cycles/arc-

degree

4


5/8

Approach for reducing visual redundancy in lossy compression

1. Transform: Convert the data to the frequency domain

2. Quantize: Under-represent the high frequencies

3. Losslessly compress the quantized data

Audiocompression

Audiocompression is a form of data compression designed to reduce the transmission bandwidth

requirement of digital audio streams and the storage size ofaudio files. Audio compression algorithms

are implemented in computer software as audio codecs. Generic data compression algorithms perform

poorly with audio data, seldom reducing data size much below 87% from the original, [citation needed] and

are not designed for use in real time applications. Consequently, specifically optimized audio losslessand lossy algorithms have been created. Lossy algorithms provide greater compression rates and are

used in mainstream consumer audio devices.In both lossy and lossless compression, information

redundancy is reduced, using methods such as coding, pattern recognition and linear prediction to

reduce the amount of information used to represent the uncompressed data.

Audio compression can mean two things:

1. Audio data compression - in which the amount of data in a recorded waveform is reduced

for transmission. This is used in CD and MP3 encoding, internet radio, and the like.

2. Audio level compression - in which the dynamic range (difference between loud and quiet)

of an audio waveform is reduced. This is used in guitar effects racks, recording studios, etc.

Audio compression consist of analog signal of different frequencies

The audio signals are converted to digital form and then processed,stored or

transmitted

Compression allows storing the audio in significantly less space and if there is need to

transmit the signal

ADAPTIVE DIFFERENTIAL PULSE CODE MODULATION

ADPCM provides a form of compression by encoding and storing in the data stream

only the difference between the values of successive samples.

This provides a wide dynamic range while reducing the size of the data stream.

5
http://en.wikipedia.org/wiki/Data_compressionhttp://en.wikipedia.org/wiki/Audio_streamhttp://en.wikipedia.org/wiki/Audio_filehttp://en.wikipedia.org/wiki/Algorithmhttp://en.wikipedia.org/wiki/Computer_softwarehttp://en.wikipedia.org/wiki/Audio_codechttp://en.wikipedia.org/wiki/Wikipedia:Citation_neededhttp://en.wikipedia.org/wiki/Wikipedia:Citation_neededhttp://en.wikipedia.org/wiki/Wikipedia:Citation_neededhttp://en.wikipedia.org/wiki/Lossless_data_compressionhttp://en.wikipedia.org/wiki/Lossy_data_compressionhttp://en.wikipedia.org/wiki/Redundancy_(information_theory)http://en.wikipedia.org/wiki/Redundancy_(information_theory)http://en.wikipedia.org/wiki/Coding_theoryhttp://en.wikipedia.org/wiki/Pattern_recognitionhttp://en.wikipedia.org/wiki/Linear_predictionhttp://en.wikipedia.org/wiki/Audio_data_compressionhttp://en.wikipedia.org/wiki/Audio_level_compressionhttp://en.wikipedia.org/wiki/Dynamic_rangehttp://en.wikipedia.org/wiki/Data_compressionhttp://en.wikipedia.org/wiki/Audio_streamhttp://en.wikipedia.org/wiki/Audio_filehttp://en.wikipedia.org/wiki/Algorithmhttp://en.wikipedia.org/wiki/Computer_softwarehttp://en.wikipedia.org/wiki/Audio_codechttp://en.wikipedia.org/wiki/Wikipedia:Citation_neededhttp://en.wikipedia.org/wiki/Lossless_data_compressionhttp://en.wikipedia.org/wiki/Lossy_data_compressionhttp://en.wikipedia.org/wiki/Redundancy_(information_theory)http://en.wikipedia.org/wiki/Redundancy_(information_theory)http://en.wikipedia.org/wiki/Coding_theoryhttp://en.wikipedia.org/wiki/Pattern_recognitionhttp://en.wikipedia.org/wiki/Linear_predictionhttp://en.wikipedia.org/wiki/Audio_data_compressionhttp://en.wikipedia.org/wiki/Audio_level_compressionhttp://en.wikipedia.org/wiki/Dynamic_range


6/8

For instance,normal speed has long sections of silence that can be compressed because

there are no changes across a number of samples.

Audio compression is a form ofdata compression designed to reduce the size ofaudio files.

Audio compression algorithms are implemented in computer software as audio codecs.

Generic data compression algorithms work with audio data but the compression ratios are low

(around 50-60% of original size) and they do not work in real time and are therefore not

practical. Specific audio "lossless" and "lossy" algorithms have been created. Lossy

algorithms provide far greater compression ratios and are used in mainstream consumer audio

devices.

As with image compression, both lossy and lossless compression algorithms are used in audio

compression, lossy being the most practical for everyday use. In both lossy and lossless

compression, information redundancy is reduced, using methods such as coding, pattern

recognition and linear prediction to reduce the amount of information used to describe the

data.

For example, suppose you wanted to record twenty house numbers along one side of a street,

each of which goes up by 2. If the first address was 14461, or five digits, the uncompressed

stream would require 20 times 5 bytes, or 100 bytes, to store. You could recode that to take

advantage of the repetition and simply say begin at 14461, increase by 2, repeat 19 times.Now the data are losslessly captured in a smaller space. In practice the pattern recognition for

lossy and lossless compression is far more complex than that.

The trade-off of slightly reduced audio quality is clearly outweighed for most practical audio

applications where users cannot perceive any difference and space requirements are

substantially reduced. For example, on one CD, one can fit an hour of high fidelity music,

less than 2 hours of music compressed losslessly, or 7 hours of music compressed in MP3

format.

Lossless audio codecs have no quality issues, so the usability can be estimated by

Speed of compression and decompression

Degree of compression

Software and hardware support

Robustness and error correction

Lossy audio compression is used in an extremely wide range of applications. In addition to

the direct applications (mp3 players or computers), digitally compressed audio streams areused in most video DVDs; digital television; streaming media on the internet; satellite and

6
http://en.wikipedia.org/wiki/Data_compressionhttp://en.wikipedia.org/wiki/Audio_filehttp://en.wikipedia.org/wiki/Algorithmhttp://en.wikipedia.org/wiki/Codechttp://en.wikipedia.org/wiki/Data_compressionhttp://en.wikipedia.org/wiki/Lossless_data_compressionhttp://en.wikipedia.org/wiki/Lossy_data_compressionhttp://en.wikipedia.org/wiki/Image_compressionhttp://en.wikipedia.org/wiki/Internethttp://en.wikipedia.org/wiki/Data_compressionhttp://en.wikipedia.org/wiki/Audio_filehttp://en.wikipedia.org/wiki/Algorithmhttp://en.wikipedia.org/wiki/Codechttp://en.wikipedia.org/wiki/Data_compressionhttp://en.wikipedia.org/wiki/Lossless_data_compressionhttp://en.wikipedia.org/wiki/Lossy_data_compressionhttp://en.wikipedia.org/wiki/Image_compressionhttp://en.wikipedia.org/wiki/Internet


7/8

cable radio; and increasingly in terrestrial radio broadcasts. Lossy compression typically

achieves far greater compression than lossless compression (data of 5 percent to 20 percent of

the original stream, rather than 50 percent to 60 percent), by discarding less-critical data.

The innovation of lossy audio compression was to usepsychoacoustics to recognize that not

all data in an audio stream can be perceived by the human auditory system. Most lossy

compression reduces perceptual redundancy by first identifying sounds which are considered

perceptually irrelevant, that is, sounds that are very hard to hear. Typical examples include

high frequencies, or sounds that occur at the same time as other louder sounds. Those sounds

are coded with decreased accuracy or not coded at all.

7
http://en.wikipedia.org/wiki/Psychoacousticshttp://en.wikipedia.org/wiki/Psychoacoustics


8/8

References

1) http://en.wikipedia.org/wiki/Audio_compression_(data)

2) http://www.bdti.com/MyBDTI/pubs/audio_icspat00.pdf

3) www.monkeysaudio.com/

4) http://computing.unn.ac.uk/staff/cgpv1/cm613/lecture_material.htm

5) http://computing.unn.ac.uk/staff/cgpv1/cm613/files/Audio%20Compression.pdf

6) http://computing.unn.ac.uk/staff/cgpv1/cm613/files/mpegaud.pdf

7) http://www.bdti.com/MyBDTI/pubs/audio_icspat00.pdf

8
http://en.wikipedia.org/wiki/Audio_compression_(data)http://www.bdti.com/MyBDTI/pubs/audio_icspat00.pdfhttp://www.monkeysaudio.com/http://computing.unn.ac.uk/staff/cgpv1/cm613/lecture_material.htmhttp://computing.unn.ac.uk/staff/cgpv1/cm613/files/Audio%20Compression.pdfhttp://computing.unn.ac.uk/staff/cgpv1/cm613/files/mpegaud.pdfhttp://www.bdti.com/MyBDTI/pubs/audio_icspat00.pdfhttp://en.wikipedia.org/wiki/Audio_compression_(data)http://www.bdti.com/MyBDTI/pubs/audio_icspat00.pdfhttp://www.monkeysaudio.com/http://computing.unn.ac.uk/staff/cgpv1/cm613/lecture_material.htmhttp://computing.unn.ac.uk/staff/cgpv1/cm613/files/Audio%20Compression.pdfhttp://computing.unn.ac.uk/staff/cgpv1/cm613/files/mpegaud.pdfhttp://www.bdti.com/MyBDTI/pubs/audio_icspat00.pdf

Venkata Lakshmi 08011012170 Sep Audio Compression

Documents

Transcript of Venkata Lakshmi 08011012170 Sep Audio Compression