Venkata Lakshmi 08011012170 Sep Audio Compression
8/3/2019 Venkata Lakshmi 08011012170 Sep Audio Compression
NAAC ACCREDITED
DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING
XCS703 GRAPHICS AND MULTIMEDIA
SEP REPORT
AUDIO COMPRESSION
Submitted by
VENKATA LAKSHMI.K
08011012170
IV CSE C
DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING
Periyar Nagar, Vallam, Thanjavur - 613 403, Tamil Nadu, India
Phone: +91 - 4362 - 264600 Fax: +91 - 4362 - 264660
Email: [email protected] Web: www.pmu.edu
AUDIO COMPRESSION
INTRODUCTION
What is Compression?
o It is a process of deriving more compact (i.e., smaller) representations of data
Goal of Compression
o Significant reduction in the data size to reduce the storage/bandwidth
requirements
Constraints on Compression
o Perfect or near-perfect reconstruction (lossless/lossy)
Strategies for Compression
o Reducing redundancies
o Exploiting the characteristics of human perception (vision for images and video, hearing for audio)
MOTIVATION FOR COMPRESSION
Massive Amounts of Data Involved in Storage/Transmission of Text, Sound, Images,
and Videos in Many Applications
Applications
o Medical imaging
o Teleradiology
o Space/Satellite imaging
o Multimedia
o Digital Video: entertainment, home use
o Digital photography
Concrete Figures
o A typical hospital generates close to 1 terabit per year
o NASA's EOS will generate 1 terabyte per day
o One 2-hour video = 1.3 terabits
o Uncompressed video bit rate = 180 Mb/s
o With MPEG1 (1.5 Mb/s), a compression ratio of 120 is needed
o With MPEG2 (4-10 Mb/s), a compression ratio of 18-45 is needed
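The compression ratios in the list above follow from dividing the uncompressed bit rate by the target bit rate. A minimal Python sketch of that arithmetic is given here; the function and variable names are illustrative and are not part of the original report.

```python
# Figures taken from the list above; names are illustrative only.
VIDEO_BITS = 1.3e12           # one 2-hour video, in bits
DURATION_S = 2 * 60 * 60      # 2 hours, in seconds

def required_ratio(source_bps, target_bps):
    """Compression ratio needed to fit a source bit rate into a target bit rate."""
    return source_bps / target_bps

# 1.3 terabits spread over 2 hours gives roughly 180 Mb/s.
raw_rate = VIDEO_BITS / DURATION_S
print(f"uncompressed rate: {raw_rate / 1e6:.1f} Mb/s")

# The report rounds the uncompressed rate to 180 Mb/s before dividing.
for name, target in [("MPEG1 1.5 Mb/s", 1.5e6),
                     ("MPEG2 4 Mb/s", 4e6),
                     ("MPEG2 10 Mb/s", 10e6)]:
    print(name, "needs ratio", round(required_ratio(180e6, target)))
```

The three ratios come out as 120, 45 and 18, which matches the figures quoted for MPEG1 and MPEG2.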
BASIC DEFINITIONS IN COMPRESSION/CODING
Coding: Compression
Codeword: A binary string representing either the whole coded data or one coded data
symbol
Coded Bitstream: the binary string representing the whole coded data.
Lossless Compression: 100% accurate reconstruction of the original data
Lossy Compression: The reconstruction involves errors which may or may not be
tolerable
Bit Rate: Average number of bits per original data element after compression
BASIC SOUND DEFINITIONS
A sound/audio (digital) signal is a sequence of values, called samples, where every
sample is the intensity of the recorded sound at the corresponding moment in time.
The sampling rate of a sound is the number of samples per second.
The CD quality sampling rate is 44,100 samples per second (44.1 kHz) at
16 bits per sample; 24 bits per sample is now common in other formats.
In digital sound used for miniDV, digital TV, and DVD, the sampling rate is 48 kHz.
In DVD-Audio and in Blu-ray audio tracks, the sampling rate is 96 kHz or 192 kHz.
In the limit of an infinite sampling rate, the signal is continuous in time and is called an "analog signal".
When a signal is captured as an analog signal, it can be converted to a digital signal by
an analog-to-digital converter (also called an A/D converter or digitizer).
If a digital signal is fed into a digital-to-analog converter (also called a D/A converter),
the output is an analog signal.
A modulator acts as a D/A converter and a demodulator acts as an A/D converter. So, a modem
performs both A/D and D/A conversion.
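The definitions above can be illustrated with a small Python sketch that samples a sine tone and computes the uncompressed bit rate. The 441 Hz tone, the function names and the choice of two channels (stereo) are assumptions made for the example and do not come from the report.

```python
import math

def sample_sine(freq_hz, rate_hz, n_samples, bits=16):
    """Sample a sine tone and quantize each sample to a signed integer."""
    peak = 2 ** (bits - 1) - 1              # 32767 for 16-bit samples
    return [round(peak * math.sin(2 * math.pi * freq_hz * n / rate_hz))
            for n in range(n_samples)]

def pcm_bit_rate(rate_hz, bits, channels):
    """Uncompressed bit rate in bits per second."""
    return rate_hz * bits * channels

# One full cycle of a 441 Hz tone at 44.1 kHz is exactly 100 samples.
samples = sample_sine(441.0, 44100, 100)
print(samples[:5])

# Assumption: 2 channels (stereo), which the report does not state.
print(pcm_bit_rate(44100, 16, 2), "bits per second")
```

Under the stereo assumption, CD-quality audio needs 1,411,200 bits per second before any compression, which is the figure an audio codec has to reduce.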
STRATEGIES FOR COMPRESSION: REDUNDANCY REDUCTION
Symbol-Level Representation Redundancy
Different symbols occur with different frequencies
Variable-length codes vs. fixed-length codes
Frequent symbols are better coded with short codes
Infrequent symbols are coded with long codes
Example Techniques: Huffman Coding
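A minimal Python sketch of Huffman coding is given here to show how frequent symbols end up with short codewords. It is an illustration under simple assumptions (the input is a short string, and the function name huffman_code is made up for this example), not a production encoder.

```python
import heapq
from collections import Counter

def huffman_code(text):
    """Return a dict mapping each symbol to its Huffman codeword (a bit string)."""
    freq = Counter(text)
    if len(freq) == 1:                        # degenerate case: one distinct symbol
        return {next(iter(freq)): "0"}
    # Heap entries are (weight, tie_breaker, {symbol: partial_codeword}).
    # The unique tie_breaker keeps Python from ever comparing the dicts.
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)     # two least frequent subtrees
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]

message = "aaaaaaaabbbbccd"                   # a: 8, b: 4, c: 2, d: 1
codes = huffman_code(message)
encoded = "".join(codes[s] for s in message)
print(codes)
print(len(encoded), "bits with Huffman,", 2 * len(message), "bits with a 2-bit fixed code")
```

For this message the most frequent symbol gets a 1-bit codeword and the whole message takes 25 bits, against 30 bits for a fixed-length 2-bit code.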
Block-Level Representation Redundancy
Different blocks of data occur with varying frequencies
It is then better to code blocks than individual symbols
The block size can be fixed or variable
The block-code size can be fixed or variable
Frequent blocks are better coded with short codes
Example techniques: Block-oriented Huffman, Run-Length Encoding (RLE),
Arithmetic Coding, Lempel-Ziv (LZ)
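Of the block-level techniques listed above, Run-Length Encoding is the simplest to show. The following Python sketch is an illustration only; the function names rle_encode and rle_decode are chosen for this example.

```python
from itertools import groupby

def rle_encode(data):
    """Replace each run of equal values by a (value, run_length) pair."""
    return [(value, len(list(run))) for value, run in groupby(data)]

def rle_decode(pairs):
    """Expand (value, run_length) pairs back into the original sequence."""
    return [value for value, count in pairs for _ in range(count)]

data = [0, 0, 0, 0, 5, 5, 0, 0, 0]
pairs = rle_encode(data)
print(pairs)                      # [(0, 4), (5, 2), (0, 3)]
print(rle_decode(pairs) == data)  # True
```

Here a variable-size block (a run) is replaced by a fixed-size code (one pair), and the decoding is lossless.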
Inter-Pixel Spatial Redundancy
Neighboring pixels tend to have similar values
Neighboring pixels tend to exhibit high correlations
Techniques: Decorrelation and/or processing in the frequency domain
Spatial decorrelation converts correlations into symbol- or block-redundancy
Frequency domain processing addresses visual redundancy (see Visual Redundancy below)
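The effect of spatial decorrelation can be sketched in Python by replacing each value with its difference from the left neighbour and comparing the entropy before and after. The test row (a smooth ramp) and the function names are assumptions chosen for the example.

```python
import math
from collections import Counter

def entropy_bits(values):
    """Shannon entropy of the value distribution, in bits per value."""
    counts = Counter(values)
    total = len(values)
    return -sum(c / total * math.log2(c / total) for c in counts.values())

def decorrelate(row):
    """Keep the first value, then store each value minus its left neighbour."""
    return [row[0]] + [b - a for a, b in zip(row, row[1:])]

def recorrelate(diffs):
    """Undo decorrelate() with a running sum."""
    out = [diffs[0]]
    for d in diffs[1:]:
        out.append(out[-1] + d)
    return out

row = list(range(100, 164))       # 64 neighbouring values that rise smoothly
diffs = decorrelate(row)
print(round(entropy_bits(row), 3), "bits/value before")
print(round(entropy_bits(diffs), 3), "bits/value after")
```

The row has 64 distinct values, so a symbol coder sees no redundancy in it, while the difference sequence is almost entirely the value 1 and is therefore easy to code with Huffman or RLE. The step is lossless because recorrelate() restores the row exactly.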
Inter-Pixel Temporal Redundancy (in Video)
Often, the majority of corresponding pixels in successive video-frames are identical
over long spans of frames
Due to motion, blocks of pixels change in position but not in values between
successive frames
Thus, block-oriented motion-compensated redundancy reduction techniques are used
for video compression.
Visual Redundancy
The human visual system (HVS) has certain limitations that make many image
contents invisible. Those contents, termed visually redundant, are the target of
removal in lossy compression.
In fact, the HVS can see only within a small range of spatial frequencies: 1-60 cycles/arc-degree
Approach for reducing visual redundancy in lossy compression
1. Transform: Convert the data to the frequency domain
2. Quantize: Under-represent the high frequencies
3. Losslessly compress the quantized data
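The three steps above can be sketched in Python using only the standard library. This is a rough illustration, not any standard codec: the block length of 8, the quantizer step sizes, the sine test signal and the use of zlib for the lossless stage are all assumptions made for the example.

```python
import math
import zlib
from array import array

N = 8                                       # block length (illustrative)
STEPS = [4 * (k + 1) for k in range(N)]     # hypothetical: coarser steps at high frequencies

def _scale(k):
    return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

def dct(block):
    """Step 1: orthonormal DCT-II, moving the block to the frequency domain."""
    return [_scale(k) * sum(x * math.cos(math.pi * (n + 0.5) * k / N)
                            for n, x in enumerate(block))
            for k in range(N)]

def idct(coeffs):
    """Inverse of dct()."""
    return [sum(_scale(k) * c * math.cos(math.pi * (n + 0.5) * k / N)
                for k, c in enumerate(coeffs))
            for n in range(N)]

def quantize(coeffs):
    """Step 2: divide by a frequency-dependent step and round (the lossy part)."""
    return [round(c / s) for c, s in zip(coeffs, STEPS)]

def dequantize(q):
    return [v * s for v, s in zip(q, STEPS)]

# A smooth test signal: 512 samples of a slow sine wave.
signal = [round(128 + 100 * math.sin(2 * math.pi * n / 64)) for n in range(512)]
blocks = [signal[i:i + N] for i in range(0, len(signal), N)]

quantized = [quantize(dct(b)) for b in blocks]
flat = [v for q in quantized for v in q]
raw_bytes = array("h", flat).tobytes()      # coefficients as signed 16-bit integers
packed = zlib.compress(raw_bytes)           # Step 3: lossless stage

restored = [x for q in quantized for x in idct(dequantize(q))]
max_error = max(abs(a - b) for a, b in zip(signal, restored))
print(len(raw_bytes), "bytes of coefficients,", len(packed), "bytes after zlib")
print("largest reconstruction error:", round(max_error, 2))
```

Because the high-frequency coefficients of a smooth signal mostly quantize to zero, the lossless stage has long runs of zeros to work with. The reconstruction is close to the original but not identical, which is what makes the scheme lossy.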
Audio compression
Audio compression is a form of data compression designed to reduce the transmission bandwidth
requirement of digital audio streams and the storage size of audio files. Audio compression algorithms
are implemented in computer software as audio codecs. Generic data compression algorithms perform
poorly with audio data, seldom reducing data size much below 87% of the original, and
are not designed for use in real-time applications. Consequently, specifically optimized lossless
and lossy audio algorithms have been created. Lossy algorithms provide greater compression rates and are
used in mainstream consumer audio devices. In both lossy and lossless compression, information
redundancy is reduced, using methods such as coding, pattern recognition and linear prediction to
reduce the amount of information used to represent the uncompressed data.
Audio compression can mean two things:
1. Audio data compression - in which the amount of data in a recorded waveform is reduced
for transmission. This is used in CD and MP3 encoding, internet radio, and the like.
2. Audio level compression - in which the dynamic range (difference between loud and quiet)
of an audio waveform is reduced. This is used in guitar effects racks, recording studios, etc.
Audio signals are analog signals made up of different frequencies.
The audio signals are converted to digital form and then processed, stored or
transmitted.
Compression allows the audio to be stored in significantly less space and, when the signal
needs to be transmitted, reduces the bandwidth required.
ADAPTIVE DIFFERENTIAL PULSE CODE MODULATION
ADPCM provides a form of compression by encoding and storing in the data stream
only the difference between the values of successive samples.
This provides a wide dynamic range while reducing the size of the data stream.
For instance, normal speech has long sections of silence that can be compressed because
there are no changes across a number of samples.
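The idea of storing only differences, with a step size that adapts to the signal, can be sketched in Python as follows. This is a simplified illustration and not the IMA or ITU ADPCM standard: the 4-bit code range, the starting step of 16 and the adaptation rule are assumptions made for the example.

```python
def _adapt(step, code):
    """Grow the step after large codes, shrink it after small ones (assumed rule)."""
    if abs(code) >= 6:
        step *= 2
    elif abs(code) <= 1:
        step //= 2
    return max(1, min(step, 4096))

def adpcm_encode(samples):
    """Encode each sample as a 4-bit code: the scaled difference from the prediction."""
    predicted, step, codes = 0, 16, []
    for s in samples:
        code = max(-8, min(7, round((s - predicted) / step)))
        codes.append(code)
        predicted += code * step       # track exactly what the decoder will rebuild
        step = _adapt(step, code)
    return codes

def adpcm_decode(codes):
    """Rebuild the samples by accumulating the scaled differences."""
    predicted, step, out = 0, 16, []
    for code in codes:
        predicted += code * step
        out.append(predicted)
        step = _adapt(step, code)
    return out

silence = [0] * 50
print(adpcm_encode(silence))           # all zeros: nothing changes between samples

steady = [100] * 10
print(adpcm_decode(adpcm_encode(steady)))
```

A silent section produces only zero codes, which a later lossless stage can store very compactly. For the steady input the decoder first overshoots the step size and then settles on the value 100 as the step shrinks, which shows the adaptation at work.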
Audio compression is a form of data compression designed to reduce the size of audio files.
Audio compression algorithms are implemented in computer software as audio codecs.
Generic data compression algorithms work with audio data but the compression ratios are low
(around 50-60% of original size) and they do not work in real time and are therefore not
practical. Specific audio "lossless" and "lossy" algorithms have been created. Lossy
algorithms provide far greater compression ratios and are used in mainstream consumer audio
devices.
As with image compression, both lossy and lossless compression algorithms are used in audio
compression, lossy being the most practical for everyday use. In both lossy and lossless
compression, information redundancy is reduced, using methods such as coding, pattern
recognition and linear prediction to reduce the amount of information used to describe the
data.
For example, suppose you wanted to record twenty house numbers along one side of a street,
each of which goes up by 2. If the first address was 14461, or five digits, the uncompressed
stream would require 20 times 5 bytes, or 100 bytes, to store. You could recode that to take
advantage of the repetition and simply say "begin at 14461, increase by 2, repeat 19 times". Now the data are losslessly captured in a smaller space. In practice the pattern recognition for
lossy and lossless compression is far more complex than that.
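The house-number example can be written out as a short Python sketch. The function names encode_run and decode_run are made up for this illustration.

```python
def encode_run(values):
    """Describe an evenly spaced sequence as (start, step, repeats)."""
    step = values[1] - values[0]
    if any(b - a != step for a, b in zip(values, values[1:])):
        raise ValueError("sequence is not evenly spaced")
    return values[0], step, len(values) - 1

def decode_run(start, step, repeats):
    """Rebuild the full sequence from its compact description."""
    return [start + i * step for i in range(repeats + 1)]

houses = [14461 + 2 * i for i in range(20)]   # twenty house numbers, each 2 apart
compact = encode_run(houses)
print(compact)                                 # (14461, 2, 19)
print(decode_run(*compact) == houses)          # True
```

Three numbers replace twenty, and decoding gives back every house number exactly, so nothing is lost.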
The trade-off of slightly reduced audio quality is clearly outweighed for most practical audio
applications where users cannot perceive any difference and space requirements are
substantially reduced. For example, on one CD, one can fit an hour of high fidelity music,
less than 2 hours of music compressed losslessly, or 7 hours of music compressed in MP3
format.
Lossless audio codecs have no quality issues, so their usability can be estimated by:
Speed of compression and decompression
Degree of compression
Software and hardware support
Robustness and error correction
Lossy audio compression is used in an extremely wide range of applications. In addition to
the direct applications (MP3 players or computers), digitally compressed audio streams are
used in most video DVDs; digital television; streaming media on the internet; satellite and
cable radio; and increasingly in terrestrial radio broadcasts. Lossy compression typically
achieves far greater compression than lossless compression (data of 5 percent to 20 percent of
the original stream, rather than 50 percent to 60 percent), by discarding less-critical data.
The innovation of lossy audio compression was to use psychoacoustics to recognize that not
all data in an audio stream can be perceived by the human auditory system. Most lossy
compression reduces perceptual redundancy by first identifying sounds which are considered
perceptually irrelevant, that is, sounds that are very hard to hear. Typical examples include
high frequencies, or sounds that occur at the same time as other louder sounds. Those sounds
are coded with decreased accuracy or not coded at all.
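The two examples of perceptually irrelevant sounds mentioned above can be turned into a toy Python rule. This is only a rough illustration: real codecs use detailed psychoacoustic models in which masking depends on how close two frequencies are, while the 16 kHz cutoff and the 40 dB margin used here are arbitrary assumptions, as is the function name.

```python
def prune_components(components, cutoff_hz=16000.0, margin_db=40.0):
    """Drop components that the toy rule treats as inaudible.

    components: list of (frequency_hz, level_db) pairs sounding at the same time.
    A component is dropped if it lies above cutoff_hz, or if it is more than
    margin_db quieter than the loudest component.
    """
    if not components:
        return []
    loudest = max(level for _, level in components)
    return [(freq, level) for freq, level in components
            if freq <= cutoff_hz and level >= loudest - margin_db]

frame = [(440.0, 80.0), (1000.0, 30.0), (18000.0, 70.0), (2000.0, 60.0)]
print(prune_components(frame))   # [(440.0, 80.0), (2000.0, 60.0)]
```

The very high tone and the quiet tone are not coded at all, so only two of the four components need to be stored.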