CS1315: Introduction to Media Computation Sound Encoding and Manipulation.
CS1315: Introduction to Media Computation
Sound Encoding and Manipulation
Sound is made when something vibrates
The vibration disturbs the air around it and makes changes in air pressure
These changes in air pressure move through the air as sound waves
The sound waves cause pressure changes against our ear drum sending nerve impulses to our brain
Sound waves are pressure waves
Each air molecule oscillates back and forth, affecting its neighbors; the air molecules themselves don’t travel from the vibration source to your ear
This creates areas of high and low pressure, called compressions and rarefactions
Sound waves are longitudinal
Representation of sound by a sine wave is merely an attempt to illustrate the sinusoidal nature of the pressure-time fluctuations
Properties of sound waves
Pressure fluctuation comes in cycles
The frequency of a wave is the number of cycles per second (cps), or hertz (Hz)
(Complex sounds have more than one frequency in them.)
The amplitude is the maximum height of the wave (aka the maximum pressure fluctuation)
Each repetition of a waveform is called a cycle
Volume and pitch: Psychoacoustics, the psychology of sound
Our perception of pitch is related (logarithmically) to frequency
Higher frequencies are perceived as higher pitches
A above middle C is 440 Hz
Our perception of volume is related (logarithmically) to changes in amplitude
If the intensity (power) doubles, it’s about a 3 decibel (dB) change; if the amplitude doubles, it’s about a 6 dB change
Humans and Pitch
In general we can hear frequencies between 20 Hz and 20,000 Hz
Hz = hertz = cycles per second
20,000 Hz = 20 kHz (kilohertz)
The older we get, the less we can perceive really high frequencies (recall those commercials for ‘silent’ ring-tones for teens)
Many animals can make sounds and hear frequencies that are beyond what we can hear – hence the creation of dog whistles
“Logarithmically?”
It’s strange, but our hearing works on ratios, not differences, e.g., for pitch.
We hear the difference between 200 Hz and 400 Hz as the same as the difference between 500 Hz and 1000 Hz
Similarly, 200 Hz to 600 Hz, and 1000 Hz to 3000 Hz
Intensity (volume) is measured in watts per square meter (W/m^2)
A change from 0.1 W/m^2 to 0.01 W/m^2 sounds the same to us as a change from 0.001 W/m^2 to 0.0001 W/m^2
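A minimal sketch of the "ratios, not differences" idea, in plain Python rather than JES. The function name octaves_between is made up for illustration; it measures a pitch interval as the base-2 logarithm of the frequency ratio, so equal ratios give equal intervals.

```python
import math

def octaves_between(f_low, f_high):
    """Size of a pitch interval in octaves: log base 2 of the frequency ratio."""
    return math.log2(f_high / float(f_low))

# Equal ratios sound like equal steps, even though the differences in Hz vary.
step_a = octaves_between(200, 400)    # ratio 2, difference 200 Hz
step_b = octaves_between(500, 1000)   # ratio 2, difference 500 Hz
step_c = octaves_between(200, 600)    # ratio 3
step_d = octaves_between(1000, 3000)  # ratio 3
```

Here step_a equals step_b (one octave each), and step_c equals step_d.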
Decibel is a logarithmic measure
A decibel is a ratio between two intensities: 10 * log10(I1/I2)
As an absolute measure, it’s in comparison to the threshold of audibility
0 dB is that threshold, the quietest sound we can hear. Normal speech is 60 dB. A shout is about 80 dB
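The decibel formula above can be sketched directly in plain Python. The function name decibels is an assumption for illustration; the formula is the one on the slide.

```python
import math

def decibels(i1, i2):
    """Level difference in dB between two intensities (same units, e.g. W/m^2)."""
    return 10 * math.log10(i1 / float(i2))

tenfold_loud = decibels(0.1, 0.01)        # a factor of 10 in intensity
tenfold_quiet = decibels(0.001, 0.0001)   # the same factor at a lower level
doubling = decibels(2.0, 1.0)             # doubling the intensity
```

Both tenfold changes come out at about 10 dB, which is why they sound like the same step, and doubling the intensity comes out at roughly 3 dB.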
Digitizing Sound: How do we get that into numbers?
Waves are analog (continuous)
Remember in calculus, estimating the curve by creating rectangles?
We can do the same to estimate the sound curve
Analog-to-digital conversion (ADC) will give us the amplitude at an instant as a number: a sample
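As a rough illustration of what sampling does, the sketch below evaluates a sine wave at evenly spaced instants. It is plain Python, not JES, and only simulates an ADC; the function name sample_sine and its parameters are hypothetical.

```python
import math

def sample_sine(frequency_hz, sampling_rate, n_samples, amplitude=1.0):
    """Return n_samples values of a sine wave measured every 1/sampling_rate seconds."""
    samples = []
    for n in range(n_samples):
        t = n / float(sampling_rate)   # the instant at which this sample is taken
        samples.append(amplitude * math.sin(2 * math.pi * frequency_hz * t))
    return samples

# A 1 Hz wave sampled 4 times per second: roughly 0, 1, 0, -1
one_cycle = sample_sine(1, 4, 4)
```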
How many samples do we need?
Remember…pictures are continuous. How do we represent them digitally?
Understanding how computers represent sound
Consider how a film represents motion…
A movie is made by taking still photos in rapid sequence at a constant rate, usually twenty-four frames per second
When the photos are displayed in sequence at that same rate, it fools us into thinking we are seeing continuous motion, even though we are actually seeing twenty-four discrete images per second
http://music.arts.uci.edu/dobrian/digitalaudio.htm
http://en.wikipedia.org/wiki/Animation
A collection of still frames
How computers represent sound
Digital recording of sound works on the same principle
We take many discrete samples of the sound wave's instantaneous amplitude, store that information, then later reproduce those amplitudes at the same rate to create the illusion of a continuous wave
How often should we take samples?
Many, many samples must be taken per second, many more than are necessary for filming a visual image
In fact, we need to take more than twice as many samples per second as the highest frequency we wish to record…enter the Nyquist theorem
The number of samples taken per second is known as the sampling rate
Nyquist Theorem* (will be on exam)
We need at least twice as many samples per second as the maximum frequency in order to represent (and recreate, later) the original sound.
The number of samples recorded per second is the sampling rate
If we capture 8000 samples per second, the highest frequency we can capture is 4000 Hz
That’s how phones work
CD quality is 44,100 samples per second…what is the max frequency a CD can represent?
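The arithmetic behind the Nyquist limit can be sketched as two one-line functions in plain Python. The function names are hypothetical. Strictly, the sampling rate must be more than twice the highest frequency, so treat these as the boundary values.

```python
def max_representable_frequency(sampling_rate):
    """Nyquist limit: half the sampling rate, in Hz."""
    return sampling_rate / 2.0

def min_sampling_rate(highest_frequency):
    """Boundary sampling rate for a given highest frequency, in samples per second."""
    return 2 * highest_frequency

phone_limit = max_representable_frequency(8000)   # 4000.0 Hz
cd_limit = max_representable_frequency(44100)     # 22050.0 Hz
```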
Nyquist Theorem
Think back to the example of a film camera, which shoots 24 frames per second
If we're shooting a movie of a car, and the car wheel spins at a rate greater than 12 revolutions per second, it's exceeding half the "sampling rate" of the camera: the wheel completes more than 1/2 revolution per frame.
If, for example it actually completes 18/24 of a revolution per frame, it will appear to be going backward at a rate of 6 revolutions per second
In other words, if we don't witness what happens between samples, a 270-degree revolution of the wheel is indistinguishable from a -90-degree revolution. The samples we obtain in the two cases are precisely the same
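The wheel example can be checked with a small calculation. This is a sketch in plain Python with a hypothetical function name. It assumes the viewer always perceives the smallest rotation consistent with the frames, at most half a turn either way.

```python
def apparent_revolutions_per_second(true_rps, frames_per_second):
    """Rotation rate the film appears to show; negative means backward."""
    per_frame = (true_rps / float(frames_per_second)) % 1.0  # fraction of a turn per frame
    if per_frame > 0.5:
        per_frame -= 1.0   # more than half a turn forward looks like a smaller turn backward
    return per_frame * frames_per_second

fast_wheel = apparent_revolutions_per_second(18, 24)  # 18/24 turn per frame
slow_wheel = apparent_revolutions_per_second(6, 24)   # 6/24 turn per frame
```

With 18 revolutions per second at 24 frames per second, the result is -6.0, matching the backward rate described above, while a wheel at 6 revolutions per second is shown correctly.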
Nyquist Theorem
For audio sampling, the phenomenon is practically identical
Consider a graph of a 4,000 Hz cosine wave, being sampled at a rate of 22,050 Hz
A sample is taken every 0.045 milliseconds
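The time between samples is just the reciprocal of the sampling rate. A one-function sketch in plain Python (the name sample_interval_ms is made up for illustration):

```python
def sample_interval_ms(sampling_rate):
    """Time between consecutive samples, in milliseconds."""
    return 1000.0 / sampling_rate

interval_22050 = sample_interval_ms(22050)  # about 0.045 ms
interval_6000 = sample_interval_ms(6000)    # about 0.167 ms
```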
Nyquist Theorem
Consider the same 4,000 Hz cosine wave sampled at 6,000 Hz
A sample is taken every 0.167 milliseconds
The wave completes more than 1/2 cycle per sample, and the resulting samples are indistinguishable from those that would be obtained from a 2,000 Hz cosine wave
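The claim that the two waves give the same samples can be verified numerically. The sketch below is plain Python with a hypothetical helper, cosine_samples; it samples both waves at 6,000 Hz and compares them value by value.

```python
import math

def cosine_samples(frequency_hz, sampling_rate, n_samples):
    """Sample a unit-amplitude cosine wave at the given sampling rate."""
    return [math.cos(2 * math.pi * frequency_hz * n / float(sampling_rate))
            for n in range(n_samples)]

high = cosine_samples(4000, 6000, 12)   # above half the sampling rate
low = cosine_samples(2000, 6000, 12)    # its alias, below half the sampling rate

# True when every pair of samples agrees up to floating-point rounding
indistinguishable = all(abs(a - b) < 1e-9 for a, b in zip(high, low))
```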
Nyquist Theorem
The simple lesson to be learned from the Nyquist theorem: digital audio cannot accurately represent any frequency greater than half the sampling rate
If we want to record frequencies as high as 20,000 Hz, how many times per second would we need to sample the sound?
Digitizing sound with the computer: bit depth
Each sample is a numerical value representing the instantaneous amplitude of the signal at the moment it was sampled
The range of possible numbers used by a computer depends on the number of binary digits (bits) used to store each number
Each additional bit doubles the number of values that can be expressed: n bits give 2^n possible values
How many bits did we use per color channel per pixel? How many values could each color channel take on?
Digitizing sound with the computer
Each sample is stored as a number (two bytes = 16 bits)
What’s the range of available combinations?
16 bits, 2^16 = 65,536
But we want both positive and negative values, to indicate compressions and rarefactions.
What if we use one bit to indicate positive (0) or negative (1)?
That leaves us with 16 - 1 = 15 bits
15 bits, 2^15 = 32,768
One of those combinations will stand for zero
We’ll use a “positive” one, so that’s one fewer pattern for positives
+/- 32K
Each sample can be between -32,768 and 32,767
Compare this to 0..255 for light intensity
(i.e. 8 bits or 1 byte giving us 256 different values)
Why such a bizarre number?
Because 32,768 + 32,767 + 1 = 2^16
i.e. 16 bits, or 2 bytes (32,768 values below zero, 32,767 values above zero, and zero itself)
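The ranges above follow from the bit count alone. A minimal sketch in plain Python, with hypothetical function names:

```python
def unsigned_range(bits):
    """Smallest and largest value when all patterns are non-negative (e.g. color channels)."""
    return (0, 2 ** bits - 1)

def signed_range(bits):
    """Smallest and largest value when half the patterns are used for negatives."""
    return (-(2 ** (bits - 1)), 2 ** (bits - 1) - 1)

color_channel = unsigned_range(8)   # (0, 255)
sound_sample = signed_range(16)     # (-32768, 32767)
pattern_count = sound_sample[1] - sound_sample[0] + 1   # 65536 = 2 ** 16
```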
Sounds as arrays (called lists in Jython)
Samples are just stored one right after the other in the computer’s memory
That’s called an array
It’s an especially efficient (quickly accessed) memory structure
(Like pixels in a picture)
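To make the array idea concrete outside JES, here is a sketch using Python's standard array module. The typecode 'h' stores signed short integers, which are 16 bits on common platforms. The sample values are made up for illustration, and this is not how JES itself stores sounds.

```python
from array import array

# Five made-up sample values, stored one right after the other
samples = array('h', [36, 29, -120, 32767, -32768])

second_value = samples[1]   # direct access by position (0-based here)
samples[1] = 12             # overwrite one sample in place

# Values outside the signed short range are rejected
try:
    samples.append(40000)
    rejected = False
except OverflowError:
    rejected = True
```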
Working with sounds
We’ll use pickAFile and makeSound.
We want .wav files (NOTE: not all .wav files will work)
We’ll use getSamples to get all the sample objects out of a sound
We can also get the value at any index with getSampleValueAt
Can get a sound’s length (getLength(sound))
Can get a sound’s sampling rate (getSamplingRate(sound))
Can save sounds with writeSoundTo(sound,"file.wav")
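For readers without JES, roughly the same information is available from Python's standard wave module. The sketch below is an analogy, not the JES API. It writes a short test tone to a temporary .wav file, then reads back its length in samples, sampling rate, and bytes per sample. The helper names write_tone and describe are hypothetical.

```python
import math
import os
import struct
import tempfile
import wave

def write_tone(path, frequency_hz=440, sampling_rate=22050, seconds=0.5):
    """Write a mono, 16-bit sine tone to a .wav file."""
    n = int(sampling_rate * seconds)
    frames = b"".join(
        struct.pack("<h", int(round(20000 * math.sin(2 * math.pi * frequency_hz * i / sampling_rate))))
        for i in range(n))
    out = wave.open(path, "wb")
    out.setnchannels(1)          # mono
    out.setsampwidth(2)          # two bytes = 16 bits per sample
    out.setframerate(sampling_rate)
    out.writeframes(frames)
    out.close()

def describe(path):
    """Return (length in samples, sampling rate, bytes per sample)."""
    snd = wave.open(path, "rb")
    info = (snd.getnframes(), snd.getframerate(), snd.getsampwidth())
    snd.close()
    return info

path = os.path.join(tempfile.mkdtemp(), "tone.wav")
write_tone(path)
length, rate, width = describe(path)
```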
Demonstrating Working with Sound in JES
>>> filename = pickAFile()
>>> print filename
/Users/monica/Desktop/MediaSources/preamble.wav
>>> sound = makeSound(filename)
>>> print sound
Sound of length 421109
>>> samples = getSamples(sound)
>>> print samples
Samples, length 421109
>>> print getSampleValueAt(sound, 1)
36
>>> print getSampleValueAt(sound, 2)
29
Demonstrating working with samples
>>> print getLength(sound)
220568
>>> print getSamplingRate(sound)
22050.0
>>> print getSampleValueAt(sound, 220568)
68
>>> print getSampleValueAt(sound, 220570) # note this is too high
I wasn't able to do what you wanted.
The error java.lang.ArrayIndexOutOfBoundsException has occured
Please check line 0 of
>>> print getSampleValueAt(sound, 1)
36
>>> setSampleValueAt(sound,1, 12)
>>> print getSampleValueAt(sound, 1)
12
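Length and sampling rate together give the duration of a sound. A small sketch in plain Python (the function name duration_seconds is made up), using the two numbers printed in the session above:

```python
def duration_seconds(length_in_samples, sampling_rate):
    """How long a sound lasts: number of samples divided by samples per second."""
    return length_in_samples / float(sampling_rate)

# 220568 samples at 22050.0 samples per second is about ten seconds
preamble_duration = duration_seconds(220568, 22050.0)
```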
Working with Samples
We can get sample objects out of a sound with getSamples(sound) or getSampleObjectAt(sound, index)
A sample object is still part of the sound object, so if you change the sample object, the sound changes.
Sample objects understand getSampleValue(sample) and setSampleValue(sample, value)
Example: Manipulating Samples
>>> soundfile=pickAFile()
>>> sound=makeSound(soundfile)
>>> sample=getSampleObjectAt(sound, 1)
>>> print sample
Sample at 1 value at 59
>>> print sound
Sound of length 387573
>>> print getSound(sample)
Sound of length 387573
>>> print getSampleValue(sample)
59
>>> setSampleValue(sample, 29)
>>> print getSampleValue(sample)
29
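Why does changing a sample object change the sound? One way to picture it is that the sample holds a reference to its sound plus an index, not a copy of the value. The toy classes below are purely illustrative stand-ins written in plain Python. They are not how JES implements sounds, and they index from 0.

```python
class ToySound(object):
    """Stand-in for a sound: just a list of sample values."""
    def __init__(self, values):
        self.values = list(values)

class ToySample(object):
    """Stand-in for a sample object: remembers its sound and its position."""
    def __init__(self, sound, index):
        self.sound = sound
        self.index = index

    def get_value(self):
        return self.sound.values[self.index]

    def set_value(self, value):
        # Writes through to the sound itself, so the sound changes too
        self.sound.values[self.index] = value

toy_sound = ToySound([59, 40, 17])
toy_sample = ToySample(toy_sound, 0)
before = toy_sample.get_value()
toy_sample.set_value(29)
after = toy_sound.values[0]
```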
“But there are thousands of these samples!”
How do we do something to these samples to manipulate them, when there are thousands of them per second?
We use a loop and get the computer to iterate in order to do something to each sample.
An example loop that just gets and reassigns the same value:
for sample in getSamples(sound):
    value = getSampleValue(sample)
    setSampleValue(sample, value)
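The same get-then-set pattern can be tried without JES on an ordinary Python list of sample values. This is a sketch only. The helper name map_samples and the sample values are made up, and the halving function is just one example of something to do to each sample.

```python
def map_samples(values, change):
    """Visit every sample in turn, replacing it with change(old value)."""
    for index in range(len(values)):
        value = values[index]            # like getSampleValue
        values[index] = change(value)    # like setSampleValue
    return values

# Reassigning the same value leaves the sound untouched
unchanged = map_samples([36, 29, -12], lambda v: v)

# The same loop with a different function changes every sample
halved = map_samples([36, 29, -12], lambda v: v // 2)
```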
N.B.: function name changes
setSampleValue(sample, 200) used to be called setSample(sample, 200)
getSampleValue(sample) used to be called getSample(sample)
So if you are using an old version of JES (pre-3.1) you may have to use setSample and getSample.
Properties of sound
Frequency
Amplitude
Wavelength
Properties of digital sound
Sampling rate/frequency – not to be confused with the frequency of the sound we are trying to capture!
Bit depth