
Sampling Terminology

• f0 is the fundamental frequency (Hz) of the signal

– Speech: f0 = vocal cord vibration frequency (>= 80 Hz)

– Speech signals contain harmonics of 5 kHz or less

– The upper limit of human hearing is about 20 to 22 kHz

• fs = sampling frequency in measurements per second

• ts is the sampling period (ts = 1/fs)

• x[n] or xn = amplitude of the nth signal measurement (sample)

• X[f] or Xf = amplitude of the fth frequency component (bin)

• X(f) and x(n) = the continuous signal in the frequency and time domains, respectively

Quantization

Understanding Digital Signal Processing, Third Edition, Richard Lyons(0-13-261480-4) © Pearson Education, 2011.

Measurements

• Terminology

– We normally measure continuous signals at discrete times

– Quantization (bits) converts measurements to numeric values

– Sample rate (samples per second) = 1 / time between measurements

• Measurement error (|measured value - true value|)

– Precision: quality of a single measurement

  • Half of the least significant digit

  • Random noise is reduced by averaging many measurements

– Accuracy: quality of the result

  • Repeat the experiment and average the results

  • Ex: Sample the ocean floor 1000 times to determine the depth

  • Poor accuracy indicates poor repeatability or bad calibration

Quantization Errors



[Figure: time-domain quantization errors and their effect on the frequency domain]

Analog-to-digital Converters (A-D)

• The A-D conversion process

– Converts a stream of varying air pressure into voltages

– The A-D converter samples the voltage at regular intervals

– Converts each sample to an integer of a fixed bit size

• Insufficient quantization (not enough quantization bits)

– Different perceived sounds convert to the same value

• Limits of quantization (signal-to-noise ratio, SNR)

– If the SNR is small, the lower bits simply measure noise

• Quantization types

– Linear: output varies linearly with input

– Non-linear: logarithmic conversion from input to output

Quantize continuous signals to streams of discrete integers

Linear Encoding

• Example: values 0..255 have 8-bit resolution

– Measurement range: -64 to 63 decibels

– Resolution: $(2^7 - 0)/2^8 = 1/2$ decibel per step

The output values relate linearly to the input values
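The linear encoding example can be sketched in a few lines. This is only an illustration under stated assumptions: the function names `encode_linear` and `decode_linear` are hypothetical, and the half-decibel step and the -64 dB lower bound are taken from the example above.

```python
LOW_DB = -64.0   # bottom of the measurement range (from the example)
STEP_DB = 0.5    # 2**7 / 2**8 decibels per code
LEVELS = 256     # 8-bit codes 0..255


def encode_linear(db):
    """Map a level in decibels to the nearest 8-bit code, clamping to 0..255."""
    code = int(round((db - LOW_DB) / STEP_DB))
    return max(0, min(LEVELS - 1, code))


def decode_linear(code):
    """Map an 8-bit code back to the decibel level it represents."""
    return LOW_DB + code * STEP_DB


if __name__ == "__main__":
    for level in (-64.0, 0.0, 12.3, 100.0):
        code = encode_linear(level)
        print(level, "->", code, "->", decode_linear(code))
```

Because every step has the same size, the worst-case rounding error is half a step (0.25 dB) anywhere inside the range.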

[Figure: linear encoding; axis labels: Decibels, Amplitude]

Non-linear Encoding Algorithms

• Uses (e.g., telephony: voice over distance)

– Fits limited-bandwidth channels

– Converts 12- or 16-bit samples to 8 bits

– Logarithmic representation

• Formulae

– Mu-law algorithm (North America, Japan)

  • μ = 255, x normalized to [-1, 1]

  • $y = \operatorname{sgn}(x)\,\ln(1 + \mu|x|) / \ln(1 + \mu)$

– A-law algorithm (Europe and most of the rest of the world)

  • A = 87.6, x normalized to [-1, 1]

  • If $|x| < 1/A$: $y = \operatorname{sgn}(x)\,A|x| / (1 + \ln A)$

  • If $1/A \le |x| \le 1$: $y = \operatorname{sgn}(x)\,(1 + \ln(A|x|)) / (1 + \ln A)$

Humans hear sounds up to 120 dB with a sensitivity of about 1 dB

Therefore 8-bit samples should be sufficient
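The two companding curves can be tried out with a short sketch. The function names are hypothetical, the formulas are the continuous curves (with the sign of x carried through so that negative samples work), and this is not the segmented table lookup that telephony codecs typically use.

```python
import math

MU = 255.0   # mu-law parameter from the slide
A = 87.6     # A-law parameter from the slide


def mu_law_compress(x):
    """Compress x in [-1, 1] with the continuous mu-law curve."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)


def mu_law_expand(y):
    """Invert mu_law_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)


def a_law_compress(x):
    """Compress x in [-1, 1] with the continuous A-law curve."""
    ax = abs(x)
    if ax < 1.0 / A:
        y = A * ax / (1.0 + math.log(A))
    else:
        y = (1.0 + math.log(A * ax)) / (1.0 + math.log(A))
    return math.copysign(y, x)


if __name__ == "__main__":
    for x in (0.001, 0.01, 0.1, 1.0):
        print(x, round(mu_law_compress(x), 4), round(a_law_compress(x), 4))
```

Quiet samples are stretched across many more output levels than loud ones, which is why 8 bits after companding can stand in for 12 or more linear bits.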

Fixed Point Representation

• Advantages

– Very fast add, multiply, and subtract operations

– Shift operations give extremely fast multiplication and division by powers of two

• Disadvantages

– Division operations fall back to floating point circuitry

– Cannot represent very large or very small numbers

– Represents a fixed range of numbers

– Loss of precision during division

– Scaling between precisions is very cumbersome

• Example

– Scale values 0 to 10.5 using 64-bit integers

– The value zero maps to zero

– The value 10.5 maps to $2^{64} - 1$

– The resolution is $10.5 / (2^{64} - 1)$, approximately $10.5 / 2^{64}$

The decimal point is at a fixed position
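A minimal sketch of the scaling example, assuming an unsigned 64-bit code and using exact rational arithmetic so that rounding is visible. The names `to_fixed` and `from_fixed` are hypothetical.

```python
from fractions import Fraction

FULL_SCALE = Fraction(21, 2)   # 10.5, the top of the range in the example
MAX_CODE = 2**64 - 1           # largest unsigned 64-bit integer
RESOLUTION = FULL_SCALE / MAX_CODE


def to_fixed(value):
    """Map a value in [0, 10.5] to an integer code in [0, 2**64 - 1]."""
    return round(Fraction(value) / FULL_SCALE * MAX_CODE)


def from_fixed(code):
    """Map an integer code back to the value it represents."""
    return FULL_SCALE * code / MAX_CODE


if __name__ == "__main__":
    print(to_fixed(0), to_fixed(10.5) == MAX_CODE)
    print(float(RESOLUTION))
```

Every code is the same distance from its neighbours, so the resolution is identical near 0 and near 10.5.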

Floating Point Representation

• Represents numbers in scientific notation

– Decimal: 987.654 = 9.87654 * $10^2$

– Hexadecimal: 89a.bcd = 8.9abcd * $16^2$

– Contains an exponent and a mantissa (the significant digits)

• Advantages

– A sliding window of precision

– Represents an extremely large range of numbers

• Disadvantages

– Add, multiply, and subtract operations are slower

– Shift operations cannot be used

– Holes in the number line

A sliding decimal point according to the size of the number

Floating Point Formats

• Example: Convert 329.390625 to 32-bit floating point

1. Convert to binary: 329.390625 -> 101001001.011001

Integer portion: 329 = 256 + 64 + 8 + 1 -> 101001001

Fractional portion (.011001), by repeated doubling:

0.390625 * 2 = 0.78125 -> 0

0.78125 * 2 = 1.5625 -> 1

0.5625 * 2 = 1.125 -> 1

0.125 * 2 = 0.25 -> 0

0.25 * 2 = 0.5 -> 0

0.5 * 2 = 1 -> 1 (no fraction left, so we are done)

2. Convert to binary scientific notation: 1.01001001011001 * $2^8$

3. Add the exponent to the bias: 8 + 127 = 135 (10000111)

4. Answer: 0 10000111 01001001011001000000000

• Special Values

– Zero: Exponent = Fraction = 0

– ± Infinity: sign * (Exponent all 1s, fraction = 0)

– NaN: Exponent all 1s and fraction ≠ 0

Format   Sign (bits)   Exponent (bits)   Fraction (bits)   Bias

Single   1 [31]        8 [23-30]         23 [0-22]         127

Double   1 [63]        11 [52-62]        52 [0-51]         1023
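The worked conversion can be checked with Python's standard `struct` module, which packs a value into IEEE 754 single precision. The helper name `float32_fields` is hypothetical; the field widths follow the Single row of the table above.

```python
import struct


def float32_fields(value):
    """Return the (sign, exponent, fraction) bit strings of a 32-bit float."""
    (raw,) = struct.unpack(">I", struct.pack(">f", value))
    bits = format(raw, "032b")
    return bits[0], bits[1:9], bits[9:]


if __name__ == "__main__":
    sign, exponent, fraction = float32_fields(329.390625)
    print(sign, exponent, fraction)
    print("unbiased exponent:", int(exponent, 2) - 127)
```

The same helper shows the special values: infinity has an all-ones exponent with a zero fraction, and zero has every field cleared.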

Recommendations for Human Speech Processing

• Computers are not as good as the human ear

• Most humans cannot hear frequencies above 20 kHz

• Human hearing sensitivity

– Most speech is encoded in frequencies < 8 kHz

– The upper limit of sensitivity is approximately 20 to 22 kHz

– Hearing loss occurs at the high end of the range

– We are more sensitive to frequencies at the low end of the range

– Sensitivity follows a logarithmic curve

• Quantization

– Linear: bits per sample >= 12

– Non-linear: bits per sample = 8

• Common audio sampling rates: 44.1 kHz (audio CD), 22.05 kHz, and 11.025 kHz

Aliasing

• When does this occur?

– Frequencies (f>N) are present above the Nyquist frequency (fN)

– If f∆ = f>N - fN, then fN + f∆ is indistinguishable from fN - f∆

• What do we do about it?

– Place an anti-aliasing filter ahead of the sampler to eliminate high frequencies

– This CANNOT be done in software, because after sampling the aliased frequencies are already indistinguishable

• Example of aliasing: take a picture of the sun every 23 hours

– 24 x 23 = 552 hours between sunrises

– The sun appears to move from west to east

Different frequencies become indistinguishable


7 kHz appears as 1 kHz

4 kHz appears as -2 kHz, or 2 kHz 180° out of phase
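Both figure annotations can be reproduced by folding a frequency back into the band around zero. The sketch below assumes a sampling rate of 6 kHz; that value is not stated in the slide, but it is consistent with both annotations. The function name `apparent_frequency` is hypothetical.

```python
def apparent_frequency(f, fs):
    """Fold f (Hz) into the range [-fs/2, fs/2] observed after sampling at fs."""
    return f - fs * round(f / fs)


if __name__ == "__main__":
    fs = 6000  # assumed sampling rate in Hz
    for f in (1000, 4000, 7000):
        print(f, "Hz appears as", apparent_frequency(f, fs), "Hz")
```

A negative result means the alias has the same magnitude of frequency but is 180° out of phase for a sine wave, since sin(-θ) = -sin(θ).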

Aliasing and Filtering

Alias is out of phase in this example

Low Pass Filtered Signal


Nyquist Theorem

Nyquist Frequency (fN) = highest detectable frequency

Sampling Frequency (fs) = samples per time period

Maximum Signal Frequency of Interest (fmax)

Theorem: fs >= 2 * fmax, so that fN = fs / 2 >= fmax

[Figure panels: Inadequate Sampling; Adequate Sampling]

How many cycles per second do we need?

Sampling formulae

0th sample: $x_0 = \sin(2\pi f_x \cdot 0 \cdot t_s)$

1st sample: $x_1 = \sin(2\pi f_x \cdot 1 \cdot t_s)$

…

nth sample: $x_n = \sin(2\pi f_x n t_s)$

Note: a sine wave of $(f_x + k f_s)$ Hz aliases a sine wave of $f_x$ Hz:

$\sin(2\pi f_x n t_s)$

$= \sin(2\pi f_x n t_s + 2\pi m)$ ; where m is any positive or negative integer

$= \sin(2\pi f_x n t_s + 2\pi m (n t_s)/(n t_s))$ ; multiply the 2nd term by one

$= \sin(2\pi n t_s (f_x + m/(n t_s)))$ ; factor $2\pi n t_s$ from both terms

$= \sin(2\pi n t_s (f_x + k/t_s))$ ; let k = m/n, choosing m as a multiple of n so that k is an integer

$= \sin(2\pi n t_s (f_x + k f_s))$ ; because $f_s = 1/t_s$

A sinusoid cycle from 0° to 360° is equivalent to 2π radians

Note: fx is some frequency less than the sampling rate fs
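The identity derived above can be checked numerically: sampling a sine wave at fx and at fx + k·fs should give the same sample values. The rates below (fs = 8000 Hz, fx = 1000 Hz) and the helper names are arbitrary choices for illustration.

```python
import math


def sample_sine(f, fs, count):
    """Return count samples of sin(2*pi*f*n*ts) with ts = 1/fs."""
    ts = 1.0 / fs
    return [math.sin(2 * math.pi * f * n * ts) for n in range(count)]


def max_sample_difference(fx, fs, k, count=32):
    """Largest absolute difference between samples of fx and of fx + k*fs."""
    base = sample_sine(fx, fs, count)
    shifted = sample_sine(fx + k * fs, fs, count)
    return max(abs(a - b) for a, b in zip(base, shifted))


if __name__ == "__main__":
    for k in (-2, -1, 1, 2):
        print(k, max_sample_difference(1000.0, 8000.0, k))
```

The printed differences are at the level of floating point rounding, so the sampled sequences are the same for practical purposes.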


Using Carrier Frequencies

Goal: Choose optimal fs > 2B

Choosing Sampling Frequencies

• Goal: Bring the band of frequencies between -fs and +fs

• Solution: Pick the largest m such that fs >= 2B

• Choosing fs that maintains the spectral direction

– $m f_s = 2(f_c - B/2)$

– $f_s = (2 f_c - B)/m$

– Example: if fc = 20 kHz (20000 Hz) and B = 5 kHz, then m = 1, 2, 3, 4 gives fs = 35, 17.5, 11.67, 8.75 kHz; the largest usable m is 3, because 8.75 kHz is below 2B = 10 kHz

• Choosing fs that reverses the spectral direction

– $(m+1) f_s = 2(f_c + B/2)$

– $f_s = (2 f_c + B)/(m+1)$

– Example: if fc = 20 kHz and B = 5 kHz, then m = 1, 2, 3 gives fs = 22.5, 15, 11.25 kHz

• Restore spectral direction: $x'[n] = (-1)^n x[n]$
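The two formulas and the sign-flip step can be tabulated with a short sketch. The function names are hypothetical, and the inputs assume fc = 20 and B = 5 expressed in kHz, as in the examples.

```python
def rate_keeping_direction(fc, bandwidth, m):
    """fs = (2*fc - B) / m, which maintains the spectral direction."""
    return (2 * fc - bandwidth) / m


def rate_reversing_direction(fc, bandwidth, m):
    """fs = (2*fc + B) / (m + 1), which reverses the spectral direction."""
    return (2 * fc + bandwidth) / (m + 1)


def restore_direction(samples):
    """Multiply sample n by (-1)**n to flip the spectrum back."""
    return [(-1) ** n * value for n, value in enumerate(samples)]


if __name__ == "__main__":
    fc, bandwidth = 20.0, 5.0  # kHz, assumed units
    for m in range(1, 5):
        keep = rate_keeping_direction(fc, bandwidth, m)
        flip = rate_reversing_direction(fc, bandwidth, m)
        ok = keep >= 2 * bandwidth
        print(m, round(keep, 2), round(flip, 2), "fs >= 2B" if ok else "fs < 2B")
```

The last column applies the fs >= 2B rule from the list above to the direction-maintaining rate, which is what limits the choice of m.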


SNR Degradation
