UNIVERSITY of MANCHESTER School of Computer Science Comp30291 Digital Media Processing 2009-10


8 Dec '09 Comp30291 : Section 7 1

UNIVERSITY of MANCHESTER School of Computer Science

Comp30291 Digital Media Processing 2009-10

Section 7

Discrete Fourier Transform


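Introduction

• Given an analogue signal xa(t) with Fourier Transform:

$$X_a(j\omega) = \int_{-\infty}^{\infty} x_a(t)\, e^{-j\omega t}\, dt$$

• Assume xa(t) is band-limited between 0 & Fs/2.

• If xa(t) is sampled at Fs = 1/T to give {x[n]} with DTFT:

$$X(e^{j\Omega}) = \sum_{n=-\infty}^{\infty} x[n]\, e^{-j\Omega n}$$

it may be shown that:

$$X(e^{j\Omega}) = \frac{1}{T}\, X_a(j\omega) \quad \text{for } -\pi < \Omega < \pi \text{ with } \Omega = \omega T$$

• DTFT is a convenient way of calculating Xa(jω) by computer.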


Two difficulties

(i) Infinite range of summation
(ii) X(e^{jΩ}) is a continuous function of Ω

Solutions:

(i) Time-domain windowing: Restrict {x[n]} to {x[0], x[1], …, x[N-1]} = {x[n]}0,N-1

(ii) Frequency-domain sampling: Store samples of X(e^{jΩ})


Sampling X(e^{jΩ})

• Storing M equally spaced samples of X(e^{jΩ}) over the range Ω = -π to π would be possible.

• For real signals, only the range 0 to π is of interest.

• In practice, the range is made 0 to 2π rather than -π to π.

• So we generally disregard the range from π to 2π.


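{X[k]}0,M-1 = {X[0], X[1], …, X[M-1]}

With M equally spaced samples over 0 ≤ Ω < 2π:

$$X[k] = X(e^{j\Omega_k}) \quad \text{where } \Omega_k = 2\pi k/M$$

[Figure: |X(e^{jΩ})| against Ω from 0 to 2π, with the samples |X[0]|, |X[1]|, |X[3]|, …, |X[M-1]| marked.]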


• For spectral analysis, the larger M, the better for drawing accurate graphs etc.

• But, if we need minimum M for storage of unambiguous spectrum, take M=N.


Discrete Fourier Transform (DFT)

• Transforms: {x[n]}0,N-1 (complex) to {X[k]}0,N-1 (complex)

$$X[k] = \sum_{n=0}^{N-1} x[n]\, e^{-j\Omega_k n} \quad \text{where } \Omega_k = 2\pi k/N$$

• DFT transforms a finite sequence to another finite sequence.

• DTFT transforms an infinite sequence to a continuous function of Ω.


Inverse DFT

Inverse DFT: {X[k]}0,N-1 to {x[n]}0,N-1

$$x[n] = \frac{1}{N} \sum_{k=0}^{N-1} X[k]\, e^{j\Omega_k n}$$

Note similarity with DFT:

$$X[k] = \sum_{n=0}^{N-1} x[n]\, e^{-j\Omega_k n}$$


Programming the DFT & its inverse:

$$X[k] = \sum_{n=0}^{N-1} x[n]\, e^{-j\Omega_k n} \qquad x[n] = \frac{1}{N} \sum_{k=0}^{N-1} X[k]\, e^{j\Omega_k n} \qquad \Omega_k = 2\pi k/N$$

• Similarity exploited by programs able to perform DFT or its inverse using same code.

• Programs to implement these equations in a ‘direct’ manner given in MATLAB (using complex arith) & C (using real arith only).

• These ‘direct’ programs are very slow & FFT is much faster.


% Given N complex samples in array x(1:N); Invers is 0 for DFT, 1 for IDFT
% The DFT uses exp(-j*2*pi*k*L/N), the inverse uses exp(+j*2*pi*k*L/N)
if (Invers == 1) E = 2*pi/N; else E = -2*pi/N; end;
for k = 0 : N-1
   X(1+k) = 0 + j*0;  Wk = k*E;
   for L = 0 : N-1
      C = cos(L*Wk) + j*sin(L*Wk);
      X(1+k) = X(1+k) + x(1+L) * C;
   end;
   if (Invers == 1) X(1+k) = X(1+k)/N; end;
end;
% Now have N complex samples in array X(1:N)

Direct forward/inverse DFT using complex arith in MATLAB


#include <math.h>
// Assumes globals declared elsewhere: int N, Invers; float xr[], xi[], x[], y[]; constant PI.
void directdft(void)   // DFT or inverse DFT by direct method.
{ // Order = N. Real & imag parts of input in arrays xr & xi.
  // Output: real part in array x, imag part in array y.
  // Invers is 0 for DFT, 1 for IDFT.
  int k, L;  float c, e, s, wk;
  if (Invers == 1) e = -2.0*PI/(float)N; else e = 2.0*PI/(float)N;
  for (k = 0; k < N; k++)
  { x[k] = 0.0;  y[k] = 0.0;  wk = (float)k*e;
    for (L = 0; L < N; L++)
    { c = cos(L*wk);  s = sin(L*wk);            // (xr + j.xi)(c - j.s)
      x[k] = x[k] + xr[L]*c + xi[L]*s;
      y[k] = y[k] + xi[L]*c - xr[L]*s;
    }
    if (Invers == 1) { x[k] = x[k]/(float)N;  y[k] = y[k]/(float)N; }
  }
}

Direct fwd/inverse DFT using real arith only in ‘C’


Fast Fourier Transform (FFT)

• An FFT algorithm in ‘C’ is given on the next slide.

• Gives ‘exactly’ same results as DFT only much faster.

• Its detail & how speed is achieved is outside our syllabus.

• Works best when N is a power of 2, e.g. 32, 64, 512, 1024, etc.

• We are interested in how to use DFT & interpret its results.

• MATLAB has an efficient built-in ‘fft’ function.

• Don’t need to know how it’s programmed, only how to use it!

•‘C’ version of FFT for reference on next slide.

• Direct DFT programs of academic interest only.


void fft(void)   // In-place FFT (or inverse FFT) of N complex samples: real parts in x, imag parts in y.
{ // N must be a power of 2. Invers is 0 for FFT, 1 for inverse FFT.
  int N1, N2, NM, N2M, i, j, k, l;  float e, c, s, a, xt, yt;
  N2 = N;  NM = N-1;
  if (Invers == 1) e = -PI/(float)N; else e = PI/(float)N;
  for (k = 2; k < N+1; k = k+k)                 // one pass per stage
  { N1 = N2;  N2 = N2/2;  N2M = N2-1;  e = e*2.0;  a = 0.0;
    for (j = 0; j < N2M+1; j++)
    { c = cos(a);  s = sin(a);  a = a+e;
      for (i = j; i < NM+1; i = i+N1)           // butterflies
      { l = i+N2;
        xt = x[i]-x[l];  x[i] = x[i]+x[l];
        yt = y[i]-y[l];  y[i] = y[i]+y[l];
        x[l] = xt*c + yt*s;  y[l] = yt*c - xt*s;
      }
    } // end of j loop
  } // end of k loop
  if (Invers == 1) for (k = 0; k < (NM+1); k++) { x[k] = x[k]/(float)N;  y[k] = y[k]/(float)N; }
  j = 0;                                        // re-order the output samples
  for (i = 0; i < N-1; i++)
  { if (i < j) { xt = x[j]; x[j] = x[i]; x[i] = xt;  yt = y[j]; y[j] = y[i]; y[i] = yt; }
    k = N/2;  while ((k-1) < j) { j = j-k;  k = k/2; }  j = j+k;
  } // end of i loop
} // end of procedure FFT


DFT (FFT) for spectral analysis

• Given {x[n]}0,N-1 we get {X[k]}0,N-1

• N complex spectral samples from 0 to fs.

• When {x[n]} is real, plot magnitudes of X[k] for k=0 to N/2.

• When N=512, k=256 corresponds to fs/2.

• MATLAB arrays cannot start from zero, so x[n] is stored as x(1+n)

• Similarly {X[k]}0,N-1 are stored as X(1) ... X(N).

• We analyse some sinusoids on next slide.


FFT analysis of sinusoids

for n=0:63
   x(1+n) = 100*cos(0.25*pi*n) + 100*sin(0.5*pi*n);
end;
X = fft(x);
plot(abs(X(1:32))); grid on;

[Figure: plot of abs(X(1:32)); horizontal axis 0 to 35, vertical axis 0 to 3500.]

There are 64 pts but we only plot 32. The horiz axis goes from 1 to 32 for frequencies 0 to fs/2 (π).

The vertical axis can tell us the amplitudes of the sinusoids: divide by N/2, i.e. 32 here.

It can’t tell whether they are sines or cosines as we plot only magnitude.


Scaled FFT analysis of sinusoids

N=64;
for n=0:N-1
   x(1+n) = 100*cos(0.25*pi*n) + 100*sin(0.5*pi*n);
end;
X = fft(x)/(N/2);
plot(abs(X(1:N/2)),'*-'); grid on;

[Figure: plot of abs(X(1:N/2)); horizontal axis 0 to 35, vertical axis 0 to 120.]

Now it seems we can read the amplitudes directly from the graph.


Increased order FFT analysis of sinusoids

clear all; N=128;
for n=0:N-1
   x(1+n) = 50*cos(0.25*pi*n) + 100*sin(0.5*pi*n);
end;
X = fft(x)/(N/2);
plot(abs(X(1:N/2)),'*-'); grid on;

[Figure: plot of abs(X(1:N/2)); horizontal axis 0 to 70, vertical axis 0 to 100.]

Now have more pts & we can read amplitudes & frequencies more accurately.


Trouble!!

clear all; N=64;
for n=0:N-1
   x(1+n) = 100*cos((pi/4)*n) + 100*sin(1.522*n);
end;
X = fft(x)/(N/2);
plot(abs(X(1:N/2)),'*-'); grid on;

[Figure: plot of abs(X(1:N/2)); horizontal axis 0 to 35, vertical axis 0 to 120.]

Changed frequency of 2nd sine-wave from π/2 to 15.5π/32 = 1.522.

Now we don’t get the correct amplitude, as 1.522 does not correspond to one of our frequency sampling pts.

About 35% error in amplitude reading.

Any suggestions?


Increase N to 128

clear all; N=128;
for n=0:N-1
   x(1+n) = 100*cos((pi/4)*n) + 100*sin(1.522*n);
end;
X = fft(x)/(N/2);
plot(abs(X(1:N/2)),'*-'); grid on;

[Figure: plot of abs(X(1:N/2)); horizontal axis 0 to 70, vertical axis 0 to 120.]

This works as 15.5π/32 = 31π/64, which is frequency sampling pt k = 31 when N = 128.

But it may not be an option.

We may only have 64 samples available as signal may be changing rapidly.

Any other suggestions?


Help from ‘zero-padding’

clear all; N=64;
for n=0:N-1
   x(1+n) = 100*cos((pi/4)*n) + 100*sin(1.522*n);
end;
for n = N:2*N-1, x(1+n)=0; end;   % Length of x is now 128
X = fft(x)/(N/2);
figure(1); plot(abs(X(1:N)),'*-'); grid on;

[Figure: plot of abs(X(1:N)); horizontal axis 0 to 70, vertical axis 0 to 120.]

Double length of ‘x’ by ‘zero-padding’. This doubles the no. of freq sampling pts & 1.522 now hits one of these.

We are seeing the same graph with extra pts inserted by interpolation.

The ‘ripples’ were there before but we did not sample them.


Zero-padding

• Take a simple example with N=4 and zero-pad to N=8:

• {7 1 2 4} becomes {7 1 2 4 0 0 0 0}

or, if you wish, {0 0 0 0 7 1 2 4}

• Simple as that.

• Now have twice as many samples

• So we get twice as many points in freq-domain .

• Can interpret FFT spectral graph more clearly.

• (Note that scaling in ‘X=fft(x)/(N/2)’ has not changed)


Extend by symmetry

• Another way of increasing no. of freq-domain points without increasing no. of time-domain samples is to extend by even symmetry:

{6 1 2 4} becomes {6 1 2 4 4 2 1 6 }

or {4 2 1 6 6 1 2 4 }

• Both are OK but we normally take the second.


Extend by ‘even symmetry’ (2nd way)

clear all; N=64;
for n=0:N-1
   x(1+n) = 100*cos((pi/4)*n) + 100*sin(1.522*n);
end;
for n = 0 : N-1
   y(1+N+n) = x(1+n);  y(N-n) = x(1+n);
end;
Y = fft(y)/N;
plot(abs(Y(1:N)),'*-'); grid on;

[Figure: plot of abs(Y(1:N)); horizontal axis 0 to 70, vertical axis 0 to 100.]

Doubles length of ‘x’ by ‘even symmetry’.

Doubles no. of freq sampling pts.

About 30% error in amplitude of higher freq sine-wave.


Extending by even symmetry (cont)

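{6 1 2 4} becomes {4 2 1 6 6 1 2 4}. DFT of the extended sequence is:

$$X[k] = 4 + 2e^{-j\Omega_k} + e^{-2j\Omega_k} + 6e^{-3j\Omega_k} + 6e^{-4j\Omega_k} + e^{-5j\Omega_k} + 2e^{-6j\Omega_k} + 4e^{-7j\Omega_k}$$

for k = 0, 1, 2, ..., 7 where $\Omega_k = 2\pi k/8$

$$= e^{-3.5j\Omega_k}\{4e^{3.5j\Omega_k} + 2e^{2.5j\Omega_k} + e^{1.5j\Omega_k} + 6e^{0.5j\Omega_k} + 6e^{-0.5j\Omega_k} + e^{-1.5j\Omega_k} + 2e^{-2.5j\Omega_k} + 4e^{-3.5j\Omega_k}\}$$

$$= 2e^{-3.5j\Omega_k}\{4\cos(3.5\Omega_k) + 2\cos(2.5\Omega_k) + \cos(1.5\Omega_k) + 6\cos(0.5\Omega_k)\}$$

$$= 2e^{-3.5j\Omega_k}\, DCT[k]$$

where

$$DCT[k] = 6\cos(0.5\Omega_k) + \cos(1.5\Omega_k) + 2\cos(2.5\Omega_k) + 4\cos(3.5\Omega_k)$$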


Discrete Cosine Transform (DCT)

• The DCT has many forms: we consider one

• Given {x[n]}0,N-1 its DCT is

$$DCT[k] = \sum_{n=0}^{N-1} x[n] \cos\!\left(\frac{(2n+1)\,\Omega_k}{2}\right) \quad \text{where } \Omega_k = k\pi/N$$

• Note that Ωk = 2πk/(2N)

• Means that freq range 0 to 2π is sampled at 2N points.

• If {x[n]}0,N-1 is real, we only need DCT[k] for k = 0 to N-1.

• Takes us from 0 to π (fS/2) in N steps.

• (FFT of same order gives only N/2 pts in range 0 to fS/2).


DCT (Cont)

• Given {x[n]}0,N-1 the DFT (FFT) of this signal extended symmetrically (front-wise) is:

$$X[k] = 2\,e^{-j(2N-1)\Omega_k/2}\, DCT[k] \quad \text{for } k = 0, 1, \ldots, 2N-1$$

For a real signal, we might only go from k=0 to k=N-1, and we may evaluate only the modulus or modulus squared. Then,

$$|X[k]| = 2\,|DCT[k]|$$

DCT gives us modulus spectrum or ESD without complex numbers.


Evaluating the DCT

Direct implementation summing cosines will be slow.

Pre-extending & performing an FFT may be faster

There are faster algorithms.

MATLAB has

X=dct(x);

Also inverse:

x = idct(X);


2-Dimensional DCT

• DCT is widely used in sound & image processing.

• For the latter, a 2D version is needed.

$$X[k, \ell] = \sum_{n=0}^{N-1} \sum_{m=0}^{M-1} x[n, m] \cos\!\left(\frac{(2n+1)\,k\pi}{2N}\right) \cos\!\left(\frac{(2m+1)\,\ell\pi}{2M}\right)$$

• Performed by ‘X=dct2(x)’ in MATLAB

• Inverse is available ‘x = idct2(X);’

• More about the 2-d DCT later.


1-dimensional DCT computed by direct method

clear all; N=64;
for n = 0:N-1
   x(1+n) = 100*cos((pi/4)*n) + 100*sin(1.522*n);
end;
for k = 0:N-1
   X(k+1) = 0;  Wk = -k*pi/(2*N);     % sign of Wk does not matter: cos is even
   for n = 0:N-1
      X(k+1) = X(k+1) + x(n+1)*cos((2*n+1)*Wk);
   end;
end;
plot(abs(X/(N/2)),'*-'); grid on;

[Figure: plot of abs(X/(N/2)); horizontal axis 0 to 70, vertical axis 0 to 100.]

Gives exactly same graph as for ‘even symmetry’ FFT method.

This is slower.

But no complex numbers!


1-d DCT by MATLAB

clear all; N=64;
for n = 0:N-1
   x(1+n) = 100*cos((pi/4)*n) + 100*sin(1.522*n);
end;
X = dct(x);
plot(abs(X/sqrt(N/2)),'*-'); grid on;

[Figure: plot of abs(X/sqrt(N/2)); horizontal axis 0 to 70, vertical axis 0 to 100.]

Gives same graph, but scaling is different. See MATLAB ‘help’ & documentation.

X(k) is scaled by √(2/N) when k>1 & √(1/N) when k=1.

A small complication!


Applications of DFT (FFT) & DCT

• FFT is the ‘Swiss army knife’ of signal processing.
• Most ‘spectrum analysers’ use an FFT algorithm.
• Some applications of spectral estimation are:
   » determining frequency & loudness of a musical note,
   » deciding whether a signal is periodic or non-periodic,
   » analysing speech to recognise spoken words,
   » looking for sine-waves buried in noise,
   » measuring frequency distribution of power or energy.


Other applications

• Many other uses in signal processing, e.g. filtering.
• To filter out a range of frequencies,
   – perform DFT or DCT,
   – set the unwanted frequency samples to zero,
   – perform inverse DFT or DCT.

• DCT used in MP3 compression to remove components we can’t hear anyway, to save bits.

• Illustrate by using the DCT & IDCT to remove the higher frequency sine-wave.


DCT to filter out sine-wave

clear all; N=64;

for n=0:N-1 x(1+n)=100*cos((pi/4)*n)+100*sin(1.522*n); end;

X = dct(x); for k=25:40 X(k) = 0; end;

y=idct(X); plot(y); grid on;

[Figure: plot of filtered signal y; horizontal axis 0 to 70, vertical axis -150 to 150.]

There’s no sign of the higher frequency sine-wave, but the ‘edge’ effects at the beginning & end need to be improved.

As we are using ‘dct’ & corresponding ‘idct’ we don’t have to worry about the funny scaling.


Use of FFT to spectrally analyse music

Next slide (Table 4 in notes) is MATLAB program which reads music from a 'wav' file, splits it up into 512 sample sections & performs a DFT (by FFT) analysis on each section.


% Table 4: MATLAB program (dmp7t4.m) to analyse a music file
clear all; N = 512;   % DFT order
[music, fs, nbits] = wavread('cap4th.wav');
L = length(music);
for frame = 10 : 20
   for n = 0 : N-1
      x(1+n) = music(1 + (frame-1)*N + n);
   end;
   figure(1); plot(x); grid on;
   X = fft(x);
   figure(2); plot(abs(X(1:N/2))); grid on;
   sound([x x x], fs, nbits);   % Play it 3 times.
   pause;                        % Press return for next frame
end;


Musical (violin) note & its mag-spectrum

[Figure: Time-domain plot (512 samples), vertical axis -0.5 to 0.4, and magnitude spectrum (freq points 0 to 300), vertical axis 0 to 35.]

Magnitude spectrum shows fundamental & harmonics.


Spectral analysis of speech

• File: OPERATOR.pcm contains sampled speech.
• SNR-12dB.pcm contains sine-wave corrupted with noise.
• Sampled at 8 kHz using 12-bit A/D converter.
• May be read into "MATLAB" program in Table 5 (next slide) & spectrally analysed using the FFT.
• Meaningless to analyse a large speech file at once.
• Divide into blocks of samples & analyse separately.
• Blocks of N (= 512) samples may be read in and displayed.
• Programs in notes available on: ~barry/mydocs/Comp30291


%MATLAB program (dmp7t5.m)for spectrally analysing speech.

clear all; N = 512 ; % DFT order

IFid=fopen('operator.pcm','rb'); speech = fread(IFid, 'int16');

H=hann(N); % samples of a Hann window of order N

for frame = 25 : 200

for n =0 : N-1

x(1+n) = speech( 1+ (frame-1)*N + n);

winx(1+n) = x(1+n)*H(1+n) ;

end; figure(1); plot(x);

X=fft(winx);

figure(2); plot( 20*log10(abs(X(1:N/2) )) ); ylim([40 100]);

sound(x/4000,8000,16); % listen to the frame

pause; % Press return for next frame

end; fclose('all');


Spectrum of a segment of voiced male speech

[Figure: Time-domain (Volts against sample no., 0 to 600) and Freq-domain (dB against freq point, 0 to 300; 40 to 100 dB).]


Spectrum of another segment of voiced male speech

[Figure: Time-domain (Volts against sample no., 0 to 600) and Freq-domain (dB against freq point, 0 to 300; 40 to 100 dB), with the formants marked on the spectrum.]


Spectrum of a segment of voiced female speech

[Figure: Time-domain (Volts against sample no., 0 to 600) and Freq-domain (dB against freq point, 0 to 300; 40 to 100 dB).]


Comments on speech graphs

• Periodicity seen in time-domain for voiced speech (vowels)

• Mag-spectrum has fundamental & many harmonics

• Can measure fundamental to determine pitch of the voice.

• Male has lower fundamental than female speech

• Unvoiced speech (consonants) has no or less periodicity.

• ‘Formants’ seen as peaks in spectral envelope.

• Caused by vocal tract resonance.

• Can determine vowel sound (a e i o u etc) from formants.

• In principle, can do speech recognition this way.

• Bit-rate compression based on same understanding.


Apply 2-d DCT to a picture

clear all;

rgbpic = imread('autumn.tif'); imview(rgbpic); % input image & display

bwpic = rgb2gray(rgbpic); % Convert to gray scale

imview(bwpic); imwrite(bwpic,'bwaut.tif','tif'); % Display & store image

BWspectrum = dct2(bwpic); % Apply dct

figure(1); imshow(log(abs(BWspectrum)),[]), colormap(jet), colorbar;

BWspectrum(abs(BWspectrum)<10) = 0.001; % Make zero if <10

figure(2); imshow(log(abs(BWspectrum)),[]), colormap(jet), colorbar;

reconpic = idct2(BWspectrum); % Apply inverse dct

imview(reconpic,[0 255]); imwrite(uint8(reconpic),'bwreconaut.tif','tif'); % Display & store reconstructed image


Original picture ‘autumn.tif’


Original & reconstructed images with DCT spectra

[Figure: Original and Reconstructed images, each with its DCT spectrum on a colour scale from -5 to 10.]


Comments on pictures

• Previous program takes a coloured picture & converts it to gray scale.

• Then takes DCT & plots 3-D mag-spectrum with colour scale.

• Notice a concentration of energy in top corner.

• We set to zero any energy values <10.

• Useful for image compression as we don’t have to code these.

• Then go back to an image via an inverse dct.

• We can see the reconstructed image & its modified spectrum (with lots of blue).

• Any perceivable loss of quality?


Windowing

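• Given {x[n]}0,N-1 = {x[0], x[1], ..., x[N-1]}

• Assumed obtained by multiplying infinite sequence {x[n]} by non-symmetric ‘rectangular window’ {r[n]}:

$$r[n] = \begin{cases} 1 & : 0 \le n < N \\ 0 & : \text{otherwise} \end{cases}$$

[Figure: r[n] against n, non-zero from n = 0 up to N.]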


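Windowing: symmetric & non-symmetric

• Defn of rect window changed from that used in Section 4.

Now, non-symmetric window (has N non-zero samples):

$$r[n] = \begin{cases} 1 & : 0 \le n < N \\ 0 & : \text{otherwise} \end{cases}$$

In Section 4, symmetric window (has M+1 non-zero samples):

$$r_{M+1}[n] = \begin{cases} 1 & : -M/2 \le n \le M/2 \\ 0 & : \text{otherwise} \end{cases}$$

[Figures: r[n] against n, non-zero from 0 up to N; r_{M+1}[n] against n, non-zero from -M/2 to M/2.]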


DTFT of non-symmetric rectangular window

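$$R(e^{j\Omega}) = \sum_{n=0}^{N-1} r[n]\, e^{-j\Omega n} = \sum_{n=0}^{N-1} e^{-j\Omega n}$$

$$= \begin{cases} N & \text{when } \Omega = 0, \pm 2\pi, \ldots \\ e^{-j\Omega(N-1)/2}\, \dfrac{\sin(N\Omega/2)}{\sin(\Omega/2)} & \text{when } \Omega \ne 0, \pm 2\pi, \pm 4\pi, \ldots \end{cases}$$

(Details of calculation omitted)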


DTFT of non-symmetric rect window

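DTFT is now not purely real because of the e^{-jΩ(N-1)/2} term.

Mostly interested in magnitude (modulus) which is:

$$|R(e^{j\Omega})| = \begin{cases} N & : \Omega = 0, \pm 2\pi, \ldots \\ \left| \dfrac{\sin(N\Omega/2)}{\sin(\Omega/2)} \right| & : \text{otherwise} \end{cases}$$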


When N is even, DTFT R(e^{jΩ}) of {r[n]} has magnitude:

[Figure: |R(e^{jΩ})| for N=10 against Ω (rads/sample, -5 to 7), vertical axis 0 to 10.]


‘Sinc-like’ spectrum of non-sym rectangular window

• Magnitude of R(e^{jΩ}) shown on previous slide when N=10.

• Note relatively narrow main lobe & side-lobes.

• Zero-crossings occur at Ω = ±2π/N, ±4π/N, etc.

• Like modulus of a ‘sinc’ function in many ways.

• Rectangular windows caused stop-band ripples with FIR filters.

• They cause problems with FFT graphs as well.

• 11th order non-sym rect window would have modulus identical to |R11(e^{jΩ})| for symmetric rect window, but different argument.


Spectral analysis of a sampled sine-wave

Consider DTFT of {cos(Ω0 n)} which exists for all time. All its energy is concentrated at ±Ω0 & it has infinite energy. Magnitude spectrum has impulses at ±Ω0 as shown below:

[Figure: X(e^{jΩ}), the DTFT of {cos(Ω0 n)}, with impulses at -Ω0 and Ω0 between -π and π.]


DTFT of rectangularly windowed sampled sine-wave

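• DTFT of {cos(Ω0 n)}0,N-1 = DTFT of {x[n].r[n]}

These 2 spectra get convolved:

$$P(e^{j\Omega}) = \frac{1}{2\pi} \int_{-\pi}^{\pi} X(e^{j\theta})\, R(e^{j(\Omega - \theta)})\, d\theta$$

[Figure: X(e^{jΩ}) with impulses at ±Ω0 between -π and π, and R(e^{jΩ}), the sinc-like spectrum of the rectangular window (vertical axis -5 to 20, Ω from -4 to 4).]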


These 2 spectra get combined (convolved)

[Figure: X(e^{jΩ}) with impulses at ±Ω0 between -π and π, and R(e^{jΩ}) for N=20 against Ω (rads/sample, -4 to 4), vertical axis -5 to 20.]


Result: DTFT of windowed sine-wave {p[n]}, N=20

[Figure: P(e^{jΩ}) against Ω (radians/sample, -4 to 4), vertical axis -2 to 10, with peaks at -Ω0 and Ω0.]


Magnitude of DTFT of windowed sine-wave {p[n]}

[Figure: |P(e^{jΩ})| against Ω (rad/sample, -4 to 4), vertical axis 0 to 10, with peaks at -Ω0 and Ω0.]


Comments on this graph

• When rectangular window is of width N samples:

• Ampl of main peak = ampl of sine-wave × N/2 (check!)

• If we divide DTFT by N/2 we get true amplitude.

• N-1 zero-crossings between 0 and π.

• 1 main peak & N-2 ripples in magnitude spectrum.

• Increasing N gives sharper peak & more ripples.


Effect of rectangular windowing on DFT :

• Effect on DTFT is frequency spreading & side lobes.
• DFT is obtained by sampling the DTFT in freq-domain.
• How does windowing with frequency sampling affect the DFT?
• Effect is rather confusing.
• Consider the analysis of cos(Ω0 n) in two cases:
   (1) where Ω0 lies between two freq sampling pts
   (2) where Ω0 lies exactly on a freq sampling pt

(Amplitude of cos is 1, but I have not divided DTFT by N/2. So DTFT peak value is N/2 = 10 rather than 1.)


Sampled DTFT of windowed sine-wave (case 1)

[Figure: |P(e^{jΩ})| against Ω (rad/sample, -4 to 4), vertical axis 0 to 10, with the DFT frequency sampling pts marked; ±Ω0 lie between sampling pts.]


Sampled DTFT of windowed sine-wave (case 2)

[Figure: |P(e^{jΩ})| against Ω (rad/sample, -4 to 4), vertical axis 0 to 10, with the DFT frequency sampling pts marked; ±Ω0 lie exactly on sampling pts.]


Effect of rectangular windowing on DFT :

Given a sine-wave of arbitrary frequency, we don’t know whether or not it will line up with one of the freq sampling pts. And the amplitude we see will be reduced by up to 35% depending on whether it lines up or not.

Example (Case 1)
• Consider 64 pt DFT of {cos(0.7363n)} in Fig 1.
• Ω0 = 0.7363 lies between Ω7 = (2π/64)*7 & Ω8 = (2π/64)*8.
• Samples of rectangular window seen.
• X[7] and X[8] strongly affected by sinusoid.


Fig 1 : Magnitude of 64 pt DFT spectrum of cos(0.7363n)

[Figure: |X(k)| against k (0 to 32), vertical axis 0 to 15 and above, with the largest samples at k = 7 & 8.]

True amplitude = 1. Measured ≈ 20/32 = 0.625 (roughly 35% error).


Example of Case 2

• Now consider 64 pt DFT of {cos(πn/4)} in Fig 2.
• Ω0 = π/4 coincides exactly with Ω8.
• Only X[8] affected.


Fig 2 : Magnitude of 64 point DFT of cos(πn/4).

[Figure: |X(k)| against k (0 to 32), with a single non-zero sample of height 32 at k = 8.]

Ampl. = 32/(N/2) = 32/32 = 1. No error.


What’s happened to the ‘sinc-like’ function in Fig 2 ?

• We don’t see it because all the samples apart from one occur at the zero-crossings of the ‘sinc-like’ function.

• This always occurs when the frequency of sine-wave coincides exactly with a frequency sampling point.


Non-rectangular windowing

Difference between Figs 1 & 2 undesirable. Use non-rectangular windows, e.g. Hann {w[n]}0,N-1 with

$$w[n] = \begin{cases} 0.5 - 0.5\cos((2n+1)\pi/N) & : 0 \le n < N \\ 0 & : \text{otherwise} \end{cases}$$

(Slightly different definition from the one we had for FIR filters).

[Figure: Hann window, w[n] (0 to 1.2) against n (0 to 25).]


Applying the window

• Given signal block {x[n]}0,N-1 & a Hann window {w[n]}0,N-1

• To apply the window just multiply the two to obtain:

{x[n].w[n]}0,N-1


Effect of Hann window in time-domain

[Figure: two plots of a windowed sine-wave, horizontal axis 0 to 140, vertical axis -100 to 100.]

Effect of rectangular window on sine-wave. Truncates suddenly at window edges.

Effect of Hann window on sine-wave. Tapers amplitudes towards zero at window edges.

Page 70: UNIVERSITY of MANCHESTER School of Computer Science Comp30291 Digital Media Processing 2009-10

Effect of rect window in frequency domain

• Previously, we analysed sinusoids of frequency Ω_0 & amplitude 10 with a rect window of width N samples.

• DTFT of the windowed sinusoid is a sinc-like function with a narrow main lobe of height 10N/2, centred on Ω_0, which drops to zero within 2π/N on either side. Its width is therefore 4π/N.

• The DFT samples this main lobe at intervals of 2π/N.

• With luck, one sample falls right at the centre & the two on either side fall on zeros.

• Otherwise we miss the centre & get a lower amplitude.

Page 71: UNIVERSITY of MANCHESTER School of Computer Science Comp30291 Digital Media Processing 2009-10

DFT sampling main lobe of rectangular window

[Figure: main lobe of height 10N/2 centred on Ω_0, shown twice with DFT sampling points.]

Discrepancy of up to 35% can occur because the main lobe is so narrow.

Page 72: UNIVERSITY of MANCHESTER School of Computer Science Comp30291 Digital Media Processing 2009-10

[Figure: Hann window of order 20 compared with rectangular. W(exp(jω)) against frequency in radians/sample (−3.14 to 3.14); vertical axis −10 to 25; curves labelled Rect, R2, R3 & Hann.]

Effect of Hann window in frequency-domain

Take its DTFT & compare with that of rect window. (Amplitudes are doubled in graph.)

Page 73: UNIVERSITY of MANCHESTER School of Computer Science Comp30291 Digital Media Processing 2009-10

Hann window in frequency-domain:
(i) broader main lobe whose width is approx doubled;
(ii) main-lobe amplitude approx halved to 10N/4;
(iii) reduced side-lobe levels.

[Figure: main lobe of height 10N/4 centred on Ω_0, shown twice with DFT sampling points.]

• At least 3 samples per main lobe now, even when it centres on a freq sampling pt.

• Up to 35% amplitude variation with ‘rect’ now reduces to at most about 15%.

• Often take the highest of the 3 peaks as the amplitude of the sine-wave.
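
The 35% and 15% figures can be checked with a short experiment. This is a minimal Python sketch, assuming the Hann definition w[n] = 0.5 − 0.5 cos((2n+1)π/N) and a cosine of amplitude 10. The function names, N = 64 and the choice of bin 8 versus the half-way point 8.5 are illustrative assumptions. The amplitude is estimated from the highest DFT peak, scaled by N/2 for the rectangular window and N/4 for Hann.

```python
import cmath
import math

def dft_mag(x):
    """Magnitudes |X[k]| of a naive N-point DFT for k = 0 .. N/2."""
    N = len(x)
    return [abs(sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N)))
            for k in range(N // 2 + 1)]

def estimate_amplitude(bin_pos, use_hann, N=64, A=10.0):
    """Estimate the amplitude of A*cos(2*pi*bin_pos*n/N) from the highest DFT peak."""
    x = [A * math.cos(2 * math.pi * bin_pos * n / N) for n in range(N)]
    if use_hann:
        x = [xn * (0.5 - 0.5 * math.cos((2 * n + 1) * math.pi / N))
             for n, xn in enumerate(x)]
        scale = N / 4      # Hann: main-lobe height is A*N/4
    else:
        scale = N / 2      # rectangular: main-lobe height is A*N/2
    return max(dft_mag(x)) / scale

for name, use_hann in (("rect", False), ("Hann", True)):
    centred = estimate_amplitude(8.0, use_hann)    # sine-wave exactly on bin 8
    between = estimate_amplitude(8.5, use_hann)    # sine-wave half-way between bins
    print(f"{name}: on-bin {centred:.2f}, between bins {between:.2f}, "
          f"drop {100 * (1 - between / centred):.0f}%")
```

The half-way case is the worst one for both windows, since the nearest DFT sample is then as far from the centre of the main lobe as it can be.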

Page 74: UNIVERSITY of MANCHESTER School of Computer Science Comp30291 Digital Media Processing 2009-10

Other non-rectangular windows

Hamming: very similar to Hann but a little better.

Kaiser: offers a range of options from rectangular, through Hamming, towards windows with even lower ripples & broader main lobes.

MATLAB command: KW = kaiser(N,beta) produces a Kaiser window array of length N for any value of beta (β) ≥ 0.

When β = 0, this is a rectangular window.

When β = 5.4414, we get a close approximation to a Hamming window.

Increasing β further gives further reduced ripples & a broader main lobe.

Blackman, flat-top, Bartlett, and many more exist.
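
To show what the β parameter does, here is a minimal Python sketch of a Kaiser window built from the standard formula w[n] = I0(β√(1 − r²)) / I0(β), with r running from −1 to 1 across the window. It is an illustration under that assumption, not the MATLAB implementation; the helper names and the 40-term series for the Bessel function I0 are choices made here.

```python
import math

def bessel_i0(z, terms=40):
    """Zeroth-order modified Bessel function via its series sum_m ((z/2)^m / m!)^2."""
    total, term = 1.0, 1.0
    for m in range(1, terms):
        term *= (z / 2) / m
        total += term * term
    return total

def kaiser_window(N, beta):
    """Length-N Kaiser window (N > 1): I0(beta*sqrt(1 - r^2)) / I0(beta)."""
    half = (N - 1) / 2
    return [bessel_i0(beta * math.sqrt(max(0.0, 1 - ((n - half) / half) ** 2)))
            / bessel_i0(beta) for n in range(N)]

rect_like = kaiser_window(21, 0.0)       # beta = 0: every sample equals 1
tapered = kaiser_window(21, 5.4414)      # beta = 5.4414: tapers towards the edges

print("beta = 0      :", rect_like[0], rect_like[10])
print("beta = 5.4414 :", round(tapered[0], 4), tapered[10])
```

Larger β makes the edge samples smaller, which is what lowers the ripples at the cost of a broader main lobe.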

Page 75: UNIVERSITY of MANCHESTER School of Computer Science Comp30291 Digital Media Processing 2009-10

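Demonstrate effect of Hann window on DFT

The MATLAB program below analyses 2 sine-waves as before with & without Hann window.

clear all;
N = 64;
H = hann(N);                        % Hann window of length N (function name is lower-case)
for n = 0:N-1
    x(1+n)  = 100*sin((pi/4)*n) + 100*sin(1.522*n);   % two sine-waves of amplitude 100
    wx(1+n) = x(1+n)*H(1+n);                          % Hann-windowed signal
end;
figure(1); plot(x);
figure(2); plot(wx);
X = fft(x)/(N/2);                   % scale so an on-bin sine-wave shows its amplitude
figure(3); plot(abs(X(1:N/2)),'*-'); grid on;     % rectangular window
WX = fft(wx)/(N/2);
figure(4); plot(abs(WX(1:N/2)),'*-'); grid on;    % Hann window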

Page 76: UNIVERSITY of MANCHESTER School of Computer Science Comp30291 Digital Media Processing 2009-10

Effect of Hann window

[Figure: four plots produced by the MATLAB program. Signal x and Hann-windowed signal wx against sample index (0 to 70); DFT magnitudes against k (0 to 35) for the rectangular window (vertical axis 0 to 120) and the Hann window (vertical axis 0 to 50).]

Page 77: UNIVERSITY of MANCHESTER School of Computer Science Comp30291 Digital Media Processing 2009-10

• Reduction in amplitude estimation error for sine-waves is at expense of some loss of spectral resolution

• With Hann window, 3 bins strongly affected by one sine-wave

• We will only know that the frequency of the sine-wave lies within the range of these 3 frequencies.

• Amplitude is halved, so we should scale by a factor of two.

Effect of Hann window (summary)

Page 78: UNIVERSITY of MANCHESTER School of Computer Science Comp30291 Digital Media Processing 2009-10

Energy of {x[n]}_{0,N-1} & Parseval’s Theorem

• Energy of {x[n]}_{0,N-1} is:

$$ E = \sum_{n=0}^{N-1} (x[n])^2 $$

If we converted {x[n]}_{0,N-1} to an analogue voltage & applied it to a 1 Ohm heater or loudspeaker, we would obtain E Joules.

Parseval's Theorem for the DFT: It may be shown that for a real signal segment {x[n]}_{0,N-1}:

$$ \sum_{n=0}^{N-1} (x[n])^2 = \frac{1}{N} \sum_{k=0}^{N-1} |X[k]|^2 $$
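
Parseval's Theorem for the DFT is easy to confirm numerically. A minimal Python sketch with a naive DFT follows; the test signal reuses the two sine-waves from the earlier MATLAB program, and the variable names are illustrative.

```python
import cmath
import math

def dft(x):
    """Naive N-point DFT: X[k] = sum_n x[n] * exp(-j*2*pi*k*n/N)."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
            for k in range(N)]

N = 64
x = [100 * math.sin(math.pi * n / 4) + 100 * math.sin(1.522 * n) for n in range(N)]
X = dft(x)

energy_time = sum(v * v for v in x)                 # sum of (x[n])^2
energy_freq = sum(abs(v) ** 2 for v in X) / N       # (1/N) * sum of |X[k]|^2

print("time-domain energy     :", round(energy_time, 6))
print("frequency-domain energy:", round(energy_freq, 6))
```

The two values agree to within floating-point rounding.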

Page 79: UNIVERSITY of MANCHESTER School of Computer Science Comp30291 Digital Media Processing 2009-10

• Parseval’s Theorem allows energy to be calculated in frequency-domain instead of in time-domain.

• Allows us to see how energy is distributed in frequency.

• Is there more energy at high frequencies than at low frequencies or vice-versa?

Energy & Parseval’s Thm (cont)

Page 80: UNIVERSITY of MANCHESTER School of Computer Science Comp30291 Digital Media Processing 2009-10

Spectral analysis of 'power signals' {x[n]}

• A ‘power signal’ {x[n]} exists for all time & has infinite energy.

• Its ‘power’ (Watts) is its mean square value.

• Extract a segment {x[n]}_{0,N-1} to represent {x[n]}.

• Energy of {x[n]}_{0,N-1} is:

$$ E = \sum_{n=0}^{N-1} (x[n])^2 $$

• Mean square value (MSV) of this segment is (1/N) E.

• This is the “power” of a periodic discrete-time signal, of period N samples, for which a single cycle is {x[n]}_{0,N-1}.

• It may be used as an estimate of the power of {x[n]}.

Page 81: UNIVERSITY of MANCHESTER School of Computer Science Comp30291 Digital Media Processing 2009-10

• By Parseval's thm, the estimate of the power of {x[n]}, obtained by analysing {x[n]}_{0,N-1}, is:

$$ (1/N) \sum_{n=0}^{N-1} (x[n])^2 = (1/N^2) \sum_{k=0}^{N-1} |X[k]|^2 $$

• Usefulness of this estimate illustrated by following example.

Page 82: UNIVERSITY of MANCHESTER School of Computer Science Comp30291 Digital Media Processing 2009-10

Example: A real periodic signal {x[n]} is rect windowed to give {x[n]}_{0,39}. A 40-point DFT gives the magnitude spectrum below. Estimate the power of {x[n]} & comment on the reliability of the estimate. If {x[n]} is passed thro’ an ideal digital low-pass filter with cut-off π/2 radians/sample, how is the power likely to be affected?

[Figure: |X[k]| (vertical axis 0 to 40) against k (k = 0 to 20).]

Page 83: UNIVERSITY of MANCHESTER School of Computer Science Comp30291 Digital Media Processing 2009-10

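Ans: MSV of {x[n]}_{0,39} ≈ power of {x[n]} = (1/1600)[2·40² + 2·30² + 2·20² + 2·10²] = 3.75 watts. The filter removes the components of magnitude 20 & 10, so the power reduces to (1/1600)[2·40² + 2·30²] = 3.125 watts, i.e. by about 0.8 dB.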

• Care needed to interpret such results as power estimates.

• For periodic or deterministic (non-random) signals: estimates from segments extracted from different parts of {x[n]} may be similar, & estimates could be fairly reliable.

• For random signals : may be considerable variation from estimate to estimate. Averaging may be necessary.
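
The arithmetic of the answer can be reproduced in a few lines. This minimal Python sketch uses only the four magnitudes quoted in the answer. Which of them lie below the filter cut-off is inferred from the quoted result of 3.125 watts, since the spectrum plot itself is not reproduced here; the list and function names are illustrative.

```python
import math

N = 40
# Each magnitude appears twice in the full DFT (at k and at N - k)
# because the signal is real.
kept_by_filter = [40, 30]       # inferred: below the cut-off
removed_by_filter = [20, 10]    # inferred: above the cut-off

def power(mags, N):
    """Power estimate (1/N^2) * sum |X[k]|^2, counting each magnitude twice."""
    return sum(2 * m ** 2 for m in mags) / N ** 2

p_before = power(kept_by_filter + removed_by_filter, N)
p_after = power(kept_by_filter, N)
change_db = 10 * math.log10(p_after / p_before)

print(p_before, p_after, round(change_db, 2))   # prints 3.75 3.125 -0.79
```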

Page 84: UNIVERSITY of MANCHESTER School of Computer Science Comp30291 Digital Media Processing 2009-10

Power spectral density (PSD) estimate

• For an N-point DFT, |X[k]|² / N² is an estimate of the PSD in Watts per “bin”.

• A “bin” is a bandwidth of 2π/N radians/sample centred on Ω_k

( f_S / N Hz centred on k f_S / N )

• Instead of |X[k]|, often plot 10 log10(|X[k]|²/N²) dB against k.

• This is the “PSD estimate” graph.

• Careful with random signals: each spectral estimate is different.

• Can average several PSD estimates.
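
A minimal Python sketch of averaging several PSD estimates for a random signal. The Gaussian test signal, the segment length N = 32, the number of segments and the function names are all illustrative assumptions; a small floor is applied before taking logs so that an empty bin cannot give log(0).

```python
import cmath
import math
import random

def psd_estimate(segment):
    """|X[k]|^2 / N^2 for k = 0 .. N-1: power estimate in Watts per bin."""
    N = len(segment)
    return [abs(sum(segment[n] * cmath.exp(-2j * math.pi * k * n / N)
                    for n in range(N))) ** 2 / N ** 2
            for k in range(N)]

def to_db(p, floor=1e-12):
    """10*log10(p), with a small floor to avoid log(0)."""
    return 10 * math.log10(max(p, floor))

random.seed(0)                       # fixed seed so the sketch is repeatable
N, segments = 32, 8
signal = [random.gauss(0.0, 1.0) for _ in range(N * segments)]   # hypothetical random signal

# One PSD estimate per segment, then average bin by bin.
estimates = [psd_estimate(signal[i * N:(i + 1) * N]) for i in range(segments)]
averaged = [sum(e[k] for e in estimates) / segments for k in range(N)]
psd_db = [to_db(p) for p in averaged]

print("total power from averaged PSD:", round(sum(averaged), 4))
print("mean square value of signal  :", round(sum(v * v for v in signal) / len(signal), 4))
```

Summing the averaged PSD over all bins gives the mean square value of the signal, as Parseval's Theorem requires, while the individual bins vary much less from run to run than a single estimate would.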