Acoustic impulse response measurement using speech and music signals John Usher Barcelona Media –...

Acoustic impulse response measurement using speech and music signals

John Usher

Barcelona Media – Innovation Centre | Av. Diagonal, 177, planta 9, 08018 Barcelona

John Usher -- In-situ RIR measurement using music and speech

2

Using adaptive filters to estimate acoustic IRs

· In-situ acquisition of electro-acoustic IR, with audience.

· Continuous: fast enough for changing environment conditions.

· Use speech and music signal radiated from loudspeaker.

AF for IR is nothing new! Used for:

· Acoustic echo and feedback cancellation.

· Upmixing (2 → 5.1, 2 → 3D).

· ANC.

· Room EQ (using noise).


3

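Basic principle:

[Block diagram: the audio source (voice or music) feeds both the loudspeaker (LS) and the adaptive filter (AF = h~). The filter output is subtracted from the microphone (Mic.) signal at the summing node (∑) to give the error signal, which drives the AF update.]

The adaptive filter is updated to model the acoustic IR so that the error signal level (power) is minimized.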


4

Application for room EQ (filtered-x)

[Block diagram with the blocks: audio source (voice or music), EQ filter, RIR estimation (h~), TD and FD smoothing, Homo. Deco., filter inversion.]
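The slide names a filter-inversion step but gives no details. As a rough illustration only, the sketch below inverts an estimated IR in the frequency domain with a small regularization term so that deep notches are not boosted without limit. The function name, the `beta` parameter and the toy IR are assumptions made for this example. The smoothing and "Homo. Deco." stages from the diagram are not modelled.

```python
import numpy as np

def regularized_inverse_response(h_est, n_fft, beta=1e-3):
    """Frequency response of a regularized inverse of h_est (hypothetical helper).

    G = conj(H) / (|H|^2 + beta); beta limits the gain where |H| is small.
    """
    H = np.fft.rfft(h_est, n_fft)
    G = np.conj(H) / (np.abs(H) ** 2 + beta)
    return H, G

# Toy estimated IR (assumption): direct sound plus one early reflection.
h_est = np.array([1.0, 0.5])
H, G = regularized_inverse_response(h_est, n_fft=512)

# Magnitude of the equalized response; close to 1 wherever |H|^2 >> beta.
equalized_mag = np.abs(H * G)
```

A larger `beta` trades flatness of the equalized response for lower maximum EQ gain.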


5

Localizing objects in a room

· Emit speech warning from loudspeaker in room.

· Extract RIR using adaptive filter.

· Detect reflection onset timing, e.g. using running kurtosis.
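A minimal sketch of the running-kurtosis idea, assuming a simple sliding-window excess kurtosis and a fixed threshold. The window length, the threshold and the toy IR (two isolated spikes in low-level noise) are illustrative assumptions, not values from the talk. Real reflections are rarely this clean.

```python
import numpy as np

def running_kurtosis(h, win):
    """Excess kurtosis of h over every sliding window of `win` samples."""
    windows = np.lib.stride_tricks.sliding_window_view(h, win)
    centered = windows - windows.mean(axis=1, keepdims=True)
    m2 = np.mean(centered ** 2, axis=1)
    m4 = np.mean(centered ** 4, axis=1)
    return m4 / np.maximum(m2 ** 2, 1e-30) - 3.0

def reflection_onsets(h, win=64, threshold=20.0):
    """Sample indices where the running kurtosis first rises above threshold."""
    k = running_kurtosis(h, win)
    above = k > threshold
    rising = np.flatnonzero(above[1:] & ~above[:-1]) + 1
    if above[0]:
        rising = np.concatenate(([0], rising))
    # A window starting at index i ends at i + win - 1; the newest sample
    # in the window is the one that triggered the rise.
    return rising + win - 1

# Toy IR (assumption): noise floor plus two impulsive arrivals.
rng = np.random.default_rng(1)
toy_ir = 0.01 * rng.standard_normal(1000)
toy_ir[100] += 1.0
toy_ir[400] += 0.5
onsets = reflection_onsets(toy_ir)
```

Dividing an onset index by the sample rate gives the arrival time. Multiplying that time by the speed of sound gives a path length.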


6

Application for live sound: De-noising & spatial re-mixing

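[Block diagram: the audio source (voice or music) feeds the loudspeaker (LS) and the adaptive filter (AF = h~), which is updated by NLMS. The filter output is subtracted from the microphone (Mic.) signal at the summing node (∑) to give the error signal. Signal labels: clean audio signal (from the desk), room signal, audience signal (applause etc.).]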


7

Filter update algorithm (NLMS):

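[Block diagram: x(n) feeds the loudspeaker (LS) and the adaptive filter h~(n). The filter output y(n) is subtracted from the microphone (Mic.) signal at the summing node (∑) to give the error e(n), which drives the update. The update is annotated as two numbered steps, 1. and 2.]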
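The slide's own update equations are not reproduced here. The sketch below uses the common textbook form of NLMS and is not necessarily the exact variant used in the talk: compute the filter output y(n), form the error e(n) against the microphone signal, then update the coefficients with a step normalized by the input power. The step size `mu`, the regularizer `eps` and the toy IR are assumptions.

```python
import numpy as np

def nlms_identify(x, d, num_taps, mu=0.5, eps=1e-8):
    """Estimate an FIR impulse response from source x and mic signal d."""
    h = np.zeros(num_taps)        # current IR estimate
    x_buf = np.zeros(num_taps)    # most recent source samples, newest first
    e = np.zeros(len(x))
    for n in range(len(x)):
        x_buf = np.roll(x_buf, 1)
        x_buf[0] = x[n]
        y = h @ x_buf             # filter output y(n)
        e[n] = d[n] - y           # error e(n) = mic signal minus filter output
        h += mu * e[n] * x_buf / (x_buf @ x_buf + eps)  # normalized update
    return h, e

# Toy example (assumption): white-noise source through a 4-tap "room".
rng = np.random.default_rng(0)
true_ir = np.array([1.0, 0.5, -0.3, 0.1])
x = rng.standard_normal(5000)
d = np.convolve(x, true_ir)[: len(x)]
h_est, err = nlms_identify(x, d, num_taps=4)
```

With speech or music as `x`, convergence is slower than with white noise because the input is spectrally coloured. A real room needs thousands of taps rather than four.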


9

Small-room experiment set-up:

A. Source is loudspeaker reproducing noise, speech or music. Multichannel noise signal (white noise or babble) from loudspeakers.

B. Source is live spoken voice. Predict IR between two lav. mics (Lav. 1, Lav. 2).


10

Results

Error criterion:

1) Start with reference RIR (measured using swept-sine technique).

2) Allow adaptive filter to converge for 10 seconds to get AF spectra.

3) Calculate misalignment: mean of difference between the ref. and AF spectra (80 Hz to 12 kHz).
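One plausible reading of this misalignment measure is sketched below: the mean absolute difference, in dB, between the magnitude spectra of the reference RIR and the adaptive filter over 80 Hz to 12 kHz. The use of dB and of the absolute difference are assumptions, as are the function name and the toy signals.

```python
import numpy as np

def spectral_misalignment_db(h_ref, h_est, fs, f_lo=80.0, f_hi=12000.0, n_fft=None):
    """Mean absolute dB difference between two magnitude spectra in a band."""
    n_fft = n_fft or max(len(h_ref), len(h_est))
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / fs)
    band = (freqs >= f_lo) & (freqs <= f_hi)
    tiny = 1e-12  # avoids log of zero
    ref_db = 20 * np.log10(np.abs(np.fft.rfft(h_ref, n_fft))[band] + tiny)
    est_db = 20 * np.log10(np.abs(np.fft.rfft(h_est, n_fft))[band] + tiny)
    return float(np.mean(np.abs(ref_db - est_db)))

# Toy check (assumption): exponentially decaying noise as a stand-in RIR.
fs = 48000
rng = np.random.default_rng(0)
h_ref = rng.standard_normal(4800) * np.exp(-np.arange(4800) / 800.0)
same = spectral_misalignment_db(h_ref, h_ref, fs)
doubled = spectral_misalignment_db(h_ref, 2.0 * h_ref, fs)
```

A pure gain error of a factor of two shows up as about 6 dB of misalignment, so in practice the two responses would be level-matched first.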


11

Rate of Convergence


14

Comparison of filter spectra using noise, speech and music (high SNR):


15

Robustness to SNR (25, 12, 3 dB SNR):

Masker = noise.


16

Robustness to SNR:

Masker = babble.


17

Comparison with DCFFT:

Dual Channel FFT method:

Following AES reviewer recommendation, compared with commercial DCFFT system (“SMAART”).
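For orientation, a generic dual-channel FFT estimate can be sketched as the ratio of the averaged cross-spectrum between source and microphone to the averaged auto-spectrum of the source (the H1 estimator). This is a textbook form built on SciPy's `csd` and `welch`, not the commercial system's implementation. The segment length and toy signals are assumptions.

```python
import numpy as np
from scipy import signal

def dual_channel_fft_estimate(x, y, fs, nperseg=1024):
    """H1 transfer-function estimate from source x to microphone y."""
    f, p_xy = signal.csd(x, y, fs=fs, nperseg=nperseg)    # averaged cross-spectrum
    _, p_xx = signal.welch(x, fs=fs, nperseg=nperseg)     # averaged auto-spectrum
    return f, p_xy / p_xx

# Toy example (assumption): white noise through a 2-tap system.
fs = 48000
rng = np.random.default_rng(0)
x = rng.standard_normal(2 ** 16)
y = signal.lfilter([1.0, 0.5], [1.0], x)
f, h_dcfft = dual_channel_fft_estimate(x, y, fs)
h_true = 1.0 + 0.5 * np.exp(-2j * np.pi * f / fs)
```

Unlike the sample-by-sample NLMS update, this estimate is formed from block averages, so its tracking speed depends on the segment length and the number of averages.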


18

Comparison of NLMS vs DCFFT:


19

Effectiveness of AF RIR acquisition method with long RIRs.

6 RIRs:

Obtained from Dirac fed into Altiverb.

(NB: No background noise simulated.)

Football stadium, Caen Cathedral, church, EMT plate, Filmorch. Stage Berlin, Castle.

RT60: 9.6-1.1 secs.

1.2, 2.3, 3.5, 6.0, 7.8, 9.6.


20

What happens if we just model the early part of the IR?

… Not much: most of the spectral detail is in the early part.

For longer IRs, the adaptive filter should be longer.
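As a rough supporting calculation, assume an ideal exponential decay, which is a simplification; real rooms deviate from it. The fraction of the IR energy that falls inside a filter of a given length then follows directly from the definition of RT60 as a 60 dB drop. Energy is not the same thing as spectral detail, so this only indicates why the early part dominates. The function name is hypothetical.

```python
def early_energy_fraction(filter_len_s, rt60_s):
    """Fraction of total IR energy in the first filter_len_s seconds.

    Assumes an ideal exponential decay: the remaining energy after t seconds
    is 10 ** (-6 * t / RT60) of the total (60 dB down at t = RT60).
    """
    return 1.0 - 10.0 ** (-6.0 * filter_len_s / rt60_s)

# Filter covering the first 1/8th of the decay time.
fraction_eighth = early_energy_fraction(1.0, 8.0)

# 340 ms filter against the six RT60 values listed for the long RIRs.
fractions = {rt: early_energy_fraction(0.34, rt) for rt in (1.2, 2.3, 3.5, 6.0, 7.8, 9.6)}
```

Under this assumption a filter spanning one eighth of RT60 already covers roughly 82% of the energy.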

[Plot annotation: longer RT.]


24

Rate of Convergence for different RTs. 340 ms window, 32 x overlap.

[Plot annotation: longer RT.]


25

Conclusions:

RIR acquisition for small and large rooms:

· Adaptive filter updated using NLMS and overlapped window.

· Tested with RT60 = 0.5-10 secs.

· Using music, speech and noise as excitation signals.

· Less accurate using live voice and two mics.

· Convergence in <3 sec. (<2 dB mean error).

· Little change in performance with SNRs down to 0 dB.


26

Conclusions:

Music vs speech:

· Music: AF matches RIR 60 Hz to 12 kHz.

· Speech: AF matches RIR 100 Hz to 8 kHz.

No considerable improvement for filter sizes >340 ms, i.e. we only need to model the first 1/8th of the RIR to have a good approximation of the spectrum.

Adaptive whitening algorithm (LPC residuals) can speed up convergence for highly coloured signals, but only at low SNRs.
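To illustrate what whitening by LPC residuals means, the sketch below fits a linear predictor to a coloured signal (autocorrelation method) and keeps the prediction error, which has a much flatter spectrum. It is a fixed, block-based version, not the adaptive algorithm from the talk. How the residuals feed the NLMS update is not shown. The LPC order, the helper names and the toy AR(1) signal are assumptions.

```python
import numpy as np
from scipy.linalg import solve_toeplitz
from scipy.signal import lfilter, welch

def lpc_residual(x, order=12):
    """Prediction-error (whitened) signal from an order-`order` LPC fit."""
    x = np.asarray(x, dtype=float)
    # Biased autocorrelation for lags 0..order.
    r = np.array([x[: len(x) - k] @ x[k:] for k in range(order + 1)]) / len(x)
    # Solve the symmetric Toeplitz normal equations for the predictor.
    a = solve_toeplitz(r[:order], r[1 : order + 1])
    # Prediction-error filter A(z) = 1 - sum_k a_k z^-k.
    return lfilter(np.concatenate(([1.0], -a)), [1.0], x)

def spectral_flatness(x, nperseg=256):
    """Geometric over arithmetic mean of the power spectrum (1 = white)."""
    _, p = welch(x, nperseg=nperseg)
    p = p[1:-1]  # drop DC and Nyquist bins
    return float(np.exp(np.mean(np.log(p))) / np.mean(p))

# Toy coloured signal (assumption): white noise through a one-pole low-pass.
rng = np.random.default_rng(0)
coloured = lfilter([1.0], [1.0, -0.9], rng.standard_normal(20000))
residual = lpc_residual(coloured, order=12)
flat_in = spectral_flatness(coloured)
flat_out = spectral_flatness(residual)
```

For speech or music the predictor would need to be refitted frame by frame, since the signal's spectrum changes over time.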


27

Applications:

· In-situ continuous room EQ using filtered-x approach.

· Object localization using speech message (e.g. using running kurtosis).

· Re-mixing live music: ambient sound separation using filter output and error signal (e.g. get clean signal + room ambiance + audience applause).


28

Cheers!

John Usher

