[IEEE 2014 4th Joint Workshop on Hands-free Speech Communication and Microphone Arrays (HSCMA) -...

NEAR-FIELD SOURCE LOCALIZATION USING SPHERICAL MICROPHONE ARRAY

Lalan Kumar, Kushagra Singhal, and Rajesh M Hegde

Indian Institute of Technology Kanpur

{rhegde,lalank}@iitk.ac.in

ABSTRACT

Source localization using spherical microphone arrays has received attention due to the ease of array processing in the spherical harmonics (SH) domain with no spatial ambiguity. In this paper, we address the issue of near-field source localization using a spherical microphone array. In particular, three methods that jointly estimate the range and bearing of multiple sources in the spherical array framework are proposed. Two subspace-based methods, called the Spherical Harmonic MUltiple SIgnal Classification (SH-MUSIC) and the Spherical Harmonics MUSIC-Group Delay (SH-MGD), are first presented for near-field source localization. Additionally, a method for near-field source localization using the Spherical Harmonic MVDR (SH-MVDR) is formulated. Experiments on near-field source localization are conducted using a spherical microphone array at various SNRs. The SH-MGD resolves closely spaced sources better than the other methods.

Index Terms— MUSIC, Spherical Harmonics, Near-

field, Group delay

1. INTRODUCTION

Spherical microphone array processing has been a growing

area of research in the last decade [1, 2]. This is primarily

because of the relative ease with which array processing can

be performed in the spherical harmonics (SH) domain without

any spatial ambiguity [3].

Various algorithms have been proposed for far-field source localization using spherical microphone arrays. The Estimation of Signal Parameters via Rotational Invariance Techniques (ESPRIT) [4] algorithm is extended to spherical arrays in [5]. Multiple SIgnal Classification (MUSIC) [6] is implemented in terms of spherical harmonics in [7]. In [8], room acoustics analysis is presented using a spherical array, based on SH-MUSIC in the frequency domain. All these source localization methods deal with the planar wavefronts of far-field sources. However, in applications such as Close Talk Microphone (CTM) and video conferencing, the planar wavefront assumption is no longer valid. In [9], the design of a low order spherical microphone array is proposed to acquire sound from near-field sources. The near-field criterion for spherical arrays is discussed in [10]. However, spherical arrays have not been utilized for near-field source localization. In [11], a 2-Dimensional (2D) MUSIC spectrum is presented for multiple near-field sources using a Uniform Linear Array (ULA). In this work, we propose a 3D SH-MUSIC spectrum for range and bearing (elevation, azimuth) estimation of multiple near-field sources. MVDR [12] and MUSIC-Group Delay (MGD) spectra [13–16] are also studied for near-field source localization using a spherical microphone array. The primary contribution of this work is the proposal of novel methods for near-field source localization in the spherical harmonics domain.

This work was funded by the DST project EE/SERB/20130277. The author L. Kumar was supported by TCS Research Scholarship Program TCS/CS/20110191.

The rest of the paper is organized as follows. In Section 2, the signal model in the spherical harmonics domain is presented. The near-field criterion is discussed, followed by the development of the SH-MUSIC, SH-MGD and SH-MVDR methods. The proposed methods are evaluated in Section 3. Section 4 concludes the paper.

2. NEAR-FIELD SOURCE LOCALIZATION USING

SPHERICAL MICROPHONE ARRAY

In this section, a mathematical derivation of the 3-Dimensional MUSIC spectrum is presented using spherical harmonics for near-field sources. The SH-MUSIC utilizes the magnitude spectrum. However, the magnitude spectrum degrades under adverse conditions such as low SNR, reverberation and closely spaced sources. In [16], high resolution source localization based on the MUSIC-Group delay spectrum over a ULA has been proposed. The method is non-trivially extended to planar arrays in [14, 15] and to spherical arrays in [13]. In all these works, far-field sources were considered. In this work, the group delay spectrum in the spherical harmonics domain is developed for range and bearing estimation. Beamforming-based SH-MVDR is also formulated for near-field source localization.

2.1. Signal processing in Spherical Harmonics domain

A spherical microphone array of order N with radius r and

number of sensors I is considered. A sound field of spherical-

waves with wavenumber k from L near-field sources is in-

cident on the array. The lth source location is denoted by

$\mathbf{r}_l = (r_l, \Psi_l)$, where $\Psi_l = (\theta_l, \phi_l)$. The elevation angle $\theta$ is measured down from the positive $z$ axis, while the azimuthal angle $\phi$ is measured counterclockwise from the positive $x$ axis. Similarly, the $i$th sensor location is given by $\mathbf{r}_i = (r, \Phi_i)$, where $\Phi_i = (\theta_i, \phi_i)$.

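In the spatial domain, the sound pressure at the $I$ microphones, $\mathbf{p}(k) = [p_1(k), p_2(k), \ldots, p_I(k)]^T$, is written as

$$\mathbf{p}(k) = \mathbf{V}(k)\mathbf{s}(k) + \mathbf{n}(k) \tag{1}$$

where $\mathbf{V}(k)$ is the $I \times L$ steering matrix, $\mathbf{s}(k)$ is the $L \times 1$ vector of signal amplitudes, $\mathbf{n}(k)$ is the $I \times 1$ vector of zero-mean, uncorrelated sensor noise, and $(\cdot)^T$ denotes the transpose. The steering matrix $\mathbf{V}(k)$ is expressed as

$$\mathbf{V}(k) = [\mathbf{v}_1(k), \mathbf{v}_2(k), \ldots, \mathbf{v}_L(k)], \quad \text{where} \tag{2}$$

$$\mathbf{v}_l(k) = \left[\frac{e^{-jk|\mathbf{r}_1 - \mathbf{r}_l|}}{|\mathbf{r}_1 - \mathbf{r}_l|}, \ldots, \frac{e^{-jk|\mathbf{r}_I - \mathbf{r}_l|}}{|\mathbf{r}_I - \mathbf{r}_l|}\right]^T \tag{3}$$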
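As an illustration of Equations 2 and 3, the spatial-domain steering matrix can be sketched in a few lines of NumPy. The function names, the four-microphone layout and the wavenumber below are assumptions made for this example only; they are not the array geometry used in the paper.

```python
import numpy as np

def sph_to_cart(r, theta, phi):
    """Convert (range, elevation from +z, azimuth from +x) to Cartesian."""
    return np.stack([r * np.sin(theta) * np.cos(phi),
                     r * np.sin(theta) * np.sin(phi),
                     r * np.cos(theta)], axis=-1)

def near_field_steering(k, mic_xyz, src_xyz):
    """Steering matrix V(k) of Equations 2-3.

    mic_xyz: (I, 3) microphone positions, src_xyz: (L, 3) source positions.
    Returns an (I, L) complex matrix with entries exp(-j k d) / d,
    where d is the microphone-to-source distance.
    """
    d = np.linalg.norm(mic_xyz[:, None, :] - src_xyz[None, :, :], axis=-1)
    return np.exp(-1j * k * d) / d

# Hypothetical layout: 4 microphones on a 4.2 cm sphere and 2 near-field sources.
mics = sph_to_cart(0.042,
                   np.array([np.pi / 2, np.pi / 2, 0.0, np.pi]),
                   np.array([0.0, np.pi / 2, 0.0, 0.0]))
srcs = sph_to_cart(np.array([0.4, 0.5]),
                   np.radians([60.0, 55.0]),
                   np.radians([30.0, 35.0]))
V = near_field_steering(k=10.0, mic_xyz=mics, src_xyz=srcs)   # shape (4, 2)
```

Each column of `V` is one $\mathbf{v}_l(k)$; the $1/d$ amplitude decay and the distance-dependent phase are what distinguish the spherical-wave model from a plane-wave steering vector.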

Denoting the acoustic pressure on the surface of the sphere by $p(k, r, \theta, \phi)$, the Spherical Fourier Transform (SFT) and its inverse are defined by [17]

$$p_{nm}(k, r) = \int_{0}^{2\pi}\int_{0}^{\pi} p(k, r, \theta, \phi)\,[Y_n^m(\theta, \phi)]^{*} \sin(\theta)\, d\theta\, d\phi \tag{4}$$

$$p(k, r, \theta, \phi) = \sum_{n=0}^{\infty}\sum_{m=-n}^{n} p_{nm}(k, r)\, Y_n^m(\theta, \phi) \tag{5}$$

where $Y_n^m(\theta, \phi)$ is the spherical harmonic of order $n$ and degree $m$, defined in Equation 6, and $(\cdot)^{*}$ denotes the complex conjugate.

$$Y_n^m(\theta, \phi) = \sqrt{\frac{(2n+1)(n-m)!}{4\pi(n+m)!}}\, P_n^m(\cos\theta)\, e^{jm\phi} \tag{6}$$

Note that the $Y_n^m$ form the angular part of the solutions to the Helmholtz equation in spherical coordinates [18], and the $P_n^m$ are the associated Legendre functions.

The acoustic pressure is sampled by the microphones on the surface of the sphere. Hence, the SFT in Equation 4 can be approximated by the following summation

$$p_{nm}(k, r) \cong \sum_{i=1}^{I} a_i\, p(k, r, \Phi_i)\,[Y_n^m(\Phi_i)]^{*}, \quad \forall\; 0 \le n \le N,\; -n \le m \le n \tag{7}$$

where $a_i$ are the sampling weights [19]. For an order-limited pressure function of order $N$, Equation 5 can be written as

$$p(k, r, \Phi) \cong \sum_{n=0}^{N}\sum_{m=-n}^{n} p_{nm}(k, r)\, Y_n^m(\Phi) \tag{8}$$

The pressure at the $i$th microphone due to the $l$th source is $p(k, r, \Phi_i) = e^{-jk|\mathbf{r}_i - \mathbf{r}_l|} / |\mathbf{r}_i - \mathbf{r}_l|$, and it is given by [20]

$$\frac{e^{-jk|\mathbf{r}_i - \mathbf{r}_l|}}{|\mathbf{r}_i - \mathbf{r}_l|} = \sum_{n=0}^{N}\sum_{m=-n}^{n} b_n(k, r, r_l)\,[Y_n^m(\Psi_l)]^{*}\, Y_n^m(\Phi_i) \tag{9}$$

where $b_n(k, r, r_l)$ is the near-field mode strength. It is related to the far-field mode strength $b_n(kr)$ as [21]

$$b_n(k, r, r_l) = j^{-(n-1)}\, k\, b_n(kr)\, h_n(k r_l), \quad \text{where} \tag{10}$$

$$b_n(kr) = 4\pi j^{n} j_n(kr), \quad \text{open sphere} \tag{11}$$

$$b_n(kr) = 4\pi j^{n} \left( j_n(kr) - \frac{j_n'(kr)}{h_n'(kr)}\, h_n(kr) \right), \quad \text{rigid sphere} \tag{12}$$

Here $j_n$ is the spherical Bessel function, $h_n$ is the spherical Hankel function, $j$ is the unit imaginary number and $'$ refers to the first derivative. The extra term in the far-field mode strength for the rigid sphere accounts for the pressure scattered from the sphere. The range of the source is captured in the Hankel function.
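The mode strengths of Equations 10 to 12 can be evaluated numerically with SciPy's spherical Bessel routines. The sketch below is an illustration under stated assumptions: the text does not say which kind of spherical Hankel function is meant, so the second kind is assumed here (its magnitude is the same as that of the first kind for real arguments), and the function names are chosen for this example.

```python
import numpy as np
from scipy.special import spherical_jn, spherical_yn

def spherical_hn2(n, x, derivative=False):
    """Spherical Hankel function of the second kind, h_n = j_n - j*y_n (assumed kind)."""
    return spherical_jn(n, x, derivative) - 1j * spherical_yn(n, x, derivative)

def far_field_mode_strength(n, kr, rigid=False):
    """Far-field mode strength b_n(kr): Equation 11 (open) or Equation 12 (rigid)."""
    radial = spherical_jn(n, kr)
    if rigid:
        # Scattering term of Equation 12.
        ratio = spherical_jn(n, kr, derivative=True) / spherical_hn2(n, kr, derivative=True)
        radial = radial - ratio * spherical_hn2(n, kr)
    return 4 * np.pi * (1j ** n) * radial

def near_field_mode_strength(n, k, r, r_l, rigid=False):
    """Near-field mode strength b_n(k, r, r_l) of Equation 10."""
    return (1j ** (-(n - 1))) * k * far_field_mode_strength(n, k * r, rigid) \
        * spherical_hn2(n, k * r_l)

# Ratio r_l * |b_n(k, r, r_l)| / |b_n(kr)| for a source at 1 m and r = 4.2 cm.
for k in (1.0, 100.0):
    ratios = [abs(near_field_mode_strength(n, k, 0.042, 1.0)) /
              abs(far_field_mode_strength(n, k * 0.042)) for n in range(5)]
    print(k, np.round(ratios, 3))
```

The printed ratio equals $k r_l |h_n(k r_l)|$. It stays close to 1 for every order once $k r_l$ is much larger than $n$, and grows quickly with $n$ when $k r_l$ is small, which is the range dependence carried by the Hankel function.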

Fig. 1. Plot showing the nature of the far-field and near-field mode strength for a rigid sphere (magnitude in dB versus wavenumber k, from 10^-1 to 10^1). The near-field source is at rl = 1 m and the order is varied from n = 0 (top) to n = 4 (bottom).

2.2. Near-field criterion in spherical harmonics domain

In general, the boundary between the near field and the far field is decided by Fraunhofer distances [22]. However, these parameters do not indicate the extent of the near field in the spherical harmonics domain. For a spherical array, the near-field criterion is presented in [10], based on the similarity of the near-field mode strength ($|b_n(k, r, r_l)|$) and the far-field mode strength ($|b_n(kr)|$). The two functions start behaving in a similar way at $k r_l \approx N$, for an array of order $N$. This is illustrated in Figure 1 for the rigid sphere Eigenmike system [23] with $r_l = 1$ m and order varying from $n = 0$ to $n = 4$. Hence the near-field condition for a spherical array becomes

$$r_{NF} \approx \frac{N}{k} \tag{13}$$

But $r_{NF} \ge r$, $r$ being the radius of the sphere. So the highest wavenumber possible is

$$k_{max} = \frac{N}{r} \tag{14}$$

From Equations 13 and 14,

$$r_{NF} = r\,\frac{k_{max}}{k} \tag{15}$$

Hence, for a source to be in the near field, the range of the source should satisfy

$$r \le r_l \le r\,\frac{k_{max}}{k} \tag{16}$$
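To make Equations 13 to 16 concrete, the short sketch below computes the near-field range interval for given array parameters. The array order and radius are the Eigenmike values quoted in Section 3; the speed of sound (343 m/s) and the 500 Hz operating frequency are assumptions added for the example, as are the function names.

```python
import math

def k_max(order, radius):
    """Highest usable wavenumber, Equation 14: k_max = N / r."""
    return order / radius

def near_field_range(order, radius, k):
    """Range interval [r, r_NF] of Equation 16 for wavenumber k.

    Returns None when k exceeds k_max, because then r_NF < r
    and the interval is empty.
    """
    if k > k_max(order, radius):
        return None
    return radius, order / k

# N = 4 and r = 4.2 cm; c and f are assumed values.
c, f = 343.0, 500.0
k = 2 * math.pi * f / c
lo, hi = near_field_range(4, 0.042, k)
print(f"k = {k:.2f} rad/m, near field: {lo:.3f} m <= r_l <= {hi:.3f} m")
```

Under these assumptions the upper limit comes out at roughly 0.44 m, so a source at 0.4 m would satisfy Equation 16 at 500 Hz but not at frequencies much above that.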


Fig. 2. Illustration of azimuth and elevation estimation by (a) SH-MUSIC and (b) SH-MGD, plotted over azimuth (φ) and elevation (θ). Illustration of range and azimuth estimation using (c) SH-MUSIC and (d) SH-MGD, plotted over azimuth (φ) and range (m). The sources are at (0.4 m, 60°, 30°) and (0.5 m, 55°, 35°) at an SNR of 10 dB.

2.3. The Spherical Harmonics MUSIC (SH-MUSIC)

spectrum for near-field source localization

This section presents the formulation of the proposed SH-MUSIC spectrum for near-field source localization. Substituting the expression for pressure from Equation 9 in Equation 3, the steering matrix in Equation 2 can be written as

$$\mathbf{V}(k) = \mathbf{Y}(\Phi)\,[\mathbf{B}(r_1)\mathbf{y}^H(\Psi_1), \cdots, \mathbf{B}(r_L)\mathbf{y}^H(\Psi_L)] \tag{17}$$

where $\mathbf{Y}(\Phi)$ is an $I \times (N+1)^2$ matrix. A particular $i$th row vector can be written as

$$\mathbf{y}(\Phi_i) = [Y_0^0(\Phi_i), Y_1^{-1}(\Phi_i), Y_1^0(\Phi_i), Y_1^1(\Phi_i), \ldots, Y_N^N(\Phi_i)] \tag{18}$$

and $\mathbf{y}(\Psi_l)$ is a $1 \times (N+1)^2$ vector with the same structure as in Equation 18 with angle $\Psi_l$, $l = 1, 2, \cdots, L$. The $(N+1)^2 \times (N+1)^2$ matrix $\mathbf{B}(r_l)$ is given by

$$\mathbf{B}(r_l) = \mathrm{diag}\big(b_0(k, r, r_l), b_1(k, r, r_l), b_1(k, r, r_l), b_1(k, r, r_l), \ldots, b_N(k, r, r_l)\big) \tag{19}$$

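The dependency of $\mathbf{B}(r_l)$ on $k$ and $r$ is dropped for notational simplicity. Substituting (17) in (1), multiplying both sides by $\mathbf{Y}^H(\Phi)\mathbf{\Gamma}$ and utilizing Equation 7, the data model becomes

$$\mathbf{p}_{nm}(k, r) = \mathbf{Y}^H(\Phi)\mathbf{\Gamma}\mathbf{Y}(\Phi)\,[\mathbf{B}(r_1)\mathbf{y}^H(\Psi_1), \cdots, \mathbf{B}(r_L)\mathbf{y}^H(\Psi_L)]\,\mathbf{s}(k) + \mathbf{n}_{nm}(k) \tag{20}$$

where $\mathbf{\Gamma} = \mathrm{diag}(a_1, a_2, \cdots, a_I)$ consists of the sampling weights used in Equation 7 and

$$\mathbf{p}_{nm} = [p_{00}, p_{1(-1)}, p_{10}, p_{11}, \cdots, p_{NN}]^T. \tag{21}$$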

The orthogonality of spherical harmonics under spatial sampling suggests [19]

$$\mathbf{Y}^H(\Phi)\mathbf{\Gamma}\mathbf{Y}(\Phi) \cong \mathbf{I}. \tag{22}$$
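Equation 22 can be checked numerically. The sketch below builds the matrix $\mathbf{Y}(\Phi)$ of Equation 18 from Equation 6 and evaluates $\mathbf{Y}^H\mathbf{\Gamma}\mathbf{Y}$. As an assumption for illustration, it uses a Gaussian sampling scheme (Gauss-Legendre nodes in $\cos\theta$ and $2(N+1)$ equally spaced azimuths), not the 32-microphone Eigenmike layout; the function names are also chosen for this example. The associated Legendre functions include the Condon-Shortley phase, which Equation 6 does not specify.

```python
import math
import numpy as np

def assoc_legendre(n, m, x):
    """P_n^m(x) for 0 <= m <= n (Condon-Shortley phase), by upward recurrence."""
    pmm = np.ones_like(x)
    somx2 = np.sqrt((1.0 - x) * (1.0 + x))
    for i in range(m):
        pmm = -pmm * (2 * i + 1) * somx2              # builds P_m^m
    if n == m:
        return pmm
    prev, cur = pmm, x * (2 * m + 1) * pmm            # P_m^m and P_{m+1}^m
    for ell in range(m + 2, n + 1):
        prev, cur = cur, (x * (2 * ell - 1) * cur - (ell + m - 1) * prev) / (ell - m)
    return cur

def sh_matrix(N, theta, phi):
    """Matrix Y of Equation 18: one row per direction, columns ordered
    (n, m) = (0,0), (1,-1), (1,0), (1,1), ..., (N,N)."""
    x = np.cos(theta)
    cols = []
    for n in range(N + 1):
        for m in range(-n, n + 1):
            am = abs(m)
            norm = math.sqrt((2 * n + 1) / (4 * math.pi)
                             * math.factorial(n - am) / math.factorial(n + am))
            Y = norm * assoc_legendre(n, am, x) * np.exp(1j * am * phi)
            # Negative degrees follow from Y_n^{-m} = (-1)^m conj(Y_n^m).
            cols.append(Y if m >= 0 else (-1) ** am * np.conj(Y))
    return np.stack(cols, axis=1)

def gaussian_sampling(N):
    """Assumed sampling: N+1 Gauss-Legendre elevations, 2(N+1) azimuths."""
    x, w = np.polynomial.legendre.leggauss(N + 1)
    phi = 2 * np.pi * np.arange(2 * N + 2) / (2 * N + 2)
    theta = np.repeat(np.arccos(x), phi.size)
    weights = np.repeat(w, phi.size) * (2 * np.pi / phi.size)
    return theta, np.tile(phi, x.size), weights

N = 4
theta, phi, weights = gaussian_sampling(N)
Y = sh_matrix(N, theta, phi)                          # shape (50, 25)
G = Y.conj().T @ (weights[:, None] * Y)               # Y^H Gamma Y
print(np.max(np.abs(G - np.eye((N + 1) ** 2))))       # deviation from identity
```

For this sampling scheme the quadrature is exact up to order $N$, so the deviation is at the level of floating-point round-off. The same product applied to a pressure vector, `Y.conj().T @ (weights * p)`, is the discrete SFT of Equation 7.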

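Hence, the data model finally becomes

$$\mathbf{p}_{nm}(k, r) = [\mathbf{B}(r_1)\mathbf{y}^H(\Psi_1), \cdots, \mathbf{B}(r_L)\mathbf{y}^H(\Psi_L)]\,\mathbf{s}(k) + \mathbf{n}_{nm}(k) \tag{23}$$

where $\mathbf{B}(r_s)\mathbf{y}^H(\Psi_s)$ is taken to be the look-up steering vector. The 3-Dimensional MUSIC spectrum in the spherical harmonics domain can now be written as

$$P_{MUSIC}(r_s, \Psi_s) = \frac{1}{\mathbf{y}(\Psi_s)\,\mathbf{B}^H\,\mathbf{S}^{NS}_{\mathbf{p}_{nm}}\,[\mathbf{S}^{NS}_{\mathbf{p}_{nm}}]^H\,\mathbf{B}\,\mathbf{y}^H(\Psi_s)} \tag{24}$$

The search is performed over $r_s$ as in Equation 16 and over $\Psi_s$ with $(0 \le \theta_s \le \pi,\ 0 \le \phi_s \le 2\pi)$. $\mathbf{S}^{NS}_{\mathbf{p}_{nm}}$ is the noise subspace obtained from the eigenvalue decomposition of the autocorrelation matrix $\mathbf{S}_{\mathbf{p}_{nm}}$, defined as

$$\mathbf{S}_{\mathbf{p}_{nm}} = E[\mathbf{p}_{nm}(k, r)\,\mathbf{p}_{nm}(k, r)^H] \tag{25}$$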


The denominator of the MUSIC spectrum tends to zero when

(rs,Ψs) corresponds to source location owing to orthogonal-

ity between noise eigenvector and steering vector. Hence, a

peak is obtained in MUSIC spectrum.
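The grid search of Equation 24 can be sketched as follows. The function takes the autocorrelation matrix of Equation 25 and a matrix whose columns are the look-up steering vectors $\mathbf{B}(r_s)\mathbf{y}^H(\Psi_s)$ on the search grid, however those are computed. In the toy usage below, random complex vectors stand in for the true steering vectors and the model covariance replaces the sample estimate; both are assumptions made to keep the example short and deterministic.

```python
import numpy as np

def noise_subspace(S, num_sources):
    """Eigenvectors of the Hermitian matrix S belonging to its smallest eigenvalues."""
    _, eigvecs = np.linalg.eigh(S)                 # eigenvalues in ascending order
    return eigvecs[:, : S.shape[0] - num_sources]

def music_spectrum(S, candidates, num_sources):
    """Equation 24 evaluated on a grid.

    S: (D, D) autocorrelation matrix of p_nm, with D = (N+1)^2.
    candidates: (D, G) matrix, one look-up steering vector per column.
    """
    En = noise_subspace(S, num_sources)
    proj = En.conj().T @ candidates                # projections onto the noise subspace
    return 1.0 / np.sum(np.abs(proj) ** 2, axis=0)

# Toy usage with stand-in steering vectors (hypothetical data).
rng = np.random.default_rng(0)
D, G, true_idx = 25, 200, [40, 45]
A = rng.standard_normal((D, G)) + 1j * rng.standard_normal((D, G))
S = A[:, true_idx] @ A[:, true_idx].conj().T + 0.1 * np.eye(D)   # model covariance
P = music_spectrum(S, A, num_sources=2)
print(sorted(np.argsort(P)[-2:]))
```

With the model covariance, the two largest values of `P` fall at the planted grid indices 40 and 45, because the corresponding columns are orthogonal to the noise subspace up to round-off.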

2.4. Near-field source localization using Spherical Har-

monic MUSIC-Group Delay (SH-MGD) spectrum

The SH-MUSIC utilizes the magnitude of $\mathbf{y}(\Psi_s)\mathbf{B}^H\mathbf{S}^{NS}_{\mathbf{p}_{nm}}$, as is clear from Equation 24. The phase spectrum of MUSIC is utilized in [13–16] for robust source localization. A sharp change in the unwrapped phase is seen at the Direction of Arrival (DOA) [14, 16]. Hence, the negative derivative of the unwrapped phase spectrum (the group delay) results in a peak at the DOAs. In practice, abrupt changes can occur in the phase due to small variations in the signal caused by microphone calibration errors. Hence, the group delay spectrum may sometimes have spurious peaks. The product of the MUSIC and group delay spectra, called MUSIC-Group delay, removes such spurious peaks and gives high resolution estimation. The Spherical Harmonics MUSIC-Group delay (SH-MGD) spectrum is computed as

$$P_{MGD}(r_s, \Psi_s) = \left(\sum_{u=1}^{U} \left|\nabla \arg\big(\mathbf{y}(\Psi_s)\,\mathbf{B}^H \mathbf{q}_u\big)\right|^2\right) P_{MUSIC} \tag{26}$$

where $U = (N+1)^2 - L$, $\nabla$ is the gradient operator, $\arg(\cdot)$ indicates the unwrapped phase, and $\mathbf{q}_u$ represents the $u$th eigenvector of the noise subspace $\mathbf{S}^{NS}_{\mathbf{p}_{nm}}$. The first term within $(\cdot)$ is the group delay spectrum. The gradient is taken with respect to $(r_s, \theta_s, \phi_s)$.

Figure 2 illustrates the performance of SH-MUSIC and SH-MGD for range and bearing estimation using a spherical microphone array. The simulation was done considering an open sphere with two closely spaced sources at (0.4 m, 60°, 30°) and (0.5 m, 55°, 35°) and an SNR of 10 dB. Figures 2(a) and 2(b) show the plots corresponding to elevation and azimuth estimation. It is clear that SH-MGD exhibits higher resolving power. The plots in Figures 2(c) and 2(d) show the range and azimuth of the sources. The high resolution of MGD is due to the additive property of the group delay spectrum. The additive property is proved mathematically in our earlier work for ULA [16] and UCA [15]. While this is valid for spherical arrays also, the mathematical proof is being developed.
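A simplified sketch of Equation 26 is given below. It is an illustration under strong assumptions, not the implementation used for Figure 2: the gradient is taken along a single search axis only (the paper takes it with respect to range, elevation and azimuth), the candidate steering vectors are assumed to be ordered along that axis, and the toy usage replaces $\mathbf{B}(r_s)\mathbf{y}^H(\Psi_s)$ with a stand-in manifold of complex exponentials and uses the model covariance.

```python
import numpy as np

def mgd_spectrum_1d(S, candidates, num_sources):
    """Equation 26 along one search axis (simplifying assumption).

    candidates: (D, G) look-up steering vectors, ordered along the axis.
    """
    _, eigvecs = np.linalg.eigh(S)                         # ascending eigenvalues
    Q = eigvecs[:, : S.shape[0] - num_sources]             # noise eigenvectors q_u
    proj = candidates.conj().T @ Q                         # (G, U): a^H q_u on the grid
    phase = np.unwrap(np.angle(proj), axis=0)              # unwrapped phase per q_u
    group_delay = np.sum(np.abs(np.gradient(phase, axis=0)) ** 2, axis=1)
    p_music = 1.0 / np.sum(np.abs(proj) ** 2, axis=1)      # Equation 24 on the grid
    return group_delay * p_music

# Toy usage: stand-in manifold on a 1-degree azimuth grid (hypothetical data).
N = 4
grid = np.radians(np.arange(0.0, 90.0, 1.0))
m = np.arange(-N, N + 1)[:, None]
A = np.exp(1j * m * grid[None, :])                         # shape (2N+1, 90)
true_idx = [30, 35]
S = A[:, true_idx] @ A[:, true_idx].conj().T + 0.1 * np.eye(2 * N + 1)
P = mgd_spectrum_1d(S, A, num_sources=2)
print(sorted(np.argsort(P)[-2:]))
```

In this noise-free toy setting the two largest values land on the planted indices 30 and 35. The sketch does not reproduce the resolution gain reported in Figure 2, which depends on finite-snapshot data and on the full three-parameter gradient.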

2.5. The Spherical Harmonics MVDR (SH-MVDR) spec-

trum for range and bearing estimation

The conventional MVDR minimizes the contribution of interference impinging on the array from a DOA $\neq \Psi_s$, while it maintains a certain gain in the look direction $\Psi_s$. Along similar lines, the SH-MVDR spectrum for near-field source localization can be written as

$$P_{MVDR}(r_s, \Psi_s) = \frac{1}{\mathbf{y}(\Psi_s)\,\mathbf{B}^H\,\mathbf{S}^{-1}_{\mathbf{p}_{nm}}\,\mathbf{B}\,\mathbf{y}^H(\Psi_s)} \tag{27}$$
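Equation 27 can be evaluated on a search grid in the same way as the MUSIC spectrum. In the sketch below the columns of `candidates` are assumed to hold the look-up steering vectors $\mathbf{B}(r_s)\mathbf{y}^H(\Psi_s)$; the toy usage substitutes random complex vectors and a single-source model covariance, which are assumptions for illustration only.

```python
import numpy as np

def mvdr_spectrum(S, candidates):
    """Equation 27 on a grid: 1 / (a^H S^{-1} a) for each column a of candidates."""
    Sinv_a = np.linalg.solve(S, candidates)               # S^{-1} a without forming S^{-1}
    denom = np.real(np.sum(candidates.conj() * Sinv_a, axis=0))
    return 1.0 / denom

# Toy usage with stand-in steering vectors (hypothetical data).
rng = np.random.default_rng(1)
A = rng.standard_normal((9, 50)) + 1j * rng.standard_normal((9, 50))
a0 = A[:, 7]                                              # planted source
power, noise_var = 2.0, 0.5
S = power * np.outer(a0, a0.conj()) + noise_var * np.eye(9)
P = mvdr_spectrum(S, A)
print(np.argmax(P), P[7])
```

For a single source in white noise, the matrix inversion lemma gives the spectrum value at the source as the source power plus the noise variance divided by the squared norm of the steering vector, which provides a quick sanity check on the implementation.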

3. PERFORMANCE EVALUATION

The proposed methods, SH-MUSIC, SH-MGD and SH-MVDR, are evaluated by conducting experiments on source localization. The estimated range and bearing are tabulated at various SNRs.

The proposed algorithms were tested in a room with dimensions 7.3 m × 6.2 m × 3.4 m. An Eigenmike microphone array [23] was used for the simulation. It consists of 32 microphones embedded in a rigid sphere of radius 4.2 cm. The order of the array was taken to be N = 4.

3.1. Experiments on source localization

Two sets of experiments were conducted. For the first experiment, two closely spaced narrowband sources were placed in the near-field region at (0.4 m, 60°, 30°) and (0.4 m, 65°, 35°). The range of the sources was kept fixed at 0.4 m. The experiments were conducted at SNRs of 0 dB and 8 dB. The additive noise is assumed to be zero-mean Gaussian distributed. The mean estimates of elevation and azimuth are presented in the first part of Table 1. In the second experiment, the sources were positioned at (0.4 m, 60°, 30°) and (0.5 m, 65°, 35°). The range and the azimuth were estimated at SNRs of 5 dB and 10 dB, considering fixed elevation. The results shown in Table 1 are obtained from 300 independent Monte Carlo trials. Overall, SH-MGD performs somewhat better than SH-MUSIC, most visibly in the range estimates of the second experiment, and both methods generally outperform MVDR.

Table 1. Localization experiments. Set 1: SNR 0 dB and 8 dB for fixed range; entries are (elevation, azimuth) in degrees. Set 2: SNR 5 dB and 10 dB for fixed elevation; entries are (range in m, azimuth in degrees).

SNR     Source   SH-MGD           SH-MUSIC         MVDR
0 dB    S1       (60.46, 29.82)   (60.04, 30.02)   (58.35, 29.22)
0 dB    S2       (65.01, 34.94)   (65.00, 35.00)   (63.67, 34.19)
8 dB    S1       (60.00, 29.96)   (60.00, 29.99)   (61.15, 29.33)
8 dB    S2       (65.00, 35.00)   (65.00, 35.00)   (63.65, 34.43)
5 dB    S1       (0.416, 29.91)   (0.429, 30.11)   (0.367, 29.26)
5 dB    S2       (0.548, 34.91)   (0.560, 34.49)   (0.541, 33.28)
10 dB   S1       (0.409, 30.00)   (0.410, 30.00)   (0.406, 30.06)
10 dB   S2       (0.510, 35.00)   (0.514, 35.00)   (0.548, 33.40)

4. CONCLUSION

In this work, 3-Dimensional SH-MUSIC, SH-MGD and SH-MVDR are proposed for near-field source localization. Since the phase spectrum of MUSIC is more robust to noise, the SH-MGD exhibits higher resolution. The proof of the additive property of group delay in the spherical harmonics domain is currently being developed. The detailed relative performance of SH-MUSIC and SH-MGD for closely spaced sources under reverberation will be addressed in future work. The Cramér-Rao bound for spherical harmonics is being developed for the performance analysis of the proposed methods.


References

[1] Jens Meyer and Gary Elko, “A highly scalable spherical

microphone array based on an orthonormal decomposi-

tion of the soundfield,” in Acoustics, Speech, and Signal

Processing (ICASSP), 2002 IEEE International Confer-

ence on. IEEE, 2002, vol. 2, pp. II–1781.

[2] John McDonough, Kenichi Kumatani, Takayuki

Arakawa, Kazumasa Yamamoto, and Bhiksha Raj,

“Speaker tracking with spherical microphone arrays,”

in Acoustics Speech and Signal Processing (ICASSP),

2013 IEEE International Conference on. IEEE, 2013.

[3] Israel Cohen and Jacob Benesty, Speech processing in

modern communication: challenges and perspectives,

vol. 3, Springer, 2010.

[4] R Roy, A Paulraj, and T Kailath, “Estimation of signal

parameters via rotational invariance techniques-esprit,”

in 30th Annual Technical Symposium. International So-

ciety for Optics and Photonics, 1986, pp. 94–101.

[5] Roald Goossens and Hendrik Rogier, “Closed-form

2d angle estimation with a spherical array via spherical

phase mode excitation and esprit,” in Acoustics, Speech

and Signal Processing, 2008. ICASSP 2008. IEEE Inter-

national Conference on. IEEE, 2008, pp. 2321–2324.

[6] R. O. Schmidt, “Multiple emitter location and signal parameter estimation,” IEEE Transactions on Antennas and Propagation, vol. AP-34, pp. 276–280, 1986.

[7] Xuan Li, Shefeng Yan, Xiaochuan Ma, and Chaohuan

Hou, “Spherical harmonics music versus conventional

music,” Applied Acoustics, vol. 72, no. 9, pp. 646–652,

2011.

[8] Dima Khaykin and Boaz Rafaely, “Acoustic analysis

by spherical microphone array processing of room im-

pulse responses,” The Journal of the Acoustical Society

of America, vol. 132, pp. 261, 2012.

[9] Jens Meyer and Gary W Elko, “Position independent

close-talking microphone,” Signal processing, vol. 86,

no. 6, pp. 1254–1259, 2006.

[10] E. Fisher and B. Rafaely, “The nearfield spherical mi-

crophone array,” in Acoustics, Speech and Signal Pro-

cessing, 2008. ICASSP 2008. IEEE International Con-

ference on, 2008, pp. 5272–5275.

[11] Y-D Huang and Mourad Barkat, “Near-field multiple

source localization by passive sensor array,” Antennas

and Propagation, IEEE Transactions on, vol. 39, no. 7,

pp. 968–975, 1991.

[12] Jack Capon, “High-resolution frequency-wavenumber

spectrum analysis,” Proceedings of the IEEE, vol. 57,

no. 8, pp. 1408–1418, 1969.

[13] Lalan Kumar, Kushagra Singhal, and Rajesh M Hegde,

“Robust source localization and tracking using music-

group delay spectrum over spherical arrays,” in Compu-

tational Advances in Multi-Sensor Adaptive Processing

(CAMSAP), 2013 IEEE 5th International Workshop on,

St. Martin, France. IEEE, 2013, pp. 304–307.

[14] Lalan Kumar, Ardhendu Tripathy, and Rajesh M Hegde,

“Robust multi-source localization over planar arrays us-

ing music-group delay spectrum,” IEEE Trans. on Sig-

nal Processing, Under review, 2014.

[15] Ardhendu Tripathy, L Kumar, and Rajesh M Hegde,

“Group delay based methods for speech source localiza-

tion over circular arrays,” in Hands-free Speech Com-

munication and Microphone Arrays (HSCMA), 2011

Joint Workshop on. IEEE, 2011, pp. 64–69.

[16] Mrityunjaya Shukla and Rajesh M Hegde, “Significance

of the music-group delay spectrum in speech acquisition

from distant microphones,” in Acoustics Speech and

Signal Processing (ICASSP), 2010 IEEE International

Conference on. IEEE, 2010, pp. 2738–2741.

[17] James R Driscoll and Dennis M Healy, “Computing

fourier transforms and convolutions on the 2-sphere,”

Advances in applied mathematics, vol. 15, no. 2, pp.

202–250, 1994.

[18] Earl G Williams, Fourier acoustics: sound radiation and nearfield acoustical holography, Academic Press, 1999.

[19] Boaz Rafaely, “Analysis and design of spherical mi-

crophone arrays,” Speech and Audio Processing, IEEE

Transactions on, vol. 13, no. 1, pp. 135–143, 2005.

[20] Boaz Rafaely, “Plane-wave decomposition of the sound

field on a sphere by spherical convolution,” The Journal

of the Acoustical Society of America, vol. 116, pp. 2149,

2004.

[21] Etan Fisher and Boaz Rafaely, “Near-field spherical mi-

crophone array processing with radial filtering,” Audio,

Speech, and Language Processing, IEEE Transactions

on, vol. 19, no. 2, pp. 256–265, 2011.

[22] Constantine A Balanis, Antenna theory: analysis and

design, John Wiley & Sons, 2012.

[23] The Eigenmike Microphone Array,

http://www.mhacoustics.com/.

