8/18/2019 spherical microphone array processing with wave field synthesis and auralization
ACKNOWLEDGEMENTS
This master’s thesis would not have been possible without the support of many people. Firstly, I wish to express my gratitude to Univ.-Prof. Dr.-Ing. Karlheinz Brandenburg
for giving me the chance to work on such an interesting topic in his group.
I owe my deepest gratitude to my supervisor Dipl.-Ing. Johannes Nowak for giving me
the opportunity to do this master thesis under his supervision. His constant guidance,
assistance and support were really important for this thesis.
Further, I wish to express my love and gratitude to my beloved family, especially my
parents, for their love, understanding and supporting me through the duration of my
study.
Finally, I thank all my friends for their support during my studies.
ABSTRACT
Microphone arrays are structures in which two or more microphones are placed at
different positions in space, generally in a geometrical arrangement. In many applications
we need not only the temporal characteristics but also the spatial characterization of
sound fields, and microphone arrays are employed to achieve this goal.
Microphone arrays play a particularly important role in spatial sound reproduction.
Researchers have used different microphone array configurations for sound recording,
for the characterization of room acoustics and for auralization.
As research in spatial sound reproduction progressed, it was found that rendering
sound using an array of loudspeaker elements alone is not sufficient to fully auralize an
acoustic scene. It was proposed that microphone arrays be employed on the recording
side in order to reproduce the complete three-dimensional acoustic behaviour. Researchers
have used different array configurations, such as planar or circular arrays, to map the
listening room acoustics for the purpose of auralization with a rendering system,
e.g. wave field synthesis (WFS).
A drawback was noticed with two-dimensional arrays: they are not able to sufficiently
characterize an acoustic scene in three dimensions, and hence the spherical microphone
array came into the picture. Spherical microphone arrays and their processing have been
described by many authors, but a perceptual analysis of the various factors which plague
the performance of spherical microphone arrays is still not fully established.
In the present work we carry out a detailed analysis of the processing chain, which
starts from the simulation of room characteristics with a spherical microphone array,
through wave field analysis of the sound fields and classification of errors, to
auralization of the free field impulse responses. We bring together the existing state
of the art in spherical microphone array processing and examine the perceptual impact
of different factors. We use a rigid sphere configuration and analyze three different
error categories, namely positioning error, spatial aliasing and microphone noise. We
attempt to establish a qualitative and quantitative relation between the errors and
limitations encountered in spherical microphone array processing, and examine the
psychoacoustic effects by auralizing the free field data through WFS.
A spherical microphone array gives a complete three-dimensional image of the acoustic
environment; the spherical microphone array data is decomposed into plane waves using
plane wave decomposition. In the process of plane wave decomposition the spherical
aperture of the array is discretized, and because of this, limitations are imposed
on the performance of the array.
We simulate an ideal full audio spectrum wave field impinging on the continuous aperture
of a spherical microphone array and compare this with a sampled array aperture.
In the listening test we auralize sound fields based on the ideal wave field decomposition
of a continuous aperture and compare them with different degrees of errors in different
categories. By this comparison we attempt to establish the extent to which a given
error would perceptually corrupt a reproduced sound field. We also try to determine the
extent to which some degree of error remains perceptually insignificant, in other
words, the extent of error which can be tolerated.
We examine the spatial aliasing limit imposed by the rendering system and on that
basis establish a rationale for the transform order (l = 3) used in spherical array
processing. The perceptual analysis is done in two ways: we first obtain an error
level which, when incorporated in the auralization process (simulated for l = 3), would be
perceptually insignificant, and we then look for the perceptual effects of this error
when the transform order l is changed stepwise.
We also try to establish a correspondence between wave field synthesis on the rendering
side and the spherical microphone array on the measurement side. We investigate to
what extent wave field synthesis can retain the perceptual quality by analyzing the
psychoacoustic effects of changing various parameters on the spherical microphone
array side. The independence of the rendering side with regard to the measurement side is
also analysed.
Contents
1 INTRODUCTION 1
1.1 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.2 Auralization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.3 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
1.4 Organization of Thesis . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2 MATHEMATICAL ANALYSIS AND STATE OF THE ART 11
2.1 Acoustic wave equation . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.1.1 Homogeneous acoustic wave equation . . . . . . . . . . . . . . . 11
2.1.2 Solution of wave equation in Cartesian coordinates . . . . . . . 15
2.1.3 Solution of wave equation in spherical coordinates . . . . . . . 15
2.1.4 Spherical Bessel and Hankel functions . . . . . . . . . . . . . . . 18
2.1.5 Legendre functions . . . . . . . . . . . . . . . . . . . . . . . . . 20
2.1.6 Spherical harmonics . . . . . . . . . . . . . . . . . . . . . . . . 20
2.1.7 Radial Velocity . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
2.2 Spherical harmonic decomposition . . . . . . . . . . . . . . . . . . . . . 23
2.2.1 Interior and Exterior problem . . . . . . . . . . . . . . . . . . . 23
2.2.2 Spherical wave spectrum . . . . . . . . . . . . . . . . . . . . . . 25
2.3 Spherical wave sound fields . . . . . . . . . . . . . . . . . . . . . . . . . 26
2.4 Spherical harmonic expansion of plane wave . . . . . . . . . . . . . . . 28
2.5 Mode strength . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
2.6 Discretization of spherical aperture and spatial aliasing . . . . . . . . . 32
2.7 Plane wave decomposition . . . . . . . . . . . . . . . . . . . . . . . . . 34
2.8 Spatial resolution in plane wave decomposition . . . . . . . . . . . . . . 35
3 ERROR ANALYSIS 38
3.1 Measurement errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
Master Thesis Gyan Vardhan Singh
3.2 Description of measurement error function . . . . . . . . . . . . . . . . 39
3.3 Microphone noise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
3.4 Spatial aliasing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
3.5 Positioning error . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
4 WAVE FIELD SYNTHESIS 48
4.1 Physics behind WFS . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
4.2 Mathematical description of WFS . . . . . . . . . . . . . . . . . . . . . 51
4.3 Synthesis operator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
4.4 Focusing operator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
4.5 Practical Consequences . . . . . . . . . . . . . . . . . . . . . . . . . 60
5 LISTENING TEST 62
5.1 Listening Test . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
5.2 Reproduction set up . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
5.3 Auralization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
5.3.1 Aspects to be perceptually evaluated . . . . . . . . . . . . . . 65
5.3.2 Processing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
5.4 Structure of listening test . . . . . . . . . . . . . . . . . . . . . . . . . 69
5.4.1 Audio Tracks . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
5.4.2 Listening test condition . . . . . . . . . . . . . . . . . . . . . . 70
5.5 Test subjects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
5.6 Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
5.6.1 Test subject screening . . . . . . . . . . . . . . . . . . . . . . . 74
5.6.2 Statistic for the evaluation of listening test . . . . . . . . . . . . 74
5.6.3 Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
5.7 Spatial aliasing vs transform order . . . . . . . . . . . . . . . . . . . . 76
5.8 Evaluation of positioning error . . . . . . . . . . . . . . . . . . . . . . . 78
5.9 Microphone noise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
6 Conclusions 84
Bibliography 86
List of Figures 91
List of Tables 93
APPENDIX 95
A Derivations 95
A.1 Orthonormality of Spherical harmonics and Spherical Fourier transform 95
A.2 Position vector and Wave vector . . . . . . . . . . . . . . . . . . . . . . 97
A.3 Plane wave pressure field for different levels . . . . . . . . . . . . . . . 99
A.4 Rigid sphere and open sphere configuration . . . . . . . . . . . . . . . . 100
Theses 102
1 INTRODUCTION
Digital processing of sounds so that they appear to come from particular locations in
three-dimensional space is very important and is an integral part of virtual acoustics.
In virtual acoustics the goal is to simulate complex acoustic fields so that a listener
experiences a natural environment, and this is done by spatial sound reproduction
systems.
The concept of sound field synthesis is used in realizing spatial sound reproduction
systems; it brings together various methodologies and analytical approaches.
In sound field synthesis we decompose the sound or audio into various components or
wave fields. In simple terms, we pull apart the basic components of sound characterizing
various spatial and temporal properties, and then, after applying complex signal
processing techniques, we reproduce the sound in such a way that these components
merge together in the propagation medium to auralize the complete three-dimensional
character of the sound.
Hence, sound field synthesis is a principle by which an acoustic environment is processed,
synthesized and reproduced or re-created such that the real acoustic scenario can
be perceived by the listener. Spatial sound, immersive audio, 3-D sound and surround
sound systems are some of the terms often used to describe such audio systems.
Different aspects come into play in realizing a sound field reproduction system, and
broad research efforts attempt to understand the various factors. Examples of sound
field reproduction systems which deal with various conceptual aspects of signal
processing are wave field synthesis, which is also our choice of reproduction system
in this thesis, higher order Ambisonics [1], sound field
reproduction with MIMO acoustic channel inversion [2] and vector base amplitude
panning methods [3]. These are a few examples of spatial sound systems developed by
the respective researchers; in [4] the author has presented a very detailed
mathematical treatment of various spatial sound reproduction techniques and has
attempted to bring these related spatial sound systems onto a single mathematical plane
on the basis of functional analysis.
In the present work we put forward our analysis, in which we answer various questions
that come up when an acoustic environment is recreated. Sound reproduction
techniques for virtual sound systems have been studied, developed and implemented
in various different ways and configurations. Acoustic auralization of sound fields in
this work focuses on wave field analysis (WFA) [5][6] with spherical microphone
arrays, and their auralization on a two-dimensional loudspeaker array geometry
following the principle of wave field synthesis (WFS).
In order to obtain the acoustic scene characteristics, researchers have proposed the
usage of microphone arrays. Apart from temporal properties, for spatial sound
reproduction we need the spatial properties of the sound field as well, and therefore
microphone arrays are required, as they can characterize the sound in space [7][5][6].
Auralization using microphone arrays has been attempted with various kinds of array
geometries; in [5] the author focused on a circular microphone array and used it for
the auralization of sound fields with wave field synthesis.
In [8], spatial sound design principles have been explained for the auralization of
room acoustics.
In spatial sound design the spatial properties of an audio stream, such as position,
direction and orientation in a virtual room, and the room itself, are modified. Two
things are attempted: the first is the simulation of an acoustic environment, and the
other is the direction dependent visualization and modification of the sound field by
the user. In this work we focus on the part where the simulation and auralization of
an acoustic scene is done. More importantly, we investigate the factors influencing
the microphone array used for room impulse response (RIR) recording and analyse the
perceptual effects which would be observed during the auralization process when
various parameters of the microphone array are changed.
Any sound wave can be represented as a superposition of plane waves in the far field
of its sources [9][8], and consequently it can also be said that a room can be characterized
by its impulse responses, as it can be assumed to be linear time invariant (LTI). Hence,
if we are able to capture the impulse responses of a room, then we can fully
characterize the acoustic nature of that room, and in turn any acoustic event in that
room can be reproduced simply with the help of the plane wave decomposed components
of its room impulse responses.
1.1 Preliminaries
To understand how sound radiates in a medium, we would like to introduce the reader
to the soap bubble analogy as explained by Zotter in [10, pages 6-10]. The sound
radiation is considered as a soap bubble, as shown in Figure 1.1. We assume a free
sound field and an ideal soap bubble which is large enough to enclose a musician and
an instrument. When sound is produced by the instrument, the bubble surface will
vibrate according to the motion of the air, because as sound propagates through the
medium it will hit the bubble, and consequently the soap bubble will vibrate with
the air molecules. At the respective observation points on the sphere, the waveform
of the vibrating sphere can be said to represent the radiated sound.
In [9], Williams has explained that the acoustic sound radiation from the instrument
could be completely defined if we are able to acoustically map the motion of this
continuous surface enclosing the sources. This kind of analysis of sound radiation is
called the exterior problem.
In a similar way, suppose there are no sources inside the soap bubble (rather, it
encloses the measurement set-up or listening area), but instead the sound radiation
propagates from outside (i.e. the sources are outside) and the waves hit the bubble
from the exterior. Again, as the bubble is in contact with the medium it will vibrate,
and identifying the motion of the surface of the bubble would be sufficient to describe
the acoustic radiation; this is called the interior problem.
In [11], the exterior and interior problems have again been elaborated. As the interior
problem is more important for our application, we put forward the interior problem
with respect to the spherical microphone array. A more mathematical treatment
is explained in chapter 2; for a further detailed analysis the reader is referred to [9,
page 124].

Figure 1.1: Soap bubble model of acoustic radiation [10].
In auralization applications we follow the same analytical line and characterize a
listening room environment by measuring the impulse responses coming from different
directions [7]; this in turn gives us the directional behaviour of the sound, i.e. how
the direct sound reaches and affects the spherical array and how the reflected sound
behaves.
1.2 Auralization
In order to auralize the sound field while keeping the spatial characteristics of the
sound intact, a method based on WFS is applied in the present work. WFS is a
consequence of Huygens' principle, expressed mathematically by the Kirchhoff-Helmholtz
integral [12]. In [13] wave field synthesis is discussed in explicit detail; Verheijen has explained
the reproduction techniques with emphasis on the loudspeaker arrays. A mathematical
description of WFS follows in chapter 4. Here it is important to point out that
although WFS does reproduce the room effect, it is not sufficient for the recreation
of an acoustic virtual reality [14][5], as it lacks knowledge of the acoustic room
impression, which is obtained by wave field analysis of the room. Hence, WFS with
wave field analysis was further proposed in [6]. The proposed techniques suggest that
we obtain the room characteristics by measuring the room impulse responses; the
analysis of these impulse responses can then be used to calculate the driving
functions for wave field synthesis. It is suggested that we can ideally reproduce the
reverberant part of the sound with 8-11 uncorrelated plane waves [6].
In our application we attempt to use different configurations of plane waves in order
to synthesize the sound waves. As mentioned by Sonke in [6], we examine plane wave
decomposition for different numbers of directions and finally settle on 12 directions.
We also compare the psychoacoustic effects of using different numbers of plane wave
sources and evaluate the optimal configuration.
In [5] the techniques and suggestions proposed in [6] are further explored and
implemented using a circular microphone array. Two methodologies were explained in
[5]:
• Impulse response based auralization
• Natural recording based auralization
Impulse response based auralization: In this approach the room acoustics are
measured and analyzed, i.e. the impulse responses of the room are measured. For
reproduction, the room characteristics obtained from the impulse response
measurement are combined with a dry audio channel and then reproduced. In simpler
words, suppose an audio file is being played in a particular room and this acoustic
scene now has to be recreated in another room. Then knowledge of the room impulse
responses (RIR) would be sufficient to recreate the same acoustic scene by convolving
the dry audio file with the directional responses obtained by plane wave
decomposition of the RIR. Refer to Figure 1.2.
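The convolution step just described can be sketched in a few lines. This is a schematic example using a synthetic one-channel RIR and a unit impulse as the dry signal, not the actual processing chain of this work, in which one such convolution is performed per plane-wave direction:

```python
import numpy as np
from scipy.signal import fftconvolve

def auralize(dry_signal, rir):
    """Convolve a dry (anechoic) signal with a room impulse response.

    In impulse response based auralization, each directional response from
    the plane wave decomposition is convolved with the dry channel and fed
    to the corresponding WFS virtual source.
    """
    return fftconvolve(dry_signal, rir)

# Toy check: a unit impulse convolved with an RIR returns the RIR itself.
rir = 0.5 ** np.arange(8)              # synthetic exponentially decaying RIR
dry = np.zeros(16)
dry[0] = 1.0                           # unit impulse as the dry signal
wet = auralize(dry, rir)
```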
Natural recording based auralization: In this approach a real time recording of the
sound field is made along with the audio signals. Here we do not record impulse
responses separately; instead a live recording of the acoustic event is captured. In
natural recording
Figure 1.2: Impulse response based auralization [5].
based auralization no separate impulse responses or perceptual parameters are used,
refer to Figure 1.3.
In the present work we follow the impulse response based auralization technique. One
important factor for choosing this line of approach is that the measurement and
reproduction sites are independent of each other [8][11][5].
For the auralization of an acoustic environment, for example a hall, over an extended
area, impulse responses have to be measured along an array of microphone positions [15].
The impulse responses can be processed by three different techniques:
• Holophony
Figure 1.3: Natural recording based auralization [5].
• Wave Field Extrapolation (WFE)
• Wave Field Decomposition (WFD)
Holophony: The impulse responses are measured at microphone positions which in
turn correspond to the loudspeaker positions in wave field synthesis. The impulse
responses in this approach can be used directly as convolution filters to drive the
corresponding loudspeakers, and no further processing is required. Although this
technique looks very straightforward, it is very inflexible, since the output cannot be
used for any WFS layout other than the one specifically designed for the
corresponding microphone array set-up used in the measurement. Moreover, holophony
requires microphones with very sharp directivity patterns, which is quite
unrealistic in practice [16][5].
Wave Field Extrapolation: In WFE the measurement of impulse responses is not
necessarily done at positions that correspond to the WFS loudspeaker configuration.
The impulse responses measured with any particular kind of microphone array
configuration are extrapolated to the required WFS loudspeaker array positions (in
principle these positions are different from the microphone positions, hence
extrapolation). The extrapolation can be done using the Kirchhoff-Helmholtz integrals
[5]. Although auralization with wave field extrapolation has given satisfactory
results, there are some drawbacks: it requires a very large measurement array for a
medium sized extrapolation area [7]. In [5] the author has shown that a microphone
array size at least equivalent to that of the listening area is required in order to
achieve satisfactory results.
Wave Field Decomposition, [17][15][18]: The wave field decomposition approach
decomposes the sound field into plane waves which arrive from different directions.
The plane wave decomposition can be considered as an acoustic photograph of the
sound sources, including secondary sources, which can be regarded as those generating
reflections [7].
The impulse responses are decomposed into plane waves which give a directional
image of the sound field. These plane waves are then reproduced as point sources in
the WFS set-up. The measurement array and the reproduction site are independent of
each other in this approach, and we can reproduce the sound field for a larger area
compared to the other two approaches. The size of the measurement array and that of
the loudspeaker array have no dependence as long as the microphone array
characterizes the room sufficiently [5][8]. The plane waves obtained through plane
wave decomposition can optimally represent the sources and reflections, and hence in
principle we can reproduce the sound field satisfactorily. Due to the considerable
advantages of wave field decomposition over the other methods, we focus our work on
plane wave decomposition of the acoustic wave fields. In the next chapter we present
the analysis for plane wave decomposition with a spherical microphone array. In [5],
a circular microphone array was implemented for the purpose of auralization with
WFS, but in order to obtain a three-dimensional plane wave decomposition the use of a
spherical microphone array became necessary [11][8].
In our work we simulate the acoustic characteristics of a free field full spectrum
wave impact on a spherical microphone array and analyze it to obtain plane waves
representing the direct sources, reflections and reverberant part; these plane wave
responses are implemented in the driving filters of WFS, and we attempt to auralize
the sound. As the spherical microphone array is important for the three-dimensional
sampling of acoustic radiation, we study different aspects of the spherical microphone
array in this work and investigate their influence on spatial sound reproduction.
Finally, we auralize the sound field for the different cases, and perceptual listening
tests are conducted. Different test subjects are invited to listen to our simulated
wave fields, which are auralized using a WFS spatial sound renderer consisting of 88
loudspeaker elements in a two-dimensional, nearly circular geometry.
1.4 Organization of Thesis
The thesis is divided into 6 chapters; the presentation of the work in this thesis has
been put forward as per the practical scheme of the work.
We start with the fundamentals and state of the art of spherical microphone arrays in
Chapter 2. We explain the basic mathematical fundamentals in this chapter and
continue with spherical harmonic decomposition, types of arrays and the behaviour of
their radial filters, spatial sampling in spherical microphone arrays, and then plane
wave decomposition. We also discuss the spatial resolution of plane wave decomposition
and its limitations.
In Chapter 3 we bring up the issues related to the errors which arise in the course
of processing. Positioning error, microphone noise and spatial aliasing are discussed
in this chapter, and the state of the art on how these errors are incorporated in the
theoretical analysis is also presented.
Chapter 4 explains wave field synthesis, its basic theoretical background and its
limitations.
In Chapter 5 we first provide a description of the auralization process, then describe
the listening test, and present the related analysis of the perceptual effects of the
errors and artifacts.
Chapter 6 is the conclusion; it draws out the results more prominently, and we
discuss the final suggestions of this work and the future work.
2 MATHEMATICAL ANALYSIS AND
STATE OF THE ART
In this chapter we discuss the fundamentals of wave propagation and sound fields and
summarize the existing state of the art in spherical microphone array processing and
its auralization using wave field synthesis.

The work presented in this thesis is based on simulating free field room impulse
responses with a spherical microphone array; these impulse responses are then
utilized for rendering spatial sound with the WFS set-up.
2.1 Acoustic wave equation
The acoustic wave equation is the mathematical formulation of sound propagation
through a medium. This section provides the introductory basics of the wave equation;
for more discussion please refer to [9][20][21].
2.1.1 Homogeneous acoustic wave equation
For the derivation of the acoustic wave equation some basic assumptions are made [22]
[23] [24]:
1. The medium of propagation of the sound waves is homogeneous, that is, the material
characteristics of the medium are uniform throughout.
2. The medium is quiescent, that is, it remains in a state of inactivity or dormancy.
3. The propagation medium is characterized as an ideal gas.
4. The state changes in the gas are modeled as an adiabatic process, i.e. a process
that takes place without the transfer of heat or matter between a system and its
surroundings.

5. The static pressure p0 and static density ρ0 are large in comparison to the
pressure and density perturbations of the wave propagation.
Independence of the relevant parameters of the medium is assured by the first condition.
The second condition gives the assurance that the parameters are independent of time
and that there is no gross movement of the medium. The laws of ideal gases can be
applied as a result of the third assumption; the fourth assumption postulates that
there is no energy exchange in the form of heat conduction in the medium, i.e. there
are no propagation losses. The fifth assumption tells us that we can linearize the
field variables and medium characteristics around an operating point.
Two fundamental principles are used to derive the wave equation:
1. conservation of mass
2. the equation of momentum
Figure 2.1: Infinitesimal volume element used for the derivation of Euler’s equation.
The momentum equation describes the relation between the force applied to a volume
element and the acceleration of the element due to this applied force. In Figure 2.1 an
infinitesimal volume element is considered; we use this construction for the derivation
of Euler's equation [20][9]. Consider an infinitesimal volume element of fluid ∆x ∆y ∆z.
All six faces experience forces due to the pressure p(x, y, z) in the fluid.
Assume that the pressure on one side is greater than on the other side; a net force is
then exerted on the volume element, and it tends to move along the direction of the
force. From Newton's laws of motion we relate this force with the acceleration.
Carrying out the same analysis for all three directions, we end up with Euler's
equation, which relates the pressure applied to the fluid to changes in the particle
velocity of the fluid.
\[ \rho_0 \frac{\partial \upsilon}{\partial t} = -\nabla p \qquad (2.1) \]
Here ρ0 is the fluid density and υ is the velocity vector at any position (x, y, z) in the
medium,

\[ \upsilon = u\,e_x + v\,e_y + w\,e_z \qquad (2.2) \]
p is the pressure. ∇ is called the gradient or nabla operator and is defined as

\[ \nabla \equiv \frac{\partial}{\partial x}\,e_x + \frac{\partial}{\partial y}\,e_y + \frac{\partial}{\partial z}\,e_z \qquad (2.3) \]

where e_x, e_y and e_z are unit vectors in the x, y, z directions respectively; in the
literature they are sometimes also written as î, ĵ, k̂. −∇p is the pressure gradient
and ∂υ/∂t is the rate of change of the particle velocity.
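As a quick consistency check of Euler's equation 2.1, a one-dimensional plane wave with particle velocity υ = p/(ρ0 c) satisfies it identically. The impedance relation υ = p/(ρ0 c) is the standard plane-wave result; it is introduced here only for the check and is not taken from the thesis. A symbolic sketch:

```python
import sympy as sp

x, t, rho0, c, k = sp.symbols('x t rho0 c k', positive=True)
omega = c * k                      # plane-wave dispersion relation omega = c*k
p = sp.sin(k * x - omega * t)      # 1-D pressure plane wave
v = p / (rho0 * c)                 # plane-wave particle velocity, v = p/(rho0*c)

# Euler's equation (2.1) reduced to one dimension: rho0 * dv/dt = -dp/dx
lhs = rho0 * sp.diff(v, t)
rhs = -sp.diff(p, x)
print(sp.simplify(lhs - rhs))      # 0, so the plane wave satisfies (2.1)
```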
The second equation, which follows from the conservation of mass, is given as [22][20]:

\[ \frac{\partial \rho}{\partial t} + \rho\,\nabla \cdot \upsilon = 0 \qquad (2.4) \]
where ρ is the density of the propagation medium and υ is the acoustic particle
velocity. Under the above five assumptions, equation 2.4 signifies that the time rate
of change of the density of the medium is proportional to the divergence of the
particle velocity of the medium times the density of the medium.
Further taking above assumptions into consideration the time derivative in equation
2.5 expresses the proportionality between the time derivatives of the acoustic pressure and the density ρ; refer to [20][25][22] for a more detailed description.
∂p/∂t = c² ∂ρ/∂t (2.5)
where p is the pressure, a variable in position and time t, and c is the speed of sound. Equation 2.5 gives the temporal derivative of the density of the propagation medium in terms of changes in pressure; combining equations 2.5 and 2.4 in view of the last assumption, we get
−∂p/∂t = ρ0c² ∇·υ (2.6)
Equations 2.1 and 2.6, together with initial and boundary conditions, form a complete set of first order partial differential equations with a unique solution. These equations can be combined into a single second order equation [25][20]. Taking the time derivative of equation 2.6 gives
−∂²p/∂t² = ρ0c² ∇·(∂υ/∂t) (2.7)
and replacing the particle velocity term in equation 2.7 with the pressure gradient from Euler's equation 2.1, we obtain the homogeneous wave equation
∇²p − (1/c²) ∂²p/∂t² = 0 (2.8)
where p is the pressure, a function of position and time t. Equation 2.8 can also be represented in the frequency domain by applying the Fourier transform with respect to time t to the acoustic pressure p [26][9]:
∇²P(r, ω) + (ω/c)² P(r, ω) = 0 (2.9)
Equation 2.9 is known as the Helmholtz equation; r = (x, y, z) is the position, ω/c is the wave number k, and ω = 2πf. Analytically, k = 2π/λ with λ the wavelength, so k is the phase advance in radians per unit length: if we want to know the phase of a wave after it has travelled, say, (7/9)λ, then (7/9)λ · k = 14π/9 gives the phase.
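The wave-number arithmetic above can be checked directly; the following Python/NumPy sketch is not part of the thesis, and the speed of sound and frequency are illustrative values only.

```python
import numpy as np

c = 343.0            # assumed speed of sound in air [m/s]
f = 1000.0           # illustrative frequency [Hz]
lam = c / f          # wavelength λ = c/f
k = 2 * np.pi / lam  # wave number k = 2π/λ = ω/c

# phase accumulated over a travelled distance d = (7/9)λ, as in the text
d = (7.0 / 9.0) * lam
phase = k * d        # = (2π/λ)·(7/9)λ = 14π/9, independent of f and c
print(phase)         # ≈ 4.887 rad (= 14π/9)
```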
2.1.2 Solution of the wave equation in Cartesian coordinates
The general solution of the wave equation in Cartesian coordinates is derived through the Helmholtz equation in three dimensions [9]:

P(r, ω) = A(ω) e^{i(kx x + ky y + kz z)} (2.10)
where A(ω) is an arbitrary constant. Here we define k through

k² = kx² + ky² + kz² (2.11)
Another notation for the plane wave solution is

P(r, ω) = A(ω) e^{i k·r} (2.12)
In the time domain the solution of the wave equation is

p(t) = A e^{i(kx x + ky y + kz z − ω0 t)} (2.13)

p(t) = A e^{i(k·r − ω0 t)} (2.14)
A is a constant. This is the plane wave solution of the wave equation at a given frequency ω0. We have put forward the solution of the wave equation in Cartesian coordinates only in an introductory form; for a detailed derivation please see [9].
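The plane wave of equation 2.12 satisfies the Helmholtz equation exactly when k² = kx² + ky² + kz² (equation 2.11), since its Laplacian is −(kx² + ky² + kz²)P. A small Python sketch (not part of the thesis; wave vector and evaluation point are arbitrary illustrative values) verifies this with a finite-difference Laplacian:

```python
import numpy as np

kvec = np.array([30.0, 40.0, 0.0])   # illustrative wave vector [rad/m]
k = np.linalg.norm(kvec)             # |k| = 50
h = 1e-4                             # finite-difference step
r0 = np.array([0.1, 0.2, 0.3])       # arbitrary evaluation point

def P(r):
    """plane wave P(r) = exp(i k·r), equation 2.12 with A = 1"""
    return np.exp(1j * kvec.dot(r))

# central-difference Laplacian: Σ over axes of (P(r+he) − 2P(r) + P(r−he))/h²
lap = sum(
    (P(r0 + h * e) - 2 * P(r0) + P(r0 - h * e)) / h**2
    for e in np.eye(3)
)
residual = lap + k**2 * P(r0)        # Helmholtz residual, zero for the exact solution
print(abs(residual) / k**2)          # ≈ 1e-6, limited by the O(h²) scheme
```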
2.1.3 Solution of the wave equation in spherical coordinates
We now discuss in detail the solution of the wave equation in the spherical coordinate system, as this is directly relevant to the processing of spherical microphone arrays. We recall wave equation 2.8 and restate it here:
∇²p(x, y, z, t) − (1/c²) ∂²p(x, y, z, t)/∂t² = 0 (2.15)
∇² is also called the Laplace operator, ∆ = ∇², and is defined in Cartesian coordinates as

∇² ≡ ∂²/∂x² + ∂²/∂y² + ∂²/∂z² (2.16)
The spherical coordinate system shown in figure 2.2 is used throughout this thesis. From figure 2.2 we can express the Cartesian coordinates in terms of r, ϑ, ϕ:
Figure 2.2: Spherical coordinate system and its relation to Cartesian coordinate system
x = r sin ϑ cos ϕ,  y = r sin ϑ sin ϕ,  z = r cos ϑ (2.17)
Here r denotes the length of the vector r and the direction Ω ≡ (ϕ, ϑ) represents the azimuth–elevation pair. Hence r = √(x² + y² + z²), ϑ = tan⁻¹(√(x² + y²)/z) and
ϕ = tan⁻¹(y/x). Considering equations 2.15 and 2.17 we can express the wave equation in spherical coordinates as
(1/r²) ∂/∂r (r² ∂p/∂r) + (1/(r² sin ϑ)) ∂/∂ϑ (sin ϑ ∂p/∂ϑ) + (1/(r² sin² ϑ)) ∂²p/∂ϕ² − (1/c²) ∂²p/∂t² = 0 (2.18)
In this equation p is a function of (r, ϑ, ϕ, t). The zero right hand side reflects the assumption that there are no sources in the volume for which the equation is defined. The solutions of this wave equation in the frequency domain are explained in [9] and are given in two forms:
p(r, Ω, k) = Σ_{l=0}^{∞} Σ_{m=−l}^{l} (A_lm(k) · j_l(kr) + B_lm(k) · y_l(kr)) Y_l^m(Ω) (2.19)

p(r, Ω, k) = Σ_{l=0}^{∞} Σ_{m=−l}^{l} (C_lm(k) · h_l^{(1)}(kr) + D_lm(k) · h_l^{(2)}(kr)) Y_l^m(Ω) (2.20)
The two solutions represent the interior and the exterior problem: equation 2.20 refers to the exterior problem and equation 2.19 to the interior problem. We will elaborate on these two solutions and the coefficients A_lm(k), B_lm(k), C_lm(k) and D_lm(k) in later sections.
The level l and mode m are integers with l ≥ 0 and −l ≤ m ≤ l. The acoustic wave number, as defined earlier, is k = ω/c = 2πf/c, where f is the frequency of the sound wave and c is the speed of sound in the medium. The functions j_l(kr) and y_l(kr) are the spherical Bessel functions of the first and second kind respectively. Similarly, h_l^{(1)}(kr) and h_l^{(2)}(kr) are the spherical Hankel functions of the first and second kind. Y_l^m(Ω) is the spherical harmonic of level (order) l and mode m, defined as

Y_l^m(ϑ, ϕ) = √[ (2l + 1)/(4π) · (l − m)!/(l + m)! ] P_l^m(cos ϑ) e^{imϕ} (2.21)
These expressions, which are the outcome of the derivation of the solution of wave equation 2.15, are obtained by separation of variables in equation 2.18. The derivation and solutions are explained in detail in [9, page 186], [25, page 380] and [20, page 337]; for a more detailed analysis of the separation-of-variables approach used in solving the wave equation, please refer to [27]. In equation 2.21, P_l^m(cos ϑ) is the Legendre function of the first kind and i = √−1.
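Equation 2.21 can be evaluated directly with SciPy; this sketch is not part of the thesis, the direction is an arbitrary test value, and we assume the Condon–Shortley sign convention for P_l^m, which SciPy's `lpmv` uses. The result is compared against the well-known closed forms for l = 1.

```python
import numpy as np
from scipy.special import lpmv, factorial

def Y(l, m, theta, phi):
    """spherical harmonic Y_l^m(ϑ, ϕ) of equation 2.21 for m ≥ 0;
    lpmv supplies P_l^m(cos ϑ) including the Condon–Shortley phase"""
    norm = np.sqrt((2 * l + 1) / (4 * np.pi)
                   * factorial(l - m) / factorial(l + m))
    return norm * lpmv(m, l, np.cos(theta)) * np.exp(1j * m * phi)

theta, phi = 0.6, 2.1   # arbitrary direction (ϑ, ϕ)

# compare with the closed forms Y_1^0 = √(3/4π)·cosϑ and
# Y_1^1 = −√(3/8π)·sinϑ·e^{iϕ}
print(np.isclose(Y(1, 0, theta, phi),
                 np.sqrt(3 / (4 * np.pi)) * np.cos(theta)))          # True
print(np.isclose(Y(1, 1, theta, phi),
                 -np.sqrt(3 / (8 * np.pi)) * np.sin(theta) * np.exp(1j * phi)))  # True
```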
2.1.4 Spherical Bessel and Hankel functions
In [9] the solution of the spherical wave equation is given. We use the separation-of-variables approach [27] to solve the wave equation in the spherical coordinate system; in this process the spherical wave equation separates into four different differential equations. The solutions of these four constituent differential equations together give the solution of the wave equation in spherical coordinates, and they lead us to the spherical Bessel functions, the Hankel functions and the Legendre polynomials that appear in the spherical harmonics.
j_l(kr) and y_l(kr) are related to the corresponding Bessel functions as [28][9]:

j_l(x) ≡ √(π/(2x)) J_{l+1/2}(x),  y_l(x) ≡ √(π/(2x)) Y_{l+1/2}(x) (2.22)
The equations in 2.22 are valid for l ∈ R. The spherical Hankel functions of the first and second kind, h_l^{(1)}(x) and h_l^{(2)}(x), are defined as

h_l^{(1)}(x) ≡ j_l(x) + i · y_l(x),  h_l^{(2)}(x) ≡ j_l(x) − i · y_l(x) (2.23)
Here x is the argument, in our case kr. When x is real, h_l^{(1)}(x) is the conjugate of h_l^{(2)}(x); in our case kr is always real, as it is the product of the wave number and the radius (the distance from the origin). Since h_l^{(1)}(x) ∝ e^{ikr} and h_l^{(2)}(x) ∝ e^{−ikr} [9], the Hankel function of the first kind represents an outgoing wave whereas the second kind represents an incoming wave. These solutions are used depending
upon the location of the sources; in our case the sources lie outside the measurement sphere (refer to the explanation of the soap-bubble analogy in chapter 1), hence we are interested in the incoming wave for the analysis of our spherical microphone array.
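The half-integer-order relation of equation 2.22 and the conjugacy of the two Hankel functions for real arguments can be verified numerically; this Python/SciPy sketch is not part of the thesis, and the order and argument are illustrative.

```python
import numpy as np
from scipy.special import spherical_jn, spherical_yn, jv, yv

l, x = 2, 3.7   # illustrative order and (real) argument

jl = spherical_jn(l, x)
yl = spherical_yn(l, x)

# relation to the half-integer-order Bessel functions (equation 2.22)
print(np.isclose(jl, np.sqrt(np.pi / (2 * x)) * jv(l + 0.5, x)))   # True
print(np.isclose(yl, np.sqrt(np.pi / (2 * x)) * yv(l + 0.5, x)))   # True

# spherical Hankel functions (equation 2.23): conjugates for real x
h1 = jl + 1j * yl
h2 = jl - 1j * yl
print(np.isclose(h1, np.conj(h2)))                                 # True
```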
Figure 2.3: Spherical Bessel functions of the first kind j_l(x) (left) and the second kind y_l(x) (right) for orders l ∈ {0, 3, 6} [11]
Figure 2.3 shows the behaviour of these functions for different levels (orders) l as a function of the argument x. A few conclusions can be drawn. As seen from the plots, the spherical Bessel functions of the first kind are finite at the origin, but for higher orders l > 0 there is an initial region where the function remains close to zero (except for j0(x)); the functions of the second kind diverge towards −∞ near the origin. Consequently, and obviously from equation 2.23, the spherical Hankel functions are singular at x = 0. The other consequence, important in the later part of our analysis, is the small magnitude of j_l(x) for l > x, where for us x is kr, the product of wave number and radius, which we may regard as a measure of the frequency of the acoustic wave. The spherical wave solution therefore gives a kind of damped response in the low frequency region: when we use a high value of the level l, also referred to as the transform order, we lose low frequency information of the acoustic wave, and retrieving it requires amplifying the signal extensively. These conclusions will be recalled when we discuss the interior–exterior problem and the radial filter components (mode strength) for the rigid sphere in plane wave decomposition.
2.1.5 Legendre functions
Referring to expression 2.21, the term P_l^m(x) appeared in the solution of the wave equation in the spherical coordinate system. This term is called the Legendre function. The Legendre functions for the case m = 0 are known as Legendre polynomials, denoted by P_l(x), and are expressed by Rodrigues' formula as [9]:

P_l(x) = 1/(2^l l!) · d^l/dx^l (x² − 1)^l (2.24)
The functions P_l^m(x), which carry two indices, are known as associated Legendre functions, for m ≠ 0. For positive m

P_l^m(x) = (−1)^m (1 − x²)^{m/2} d^m/dx^m P_l(x) (2.25)

and for negative m

P_l^{−m}(x) = (−1)^m (l − m)!/(l + m)! P_l^m(x),  m > 0 (2.26)
The property of the Legendre functions that makes them attractive for us is that they form a set of orthogonal functions for each mode m. Hence the spherical harmonics are also a set of orthogonal functions. For further details the reader is referred to [9][25].
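Rodrigues' formula (equation 2.24) can be checked against a library implementation by differentiating the polynomial (x² − 1)^l symbolically; a small Python/SciPy sketch, not part of the thesis, with an arbitrary test point:

```python
import numpy as np
from scipy.special import eval_legendre, factorial

# Rodrigues' formula: P_l(x) = 1/(2^l l!) · d^l/dx^l (x² − 1)^l
l = 4
poly = np.polynomial.Polynomial([-1, 0, 1]) ** l   # (x² − 1)^l as a polynomial
for _ in range(l):
    poly = poly.deriv()                            # differentiate l times

x = 0.37                                           # arbitrary test point
P_rodrigues = poly(x) / (2 ** l * factorial(l))
print(np.isclose(P_rodrigues, eval_legendre(l, x)))   # True
```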
2.1.6 Spherical harmonics
Any function on a sphere can be represented by a combination of spherical harmonics [9]; in our case the solution of the acoustic wave equation is obtained in terms of the spherical harmonics Y_l^m(ϑ, ϕ), or Y_l^m(Ω), given in equation 2.21. The spherical harmonics define the angular components of the wave solution. Considering equation 2.26, the spherical harmonic for negative m can be obtained from the solution for positive m as

Y_l^{−m}(Ω) = (−1)^m [Y_l^m(Ω)]*,  m > 0 (2.27)
where [Y_l^m(Ω)]* is the complex conjugate of Y_l^m(Ω). There are 2l + 1 different spherical harmonics for each level l, as −l ≤ m ≤ l. A further property of the spherical harmonics is that they are not only orthogonal but orthonormal [9, page 191]:

∫_{S²} Y_l^m(Ω) [Y_{l′}^{m′}(Ω)]* dΩ = δ_{ll′} δ_{mm′} (2.28)

Here δ_{ll′} is the Kronecker delta, which is 1 for l = l′ and 0 otherwise. The surface integral is defined as

∫_{S²} dΩ = ∫_0^{2π} dϕ ∫_0^{π} sin ϑ dϑ (2.29)
As said above, any function on a sphere can be decomposed into a sum of spherical harmonics [9, page 192], [29, page 202]:

f(Ω) = Σ_{l=0}^{∞} Σ_{m=−l}^{l} f_lm(k) Y_l^m(Ω) (2.30)
This expression is also termed the inverse spherical Fourier transform (ISFT) [29]. As the spherical harmonic functions are orthonormal, we can obtain the spherical Fourier transform coefficients as

f_lm(k) = ∫_{S²} f(Ω) [Y_l^m(Ω)]* dΩ (2.31)

The derivation of this expression can be found in [29, page 202] and in [11], appendix (A.1). The importance of the expressions presented above is that with their help we obtain the spherical wave decomposition and, in turn, the plane wave decomposition.
The spherical harmonic functions are further depicted in figure 2.4 for levels l ∈ {0, 1, 2, 3}.
In the expression for the spherical harmonics in equation 2.21, the Legendre function P_l^m represents standing spherical waves in ϑ and the factor e^{imϕ} represents travelling spherical waves in ϕ [17].
Figure 2.4: Spherical harmonics Y_l^m(Ω) for orders l ∈ {0, 1, 2, 3} [11]
2.1.7 Radial Velocity
So far we have talked about the pressure field; now we shed some light on the radial velocity of the sound wave. As the radial velocity in the plane wave decomposition of spherical waves will represent our directivity function, an introduction to this term is important before we go further with the spherical and plane wave decomposition concepts.
Equation 2.3 is written in spherical coordinates as

∇ ≡ (∂/∂r) er + (1/r)(∂/∂ϑ) eϑ + (1/(r sin ϑ))(∂/∂ϕ) eϕ (2.32)
where e_(·) represents the unit vectors in spherical coordinates. Applying the Fourier transform to Euler's equation 2.1 gives

iρ0ck υ = ∇p(x, y, z, k) (2.33)

Equation 2.33 is in Cartesian coordinates. In spherical coordinates the velocity vector is

υ = u(r, Ω, k) eϑ + υ(r, Ω, k) eϕ + w(r, Ω, k) er (2.34)
Solving these equations we obtain the expression for the radial velocity component:

w(r, Ω, k) = (1/(iρ0ck)) · ∂p(r, Ω, k)/∂r (2.35)
2.2 Spherical harmonic decomposition
With the background described in the previous part, we now turn to the specific solutions. There were two solutions of wave equation 2.18, given by equations 2.19 and 2.20. In [30] the author explains the spherical harmonic decomposition for spherical microphone arrays and its limitations. Further, [31] and [17] present the theoretical analysis for plane wave decomposition (PWD) using spherical convolution and then explain the technique of spherical Fourier transforms used for PWD. In this section we derive expressions for the spherical harmonic decomposition and discuss various consequences encountered during this part of wave field analysis.
2.2.1 Interior and Exterior problem
The solution given by equation 2.20 describes the pressure field for the exterior problem [9]. Referring to figure 2.5, all sources are inside the spherical volume defined by radius a. As the solution is valid only in the source-free region, the pressure field is described for r ≥ a. Since all sources lie inside the sphere, and following our discussion of the Hankel functions, we only take into consideration the first term of equation 2.20: the sound waves are outgoing, and as there are no sources outside the region r = a there are no incoming waves, so the second part of the solution is not considered. Our solution is therefore

p(r, Ω, k) = Σ_{l=0}^{∞} Σ_{m=−l}^{l} C_lm(k) · h_l^{(1)}(kr) · Y_l^m(Ω) (2.36)
We now focus more rigorously on the interior problem, as this is more relevant for our work, and therefore all further explanations refer to the interior problem. In the interior problem analysis, the sound sources are located outside the spherical
Figure 2.5: Exterior problem [11]
volume, and estimating the acoustic effect on the surface of this volume is sufficient to characterize the sound in space. Going a bit further, we may say that in order to map this surface we use the spherical microphone array: the array is enclosed by an imaginary spherical volume, and at each observation point of the array we attempt to measure the acoustic effect invoked by the external sources. Figure 2.6 shows the case of the interior problem: all sources are outside the measurement sphere r = b, and the regions 1 and 2 represent sources outside the valid measurement region. The solution for the interior problem comes from equation 2.19. As the solution must be finite at all points within the measurement region r ≤ b, and as from our discussion of the Hankel and spherical Bessel functions in section 2.1.4 both the spherical Hankel functions and the spherical Bessel function of the second kind are not finite at the origin r = 0, our solution contains only the first term of equation 2.19 and is given as [9]:
p(r, Ω, k) = Σ_{l=0}^{∞} Σ_{m=−l}^{l} A_lm(k) · j_l(kr) · Y_l^m(Ω) (2.37)
where p(r, Ω, k) is the sound pressure at the point (r, Ω), k is the wave number, A_lm(k) is the coefficient of the spherical harmonic Y_l^m(Ω) of order l and mode m, and j_l(kr) is the spherical Bessel function of the first kind.
Figure 2.6: Interior problem [11]
Having defined the expression for the pressure field, we can also define an expression for the radial velocity w(r, Ω, k); using equation 2.37 in equation 2.35 we obtain

w(r, Ω, k) = (1/(icρ0)) Σ_{l=0}^{∞} Σ_{m=−l}^{l} A_lm(k) · j′_l(kr) · Y_l^m(Ω) (2.38)
where j′_l(kr) is the derivative of j_l(kr) with respect to kr, obtained via the chain rule:

∂j_l(kr)/∂r = ∂j_l(kr)/∂(kr) · ∂(kr)/∂r = j′_l(kr) · k (2.39)
As we are using a spherical microphone array, we can describe the pressure at any point on the surface of the array in the same fashion as presented in the interior problem. This will become clearer in later sections.
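The chain-rule step in equation 2.39 can be verified numerically by comparing a central difference in r with k · j′_l(kr); a Python/SciPy sketch, not part of the thesis, with illustrative values:

```python
import numpy as np
from scipy.special import spherical_jn

# check ∂/∂r j_l(kr) = k · j_l'(kr) (equation 2.39) by a central difference
l, k, r = 2, 40.0, 0.05    # illustrative order, wave number [rad/m], radius [m]
h = 1e-6                   # step for the finite difference in r

fd = (spherical_jn(l, k * (r + h)) - spherical_jn(l, k * (r - h))) / (2 * h)
exact = k * spherical_jn(l, k * r, derivative=True)
print(np.isclose(fd, exact, rtol=1e-5))   # True
```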
2.2.2 Spherical wave spectrum
Now, as defined in equation 2.37, if we can obtain the coefficients A_lm(k) then we can easily define the pressure field p(r, Ω, k). Exploiting the orthonormality of the spherical harmonics and the fact that any arbitrary function on a sphere can be
expanded in terms of its spherical harmonics [29], we follow the procedure described in appendix A.1; applying the same treatment to equation 2.37 we obtain

A_lm(k) = (1/j_l(kr)) ∫_{S²} p(r, Ω, k) [Y_l^m(Ω)]* dΩ (2.40)

The expression for A_lm(k) is also called the spherical wave spectrum, as it can be regarded as the spherical Fourier transform of p(r, Ω, k) [9], also written as

P_lm(r, k) = (1/j_l(kr)) ∫_{S²} p(r, Ω, k) [Y_l^m(Ω)]* dΩ (2.41)
P_lm(r, k) describes the sound wave in frequency in terms of the wave number, i.e. in k-space.
2.3 Spherical wave sound fields
Before we go to next sections lets bring out some analysis as how to express a spherical
wave at a point due to some given source. Refer to figure 2.7.
Figure 2.7: Geometrical description for the calculation of pressure p(r,ϑ,ϕ,k) at pointP for source at Q
We consider a point source, also termed a monopole, at the origin O. The pressure p(r, k) at point P is given by the expression [9, page 198]

p(r, k) = −i p0(k) c k Qs · e^{ikr}/(4πr) (2.42)

Here r is the length of the position vector r of point P, c is the speed of sound, and k is the wave number. Qs represents the source strength: the amount of fluid volume injected into the medium per unit time [9, page 198]. The sound radiation from a monopole is omnidirectional, hence independent of the angles ϑ and ϕ; p0(k) is the magnitude of the source at the origin.
If we now want to calculate the pressure field at point P due to a source located at point Q, this can be done by some geometrical manipulation of equation 2.42. Assume the same monopole to be located at Q with distance rs = |rs| from the origin; replacing the distance r in equation 2.42 by the distance |r − rs| between P and Q gives the pressure at P due to the source at Q. Therefore the pressure p(r, Ω, k) at point P for a source at Q is

p(r, Ω, k) = −i p0(k) c k Qs · e^{ik|r − rs|}/(4π|r − rs|) (2.43)

Here Ω ≡ (ϕ, ϑ). The significance of this equation is that we have derived an expression for the pressure field at a point on a sphere due to a source located at a position other than the origin. Drawing an analogy with the spherical microphone array, we may consider the array as a spherical surface and describe, at any point on that surface, the pressure field due to a source located at any position Q. Note also that since |r − rs| depends on ϕ and ϑ, the sound pressure in equation 2.43 also depends on ϕ and ϑ. Further, as derived in [9, page 198], the factor e^{ik|r − rs|}/(4π|r − rs|) is equivalent to the Green function G(r|rs).
2.4 Spherical harmonic expansion of plane wave
In section 2.1.2 an expression was given to calculate the pressure of an ideal plane wave in Cartesian coordinates; we now present a similar calculation in spherical coordinates.

p(r, Ω, k) = p0(k) · e^{i k·r} (2.44)

where p0(k) is the magnitude of the plane wave, r is the position vector (r, Ω), and k is the wave vector. Assuming p0(k) = 1 for the purpose of the derivation and using equation 2.44 in 2.37, we get
e^{i k·r} = Σ_{l=0}^{∞} Σ_{m=−l}^{l} A_lm(k) · j_l(kr) · Y_l^m(Ω) (2.45)
Here k and r are the wave vector and position vector respectively. We point out that the plane wave, described in the vector domain by the wave vector and position vector in equation 2.44, is here expressed in terms of the wave number k and the scalar distance r. A more detailed description is given in appendix A.2.
Equation 2.45 can be further transformed as explained in [9, page 227]:

e^{i k·r} = 4π Σ_{l=0}^{∞} i^l j_l(kr) Σ_{m=−l}^{l} Y_l^m(Ω) · [Y_l^m(Ω0)]* (2.46)
Here Ω0 ≡ (ϕ0, ϑ0) is the incidence direction of the plane wave, whereas Ω is the point where we observe the pressure field. From equations 2.45 and 2.46 we conclude that

A_lm = 4π i^l · [Y_l^m(Ω0)]* (2.47)

and we observe that the spherical wave coefficients A_lm of a plane wave sound field do not depend on k, i.e. on the frequency f of the wave.
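The expansion of equation 2.46 can be checked numerically. Summing over m with the Legendre addition theorem, Σ_m Y_l^m(Ω)·[Y_l^m(Ω0)]* = (2l + 1)/(4π) · P_l(cos γ), the expansion collapses to e^{i kr cos γ} = Σ_l i^l (2l + 1) j_l(kr) P_l(cos γ), with γ the angle between r and k. A Python/SciPy sketch (not from the thesis; kr, γ and the truncation level are illustrative):

```python
import numpy as np
from scipy.special import spherical_jn, eval_legendre

kr = 5.0        # illustrative product of wave number and radius
gamma = 0.9     # illustrative angle between k and r [rad]
exact = np.exp(1j * kr * np.cos(gamma))

L = 25          # truncation level, chosen well above kr
approx = sum(
    1j**l * (2 * l + 1) * spherical_jn(l, kr) * eval_legendre(l, np.cos(gamma))
    for l in range(L + 1)
)
print(abs(approx - exact))   # ≈ 0: the truncated series converges for L >> kr
```

Note how quickly the series converges once l exceeds kr, in line with the decay of j_l discussed in section 2.1.4.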
In [11, page 18] equation 2.46 has been simulated for a plane wave sound field of 1 kHz. The simulation was run for different maximum values of the level l, and it was deduced that the plane wave field can be approximated accurately only within a bounded region around the origin, and that this region grows with l. If in equation 2.46 we replace the ∞ in the first summation by a maximum level l = L, we can establish the approximate rule

d/λ = L/(2π) (2.48)

where d is the radius of the region, L is the maximum level and λ is the wavelength of the plane wave. This proportionality states that the region for which we can effectively define the pressure field is proportional to the level L. Reference plots are provided in appendix A.3.
2.5 Mode strength
We now define an expression for the combination of Bessel and Hankel functions that appeared in earlier sections during the derivation of the coefficients A_lm of the spherical harmonics. In the measurement of sound fields with spherical microphone arrays, the interaction of the sound field with the array structure has to be taken into consideration [9][17][31]. We recall equations 2.37 and 2.40 and express them in a generalized form so as to associate them with different kinds of spherical microphone array structures. The equations are written as
s(r, Ω, k) = Σ_{l=0}^{∞} Σ_{m=−l}^{l} A_lm(k) · b_l(kr) · Y_l^m(Ω) (2.49)

A_lm(k) = (1/b_l(kr)) ∫_{S²} s(r, Ω, k) [Y_l^m(Ω)]* dΩ (2.50)
Here s(r, Ω, k) is the spherical microphone array response. The term b_l(kr) is called the mode strength; for different microphone array structures the interaction of the sound field with the array is approximated using this term [9][32]. In general we define two types of spherical array structures:
• Open sphere configuration
• Rigid sphere configuration
In the open sphere configuration a single microphone is mounted on a robotic arm and measurements are made at predefined microphone positions on the sphere. In the rigid sphere configuration the sensors are arranged on a solid sphere. Images of the open sphere and rigid sphere configurations are given in appendix A.4.
b_l(kr) = 4π i^l j_l(kr)  (open sphere arrays)

b_l(kr) = 4π i^l [ j_l(kr) − (j′_l(ka) / h^{(2)}′_l(ka)) · h^{(2)}_l(kr) ]  (rigid sphere arrays) (2.51)

Here j_l(kr) is the spherical Bessel function of the first kind, h^{(2)}_l(kr) and h^{(2)}′_l(ka) are the spherical Hankel function of the second kind and its derivative, (·)′ denotes the derivative with respect to the argument, and a is the radius of the sphere, with r ≥ a.
The rigid sphere configuration is preferable to the open sphere configuration [31][17][32]. Its major disadvantage is that it interferes with the surrounding sound field. The mode strength does account for the scattering caused by the rigid sphere when calculating the incident waves. The scattering effect is negligible for small spheres but becomes more prominent when a larger sphere is used. Hence, with a larger sphere, measurements should be made more carefully, as the scattered waves can act as additional incident waves when they are reflected by other objects in the measurement environment and impinge on the sphere again [31].
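Both cases of equation 2.51 are easy to evaluate; the following Python/SciPy sketch (not part of the thesis) computes the order-0 mode strength at kr = π, where the open-sphere value vanishes because j0(π) = 0, whereas the rigid-sphere value stays clearly non-zero. This is the numerical-conditioning point made below.

```python
import numpy as np
from scipy.special import spherical_jn, spherical_yn

def b_open(l, kr):
    """open-sphere mode strength (equation 2.51)"""
    return 4 * np.pi * 1j**l * spherical_jn(l, kr)

def b_rigid(l, kr, ka):
    """rigid-sphere mode strength (equation 2.51), sphere radius a"""
    h2 = lambda n, x: spherical_jn(n, x) - 1j * spherical_yn(n, x)
    dh2 = lambda n, x: (spherical_jn(n, x, derivative=True)
                        - 1j * spherical_yn(n, x, derivative=True))
    return 4 * np.pi * 1j**l * (spherical_jn(l, kr)
                                - spherical_jn(l, ka, derivative=True)
                                / dh2(l, ka) * h2(l, kr))

kr = np.pi   # j0(π) = sin(π)/π = 0, a zero of the open-sphere mode strength
print(abs(b_open(0, kr)))        # ≈ 0, so 1/b_l is ill-conditioned here
print(abs(b_rigid(0, kr, kr)))   # ≈ 3.8, clearly non-zero
```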
In figure 2.8 the mode strength b_l(kr) is plotted as a function of kr for different orders l; in the figure the order l is denoted by the letter n.
The major advantage of the rigid sphere configuration is improved numerical conditioning: in equation 2.50 the spherical coefficient A_lm contains the factor 1/b_l,
(a) Rigid sphere array, (b) Open sphere array
Figure 2.8: Mode strength for the rigid sphere array and the open sphere array [31]
and b_l is zero for some values of kr in the open sphere configuration but not in the case of rigid spheres [17][31][33].
2.6 Discretization of the spherical aperture and spatial aliasing
The analysis presented so far described the sound field on a continuous spherical aperture, but in practice we can sample a sphere only at a finite number of microphone positions. Hence we need to translate the expression for the spherical coefficients A_lm(k), defined by an integral over the unit sphere in equation 2.50, into a finite summation. The approximation of integrals by finite sums is known as quadrature, and the expression for A_lm(k) as a finite summation is given as [8, page 43]:

A_lm(k) ≈ Â_lm(k) = (1/b_l(kr)) Σ_{q=1}^{Q} w_q · s(r, Ω_q, k) · [Y_l^m(Ω_q)]* (2.52)

where Â_lm(k) is the approximated spherical coefficient, Q is the number of microphone positions Ω_q, and w_q are the quadrature weights. The weights w_q compensate for the different quadrature schemes so that the sound field is approximated as closely as possible to the continuous aperture.
Spherical microphone arrays perform spatial sampling of the sound pressure on a sphere, and, similarly to time-domain sampling, spatial sampling also requires band limitation, i.e. a limited harmonic order l, to avoid aliasing [31][34]. Hence, in order to avoid spatial aliasing, the following must hold [8, page 44]:

A_lm(k) = 0 for l > Lmax (2.53)

Here Lmax is the highest-order spherical coefficient of the sound field. The condition in 2.53 must be ensured when sampling the sphere, otherwise spatial aliasing will
corrupt the coefficients at lower orders. A more detailed analysis of spatial aliasing in spherical microphone arrays is presented in [34]. The sampling of level-limited (the words level and order are used interchangeably and refer to l) sound fields can be done in many different ways, as explained in [35][31][8]. These quadrature schemes allow us to sample the sphere with negligible or no aliasing as long as equation 2.53 holds. There are commonly three sampling schemes; a more detailed mathematical description can be found in the references provided above.
1. Chebyshev quadrature: the sampling is uniform in elevation ϑ and azimuth ϕ. The total number of microphones in this scheme is Qch = 2Lmax · (2Lmax + 1).

2. Gauss–Legendre quadrature: the sphere is sampled uniformly in azimuth ϕ, but in elevation it is sampled at the zeros of the Legendre polynomial of level Lmax + 1. The number of microphone positions required in this scheme is QGL = Lmax · (2Lmax + 1).

3. Lebedev grid: the microphone positions are spread uniformly over the surface of the sphere such that each point has the same distance to its nearest neighbours, with

QLb = (4/3)(Lmax + 1)² (2.54)
In this work we use the Lebedev grid, as it has an advantage over the other two schemes: it uses a smaller number of microphone positions for the approximation. A more detailed description of the Lebedev grid is given in [36][37][38][39]; reference [39] gives Fortran code for calculating the grid points and weights for levels up to l = 131.
Using the quadrature approach for the discretization of the sphere, we require a level-limited sound field in order to obtain aliasing-free sampling. For plane wave sound fields, however, the restriction to a maximum level Lmax does not hold, as we can see from equations 2.45 and 2.46, which involve an infinite number of non-zero spherical coefficients A_lm(k); hence some degree of spatial aliasing does occur. But referring to section 2.1.4, we know that the spherical Bessel functions j_l(kr) decay rapidly for l > kr,
Figure 2.9: Different quadrature schemes [8]
therefore the strength of the coefficients in equation 2.45 can be assumed to show a similar behaviour for l > kr. Hence, in theory, the aliasing error is negligible if the operating frequency of the microphone array satisfies kr ≤ Lmax.
2.7 Plane wave decomposition
w(Ω0, k) and are arriving from all the directions Ω0. Integrating equation 2.56 over all incident directions, we obtain the expression for the spherical Fourier coefficients flm(k):

flm(k) = 4π i^l bl(kr) ∫_S² w(Ω0, k) · Y*lm(Ω0) dΩ0    (2.57)
The integral in equation 2.57 is the spherical Fourier transform of the amplitudes w(Ω0, k), which we denote wlm(k); hence

wlm(k) = flm(k) · 1/(4π i^l bl(kr))    (2.58)
To obtain the amplitude ws(Ωs, k) of a plane wave arriving from an arbitrary direction Ωs, we perform an inverse SFT of equation 2.58:

ws(Ωs, k) = Σ_{l=0}^{∞} Σ_{m=−l}^{l} flm(k) · 1/(4π i^l bl(kr)) · Ylm(Ωs)    (2.59)
ws(Ωs, k) is also called the directivity function and describes the decomposed plane wave for a particular direction Ωs. Ωs is known as the steering direction of the microphone array and indicates the direction for which the plane wave decomposition is computed.
Further, if we use equation 2.55 in equation 2.59, we obtain the expression for the plane wave decomposition in terms of the spherical harmonic coefficients Alm(k):

ws(Ωs, k) = Σ_{l=0}^{∞} Σ_{m=−l}^{l} 1/(4π i^l) · Alm(k) · Ylm(Ωs)    (2.60)
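The decomposition in equation 2.60 can be sketched numerically. The following snippet is an illustration, not code from this thesis: it assumes SciPy's `sph_harm` convention (arguments `m, l, azimuth, polar angle`) and the plane wave coefficients Alm(k) = 4π i^l Y*lm(Ω0) for a unit-amplitude plane wave from Ω0, and evaluates the sum truncated at Lmax.

```python
import numpy as np
from scipy.special import sph_harm  # sph_harm(m, l, azimuth, polar_angle)

def pwd_directivity(a_lm, l_max, theta_s, phi_s):
    """Evaluate equation 2.60 truncated at level l_max.
    a_lm maps (l, m) to the spherical harmonic coefficient Alm(k)."""
    w = 0.0 + 0.0j
    for l in range(l_max + 1):
        for m in range(-l, l + 1):
            w += a_lm[(l, m)] / (4 * np.pi * 1j ** l) * sph_harm(m, l, phi_s, theta_s)
    return w

# Coefficients of a unit plane wave from Omega0 = (theta0, phi0):
# Alm(k) = 4*pi * i**l * conj(Ylm(Omega0)).
theta0, phi0, l_max = np.pi / 3, np.pi / 4, 8
a_lm = {(l, m): 4 * np.pi * 1j ** l * np.conj(sph_harm(m, l, phi0, theta0))
        for l in range(l_max + 1) for m in range(-l, l + 1)}

# The decomposed amplitude peaks when the steering direction hits Omega0;
# on-axis the truncated sum equals (l_max + 1)**2 / (4*pi).
on_axis = abs(pwd_directivity(a_lm, l_max, theta0, phi0))
off_axis = abs(pwd_directivity(a_lm, l_max, theta0 + 1.0, phi0))
print(on_axis > off_axis)  # True: steering at the source direction dominates
```

The finite main-lobe width seen when steering away from Ω0 is exactly the level-limited resolution discussed in the next section.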
2.8 Spatial resolution in plane wave decomposition
In [17] [32] the spatial resolution of the plane wave decomposition with respect to the level l has been analysed. We cannot use levels higher than kr, because this results in negligible amplitudes of the spherical harmonic coefficients in the lower frequency regions. Hence our plane wave decomposition remains level-limited to a finite extent.
It has been shown in [17] that the directivity decreases for lower values of the level l. This directivity pattern has been quantified in [17] and [11, page 39] by the expression

ws(Θ) = (L + 1) / (4π (cos Θ − 1)) · (P_{L+1}(cos Θ) − P_L(cos Θ))    (2.61)

Here Θ is the angle between the arrival direction of the plane wave Ω0 and the steering direction of the microphone array Ωs, and P_L(·) is the Legendre polynomial of level L. ws(Θ) is the directional weight and defines the spatial resolution of a plane wave decomposition calculated with a maximum level L. Refer to figure 2.10.
Figure 2.10: Directivity weights for PWD verses l [17]
In this figure the directivity weights ws(Θ) are plotted for different levels l. It can be noticed that for Θ = 0, i.e., when the array looks towards the arrival direction of the plane wave, the directivity coefficient (main lobe) shows a very sharp peak for higher levels, and it broadens as the level is decreased.
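The directional weight of equation 2.61 can be evaluated directly, e.g. with the Legendre polynomials from SciPy. This is an illustrative sketch, not part of the thesis; near Θ = 0 the weight approaches the on-axis value (L + 1)²/(4π), reproducing the sharpening of the main lobe with increasing level.

```python
import numpy as np
from scipy.special import eval_legendre

def directivity_weight(theta, l_max):
    """Directional weight ws(Theta) of equation 2.61 for maximum level L = l_max."""
    x = np.cos(theta)
    num = eval_legendre(l_max + 1, x) - eval_legendre(l_max, x)
    return (l_max + 1) * num / (4 * np.pi * (x - 1.0))

# Evaluated slightly off-axis to avoid the 0/0 at exactly Theta = 0; the
# value approaches (L + 1)**2 / (4*pi), so the main lobe grows with L.
for L in (2, 4, 8):
    print(L, directivity_weight(1e-3, L), (L + 1) ** 2 / (4 * np.pi))
```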
The spatial resolution is further defined by a relation between the level l = L and the first (smallest) zero Θ0 of ws(Θ) for Θ > 0, where Θ0 is defined as half of the resolution of the PWD. The relation Θ0 = 180°/L, derived in [17], tells us the extent to which a plane wave decomposition with a particular level L can spatially decompose a wave field into different plane waves. Figure 2.11 is approximated by this relation Θ0 = 180°/L.
Figure 2.11: Half resolution of the PWD [17]
3 ERROR ANALYSIS 39
Figure 3.1: Errors in Spherical Microphone array measurement
3.2 Description of measurement error function
In this section we follow the framework given in [31] and describe the measurement errors mathematically, together with their contribution to the spherical harmonic coefficients.
For the analytical description we assume that an arbitrary sound field is captured by a rigid-sphere microphone array. The frequency-domain output of a single microphone element, which is considered to be subject to all the errors depicted in figure 3.1, is

s(r, Ω̂q, k) + eq    (3.1)

where k is the wave number, r is the radius of the sphere, eq is the noise introduced by the microphones and Ω̂q is the microphone position including the positioning errors. The spherical harmonic coefficients Alm(k) can be calculated using equation 2.52, which is explained in section 2.6. Keeping these equations in mind we obtain
Âlm(k) = 1/bl(kr) · ( Σ_{q=1}^{Q} wq · s(r, Ω̂q, k) · Y*lm(Ωq) + Σ_{q=1}^{Q} wq · eq · Y*lm(Ωq) )    (3.2)
In this equation Q is the number of microphones, wq are the quadrature weights and bl(kr) is the mode strength for the rigid-sphere configuration (refer to section 2.5). The correct microphone positions as defined by the sampling scheme are denoted by Ωq. We now express the sound field s(r, Ω̂q, k) in terms of the correct spherical harmonic coefficients Al'm'(k), using equation 2.49 in section 2.5, and substitute it into equation 3.2:
Âlm(k) = 1/bl(kr) Σ_{l'=0}^{∞} Σ_{m'=−l'}^{l'} Al'm'(k) · bl'(kr) × [ Σ_{q=1}^{Q} wq · Yl'm'(Ω̂q) · Y*lm(Ωq) ]_X + 1/bl(kr) Σ_{q=1}^{Q} wq · eq · Y*lm(Ωq)    (3.3)
The term X is equivalent to the orthonormality condition of the spherical harmonics given in section 2.1.6. In [31] this term has been extended to identify the contributions of the aliasing error εa and the positioning error εΩ, and it is expressed as

Σ_{q=1}^{Q} wq · Yl'm'(Ω̂q) · Y*lm(Ωq) =
  δll' · δmm' + εΩ(l, m, l', m'),        for l, l' ≤ Lmax
  εa(l, m, l', m') + εΩ(l, m, l', m'),   for l ≤ Lmax < l'    (3.4)
Here δll' and δmm' are Kronecker deltas. The maximum level Lmax is the highest level of the spherical harmonic coefficients Alm(k) inside the sound field that is sampled using Q microphone positions; the relation between Lmax and Q for the Lebedev grid is given in section 2.6, equation 2.54. In the first part of equation 3.4 the levels satisfy l, l' ≤ Lmax, hence no aliasing error appears in that expression. Also, from the Kronecker deltas we see that εΩ = 0 requires Ω̂q and Ωq to be equal; hence εΩ represents the positioning error. In the lower part of equation 3.4 we consider l' > Lmax, hence spatial aliasing is present. Since l and l' differ, the term δll' · δmm' does not appear in this part.
The aliasing error εa is given as [31]

εa(l, m, l', m') = Σ_{q=1}^{Q} wq · Yl'm'(Ωq) · Y*lm(Ωq),   where l ≤ Lmax < l'    (3.5)
The positioning error is obtained by subtracting the corresponding error-free quadrature term from equation 3.4 [31]:

εΩ(l, m, l', m') = Σ_{q=1}^{Q} wq · (Yl'm'(Ω̂q) − Yl'm'(Ωq)) · Y*lm(Ωq),   where l ≤ Lmax, l' ≥ 0    (3.6)
Finally, if we use equation 3.4 in equation 3.3 and separate the summation over l', we get the expression for the spherical harmonic coefficients with all the error contributions [31]:

Âlm(k) = 1/bl(kr) Σ_{l'=0}^{∞} Σ_{m'=−l'}^{l'} Al'm'(k) · bl'(kr) · δll' · δmm'    [A(s)lm(k): signal contribution]

  + 1/bl(kr) Σ_{l'=0}^{∞} Σ_{m'=−l'}^{l'} Al'm'(k) · bl'(kr) · εΩ(l, m, l', m')    [A(Ω)lm(k): positioning error]

  + 1/bl(kr) Σ_{l'=0}^{∞} Σ_{m'=−l'}^{l'} Al'm'(k) · bl'(kr) · εa(l, m, l', m')    [A(a)lm(k): aliasing error]

  + 1/bl(kr) Σ_{q=1}^{Q} wq · eq · Y*lm(Ωq)    [A(e)lm(k): microphone noise]    (3.7)
In equation 3.7 the first term refers to the error-free contribution to the spherical harmonic coefficients Âlm(k); since the Kronecker deltas reduce the double sum to the single term l' = l, m' = m, this first term simplifies to Alm(k). All the other terms represent the errors. From the equation itself we see that the errors depend on the level l, on kr and on the quadrature. Although we are using the rigid
sphere configuration, the mode strength bl(kr) has a different expression for each microphone configuration, and hence the errors also depend on the array configuration.
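The quadrature-based coefficient estimate of equation 3.2 can be simulated as an illustrative sketch (not the measurement chain of this thesis): for a noiseless, level-limited field the recovery is exact. The open-sphere mode strength bl(kr) = jl(kr) is assumed here for simplicity, whereas the rigid-sphere bl(kr) used in this work carries an additional Hankel-function term; the Gauss-Legendre grid stands in for the Lebedev grid because its nodes and weights are available in NumPy.

```python
import numpy as np
from numpy.polynomial.legendre import leggauss
from scipy.special import sph_harm, spherical_jn

def b_open(l, kr):
    # Open-sphere mode strength bl(kr) = jl(kr) (assumption for this sketch).
    return spherical_jn(l, kr)

def estimate_alm(p, theta, phi, w, l_max, kr):
    """Quadrature-based coefficient estimate (signal term of equation 3.2)."""
    return {(l, m): np.sum(w * p * np.conj(sph_harm(m, l, phi, theta))) / b_open(l, kr)
            for l in range(l_max + 1) for m in range(-l, l + 1)}

# Gauss-Legendre sampling: Lmax + 1 elevations, 2(Lmax + 1) uniform azimuths.
l_max, kr = 4, 2.0
nodes, wx = leggauss(l_max + 1)
n_phi = 2 * (l_max + 1)
theta, phi = np.meshgrid(np.arccos(nodes),
                         2 * np.pi * np.arange(n_phi) / n_phi, indexing="ij")
w = (np.broadcast_to(wx[:, None], theta.shape) * 2 * np.pi / n_phi).ravel()
theta, phi = theta.ravel(), phi.ravel()

# Level-limited field of a unit plane wave from Omega0 = (pi/2, 0):
# Alm(k) = 4*pi * i**l * conj(Ylm(Omega0)), truncated at Lmax so that the
# quadrature is exact and no aliasing from l' > Lmax contaminates the result.
theta0, phi0 = np.pi / 2, 0.0
a_ref = {(l, m): 4 * np.pi * 1j ** l * np.conj(sph_harm(m, l, phi0, theta0))
         for l in range(l_max + 1) for m in range(-l, l + 1)}
p = sum(a_ref[lm] * b_open(lm[0], kr) * sph_harm(lm[1], lm[0], phi, theta)
        for lm in a_ref)

a_est = estimate_alm(p, theta, phi, w, l_max, kr)
err = max(abs(a_est[lm] - a_ref[lm]) for lm in a_ref)
print(err)  # effectively zero: exact recovery for a level-limited field
```

Adding a noise vector to `p`, perturbing `theta`/`phi`, or synthesizing levels above Lmax would reproduce the noise, positioning and aliasing terms of equation 3.7, respectively.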
Finally, by substituting equation 3.7 into equation 2.60 we obtain the expression for the directivity function of the plane wave decomposition. Each term A(·)lm(k) in equation 3.7 yields the contribution of that particular error to the directional weights ws:
w(·)s(Ωs, k) = Σ_{l=0}^{∞} Σ_{m=−l}^{l} 1/(4π i^l) · A(·)lm(k) · Ylm(Ωs)    (3.8)
where Ωs is the steering direction of the spherical microphone array and A(·)lm(k) can be any of the four components in equation 3.7: A(s)lm(k), A(Ω)lm(k), A(a)lm(k) or A(e)lm(k). In order to assess the effective influence of the measurement errors on the results of the plane wave decomposition, we relate each error contribution in equation 3.7 to the corresponding signal contribution, i.e., we consider the relative error obtained as the ratio of the squared absolute values of the error and signal contributions [31]:
Ea(kr) = |w(a)s(Ωs, k)|² / |w(s)s(Ωs, k)|²

EΩ(kr) = |w(Ω)s(Ωs, k)|² / |w(s)s(Ωs, k)|²

Ee(kr) = |w(e)s(Ωs, k)|² / |w(s)s(Ωs, k)|²    (3.9)
In equation 3.9 noise-to-signal ratios are calculated. Figure 3.2 shows the behaviour of the different errors (noise, positioning and aliasing) for different levels l.
On comparing the various quadratures with respect to spatial aliasing, microphone noise and positioning error, the Lebedev quadrature is found to have the best overall robustness against these errors. Due to these characteristics we use the Lebedev grid together with the rigid-sphere configuration [31].
Figure 3.2: Errors in Spherical Microphone array measurement [31]
3.3 Microphone noise
Microphone noise is an important source of introducing corruptive artifact in auraliza-
tion process. Although the contemporary microphone technology provides a very high
signal to noise (SNR) ratio but in general we can not disregard the noise induced by
the microphones.
It is important to note that the mode strength refer 2.5, have quite low values for
smaller values of kr at higher levels l and hence, this amplifies the spherical harmonic
coefficients (refer equation 2.52) considerably therefore in situations where noise is
present, it also gets amplified 3.2. The increase in noise is more vigorous in the low krrange than in the higher kr
Microphone noise also depends on the number of microphones used: simulations in [31] show that the higher the number of microphones, the better the robustness against noise. It is also seen that the influence of noise is lowest when the maximum level satisfies l ≈ kr. For higher kr the mode strengths somewhat converge towards 0 dB, and hence, theoretically, the increase in error for higher kr should not be too significant. The quadratures used for discretization of the sphere do not have any
significant effect with regard to microphone noise; they all behave in a similar way. But since the noise effect is greater at low kr, it limits the array performance at lower frequencies.
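The low-kr noise amplification described above can be illustrated with the inverse mode strength 1/|bl(kr)|. This sketch assumes the open-sphere bl(kr) = jl(kr), which shows the same qualitative behaviour as the rigid-sphere mode strength used in this work.

```python
import numpy as np
from scipy.special import spherical_jn

def noise_gain_db(l, kr):
    # Amplification 1/|bl(kr)| in dB, with the open-sphere bl(kr) = jl(kr)
    # assumed here for simplicity.
    return -20.0 * np.log10(np.abs(spherical_jn(l, kr)))

# At low kr the high levels are amplified enormously, so microphone noise
# dominates there and limits the array at low frequencies.
for kr in (0.5, 2.0, 8.0):
    row = ", ".join(f"l={l}: {noise_gain_db(l, kr):.0f} dB" for l in range(0, 9, 2))
    print(f"kr={kr}: {row}")
```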
3.4 Spatial aliasing
The problem of spatial aliasing is quite complex in spherical microphone arrays. As a continuous aperture is not practically feasible, we discretize the sphere using quadratures, which gives us a relation between the number of microphones and the maximum level l. This discretization of the sphere, however, leads to the spatial aliasing problem. In [31] [34] [40] the sampling techniques for spherical microphone arrays and their effect on the plane wave decomposition are analysed. Aliasing-free techniques for level-limited functions and solutions such as spatial anti-aliasing filters for aliasing reduction are proposed in [34].
Referring to figure 2.8(a), because of the nature of bl(kr) the magnitude of the spherical harmonic coefficients of the sound pressure becomes increasingly insignificant for l > kr, where r is the radius of the sphere. The aliasing error is therefore expected to be almost negligible if the operating frequency range of the array satisfies the condition kr ≤ Lmax.
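As a rough numerical illustration, with k = 2πf/c the condition kr ≤ Lmax translates into an upper aliasing-free operating frequency fmax = Lmax · c/(2πr); the radius and level below are illustrative values, not the array parameters of this thesis.

```python
import math

def f_max(l_max, radius, c=343.0):
    # kr <= Lmax with k = 2*pi*f/c gives f <= Lmax * c / (2*pi*r).
    return l_max * c / (2 * math.pi * radius)

# Illustrative: Lmax = 4 and a 5 cm sphere radius.
print(round(f_max(4, 0.05)))  # about 4.4 kHz
```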
various sampling techniques, our work is based on the Lebedev grid, which is given in