Wave-Based Sound Propagation for VR Applications

Ravish Mehra, University of North Carolina at Chapel Hill

Dinesh Manocha, University of North Carolina at Chapel Hill

Figure 1: The equivalent source technique accurately models realistic acoustic effects, such as diffraction, scattering, focusing, and echoes, in large, open scenes. It reduces the runtime memory usage by orders of magnitude compared to state-of-the-art wave solvers, enabling real-time, wave-based sound propagation in scenes spanning hundreds of meters: a) reservoir scene (Half-Life 2), b) Christmas scene, and c) desert scene.

Abstract

Realistic sound effects are extremely important in VR to improve the sense of realism and immersion. Sound augments the visual sense of the user, can help reduce simulation fatigue, provides 3D spatial cues outside the field of view, and helps create high-fidelity VR training simulations. Current sound propagation techniques are based on heuristic approaches or simple line-of-sight-based geometric techniques. These techniques cannot capture important sound effects such as diffraction, interference, and focusing. For VR applications, there is a need for high-fidelity, accurate sound propagation. To model sound propagation accurately, it is important to develop interactive wave-based propagation techniques. We present a set of efficient approaches to model wave-based sound propagation for VR applications that can handle large scenes and directional sound sources, and generate spatial sound for a moving listener. Our technique has been integrated into Valve's Source game engine, and we use it to demonstrate realistic acoustic effects such as diffraction, high-order reflection, interference, directivity, and spatial sound in complex scenarios.

1 Introduction

Sound is ubiquitous in the physical world and forms the basic medium of human communication (language) and of human artistic expression (music). It is a pressure wave, consisting of frequencies within the range of human hearing, produced by the vibration of a surface and transmitted through a medium. Sound propagation predicts the behavior of sound waves as they are emitted by the source, interact with the environment, and reach the listener. Mathematically, the process of sound propagation can be expressed as a second-order partial differential equation, called the acoustic wave equation in the time domain and the Helmholtz equation in the frequency domain. Over the years, sound propagation has emerged as a powerful tool to enhance the realism, sense of presence, and immersion of virtual environments. It augments the visual sense of the user, increasing situational awareness and localization abilities [Blauert 1983]. Studies in audio-visual cross-modal perception have shown that high-quality sound rendering can increase the perceived quality of visual rendering [Storms 2000].

Existing sound propagation techniques can be broadly classified into three categories: heuristic, geometric, and wave-based techniques. Typical heuristic techniques include parametric reverb filters designed by artists to describe the overall acoustics of a space, such as loudness and reverberation. These filters are manually designed during the preprocessing stage and then applied at runtime. Geometric techniques assume rectilinear propagation of sound waves (ray-like behavior), trace rays or beams from each source, and accumulate contributions at the listener position. This assumption is valid for high frequencies of sound, and therefore it is hard to model wave effects such as diffraction and interference using geometric techniques. Wave-based techniques numerically solve the acoustic wave equation and can accurately perform sound propagation for all frequencies. However, these techniques are computationally expensive and have high runtime memory requirements. In recent years, sound propagation techniques have been employed for interactive applications in a diverse set of fields, including virtual reality, computer gaming, and acoustics.

VR simulations have wide applications in training and emergency operations. These simulations are used to treat post-traumatic stress disorder (PTSD) in war veterans [Rizzo 2007] and to train medical personnel for emergency scenarios. Realistic sound effects can be extremely useful for VR simulations to improve the sense of realism and immersion of the virtual environment [Begault 1994]. Sound augments the visual sense of the user, increasing situational awareness, and can help reduce simulation fatigue. Current sound propagation techniques used in most VR systems are based on either heuristics or simple line-of-sight-based geometric techniques [12]. These techniques cannot capture important acoustic effects such as diffraction, interference, scattering, and sound focusing. Therefore, to accurately model realistic acoustic effects in general scenarios, it is important to develop interactive wave-based propagation techniques that can directly solve the wave equation.

The state-of-the-art in wave-based sound propagation is mostly limited to offline techniques. These include standard numerical solvers such as the finite-difference time-domain (FDTD) method, the boundary element method (BEM), and the finite element method (FEM). Due to their computational and memory requirements, these techniques are limited to small acoustic spaces. Recently, there has been some work on developing interactive wave-based techniques for free-space sound radiation [13], first-order scattering from surfaces [34], and sound propagation for small-to-medium-sized scenes [28, 26]. However, performing wave-based sound propagation for VR applications still faces many challenges:

• Large environments: Typical scenarios in VR simulation, acoustics, and gaming consist of large scenes. Current interactive wave-based propagation techniques are computationally expensive and limited to small spaces.

• Directional sources: Most sound sources we encounter in the real world are directional, e.g., human voices, speakers, and musical instruments. But current techniques can only model directivity during the precomputation stage. As a result, the directivity gets baked into the final solution, and it is not possible to modify the directivity pattern at runtime, e.g., for a rotating siren or a person covering his/her mouth.

• Spatial sound: Accurate computation of spatial sound corresponding to the listener's motion and head rotation is a challenging task. Prior wave-based techniques for computing spatial sound are computationally expensive and used for offline applications. Interactive techniques resort to coarse approximations based on simplified models, which are not accurate for sound localization and externalization, both of which are necessary for immersion in virtual environments [Begault].

In this work, we develop a set of efficient techniques for performing wave-based sound propagation in both the frequency and time domains.

• Equivalent Source Method: This is a novel approach for wave-based sound propagation in the frequency domain that can handle large environments spanning hundreds of meters with a small memory footprint. This technique can simulate complex acoustic effects such as diffraction, low-pass filtering behind obstacles, focusing, scattering, and high-order reflections.

• Directional Sources and Spatial Sound: Based on spherical harmonics, we propose a generic framework to model directional sources and compute spatial sound for any wave-based sound propagation technique in the frequency domain. The source directivity can be an analytical, data-driven, rotating, or time-varying function. The spatial sound framework supports the listener's motion and head rotation, and allows the use of personalized HRTFs.

• Parallel Adaptive Rectangular Decomposition: This is an efficient technique to compute the time-domain solution of the acoustic wave equation. By carefully mapping all the components of the adaptive rectangular decomposition (ARD) algorithm onto graphics processors, a significant improvement in performance is gained.

2 Related Work

2.1 Acoustic wave equation

Sound propagation algorithms predict the behavior of sound waves as they interact with the environment. The physics of sound propagation can be described by the well-known time-domain formulation of the acoustic wave equation:

$$ \frac{\partial^2 p}{\partial t^2} - c^2 \nabla^2 p = f(x, t). \qquad (1) $$

The wave equation models sound waves as a time-varying pressure field, p(x, t). While the speed of sound in air (denoted c) exhibits fluctuations due to variations in temperature and humidity, we ignore the acoustic effects of such fluctuations, i.e., we assume a homogeneous medium. Sound sources in the scene are modeled by the forcing field, denoted f(x, t), on the right-hand side of Equation 1. The operator ∇² = ∂²/∂x² + ∂²/∂y² + ∂²/∂z² is the Laplacian in 3D. The wave equation succinctly captures wave phenomena such as interference and diffraction that are observed in reality.
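To make the numerical solution of Equation 1 concrete, the following is a minimal finite-difference time-domain (FDTD) sketch in Python: a leapfrog update of the pressure field on a uniform grid. The grid resolution, source signal, and implicit rigid boundaries are illustrative assumptions for exposition, not the solver used in this work.

```python
import numpy as np

# Minimal 3D FDTD sketch for Eq. (1): d^2p/dt^2 - c^2 Lap(p) = f(x, t).
# Illustrative parameters only; real solvers add boundary treatment (e.g., PML).
c = 343.0                       # speed of sound in air (m/s)
dx = 0.05                       # grid spacing (m)
dt = dx / (c * np.sqrt(3.0))    # CFL-stable time step for 3D
n = 64                          # grid points per axis

p_prev = np.zeros((n, n, n))    # pressure at time t - dt
p_curr = np.zeros((n, n, n))    # pressure at time t

def laplacian(p):
    """Second-order 7-point Laplacian; grid edges left untouched (rigid walls)."""
    lap = np.zeros_like(p)
    lap[1:-1, 1:-1, 1:-1] = (
        p[2:, 1:-1, 1:-1] + p[:-2, 1:-1, 1:-1] +
        p[1:-1, 2:, 1:-1] + p[1:-1, :-2, 1:-1] +
        p[1:-1, 1:-1, 2:] + p[1:-1, 1:-1, :-2] -
        6.0 * p[1:-1, 1:-1, 1:-1]) / dx**2
    return lap

for step in range(200):
    t = step * dt
    f = np.zeros((n, n, n))
    # Hypothetical source: a Gaussian pulse injected at the grid center.
    f[n // 2, n // 2, n // 2] = np.exp(-((t - 1e-3) / 3e-4) ** 2)
    # Leapfrog update: p(t+dt) = 2 p(t) - p(t-dt) + dt^2 (c^2 Lap(p) + f).
    p_next = 2.0 * p_curr - p_prev + dt**2 * (c**2 * laplacian(p_curr) + f)
    p_prev, p_curr = p_curr, p_next
```

The time step obeys the 3D Courant limit, and the update is second-order accurate in space and time, which is why FDTD needs fine grids (and hence large memory) at higher frequencies.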

In the frequency domain, sound propagation can be expressed as a boundary value problem for the Helmholtz equation:

$$ \nabla^2 P + \frac{\omega^2}{c^2} P = 0 \quad \text{in } A^{+}, \qquad (2) $$

where P(x, ω) is the (complex-valued) pressure field, ω is the angular frequency, and A⁺ is the acoustic domain. At the boundary of the domain, ∂A, the pressure is specified using a Dirichlet boundary condition:

$$ P = f(x) \quad \text{on } \partial A. \qquad (3) $$

To complete the problem specification, the behavior of P at infinity must be specified, usually by the Sommerfeld radiation condition:

$$ \lim_{r \to \infty} \left[ \frac{\partial P}{\partial r} + j \frac{\omega}{c} P \right] = 0, \qquad (4) $$

where r = ‖x‖ is the distance of point x from the origin and j = √−1.
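As a sanity check on these definitions, the free-space monopole field P(r) = e^{−jkr}/(4πr), with k = ω/c under the e^{+jωt} time convention, satisfies Equation 2 away from the source and the radiation condition of Equation 4. The short script below, with assumed frequency values, verifies the latter numerically.

```python
import numpy as np

k = 2 * np.pi * 500.0 / 343.0          # wavenumber at 500 Hz (assumed values)

def monopole(r):
    """Free-space Green's function e^{-jkr} / (4 pi r), e^{+jwt} convention."""
    return np.exp(-1j * k * r) / (4 * np.pi * r)

# Numerically verify the radiation condition: dP/dr + jkP -> 0 as r grows.
for r in [10.0, 100.0, 1000.0]:
    dr = 1e-6
    dPdr = (monopole(r + dr) - monopole(r - dr)) / (2 * dr)
    residual = dPdr + 1j * k * monopole(r)
    print(f"r = {r:7.1f} m   |dP/dr + jkP| = {abs(residual):.3e}")
```

The residual equals −P/r for the monopole, so it decays as 1/r², consistent with Equation 4.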

2.2 Geometric Techniques

Most current sound propagation systems for interactive applications are based on geometric techniques, which apply the high-frequency Eikonal (ray) approximation to sound propagation. The image source method [2] is the most commonly used geometric technique, and there has been much research on improving its performance [8]. However, the image source method can only model purely specular reflections. Other techniques based on ray tracing [15] or radiosity [36] have been developed for modeling diffuse reflections, but these energy-based formulations may not model phase accurately. Techniques based on acoustic radiance transfer [29] can model arbitrary surface interactions for wide-band signals, but cannot accurately model wave phenomena such as diffraction. The two main approaches for modeling diffraction in a geometric acoustics framework are the uniform theory of diffraction (UTD) [35] and the Biot-Tolstoy-Medwin (BTM) formulation [30]. UTD is an approximate formulation, while BTM is an offline technique that yields accurate results at a significant performance cost. Methods based on image source gradients [33] and acoustic radiance transfer operators [3] have been developed to interactively model higher-order propagation effects. Recent developments in fast ray tracing have enabled interactive geometric propagation in dynamic scenes, but these methods only model first-order edge diffraction based on UTD [31].
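For intuition, here is a minimal sketch of the image source method [2] for a rectangular (shoebox) room: the source is mirrored across the walls, and each image contributes a delayed, attenuated arrival at the listener. The room dimensions, reflection coefficient, and maximum order below are hypothetical; real implementations add visibility tests for non-box geometry.

```python
import numpy as np

c = 343.0                             # speed of sound (m/s)
room = np.array([8.0, 6.0, 3.0])      # hypothetical shoebox room (m)
src = np.array([2.0, 3.0, 1.5])       # source position
lst = np.array([6.0, 2.0, 1.5])       # listener position
beta = 0.85                           # wall reflection coefficient (assumed)
order = 2                             # maximum reflection order

echoes = []                           # (arrival time, amplitude) pairs
rng = range(-order, order + 1)
for nx in rng:
    for ny in rng:
        for nz in rng:
            n = np.array([nx, ny, nz])
            if np.abs(n).sum() > order:
                continue
            # Mirror the source across the walls: per axis, an even index
            # translates the source, an odd index flips it.
            img = np.where(n % 2 == 0, src + n * room, (n + 1) * room - src)
            d = np.linalg.norm(img - lst)
            bounces = np.abs(n).sum()              # number of wall reflections
            echoes.append((d / c, beta**bounces / (4 * np.pi * d)))

for t, a in sorted(echoes)[:5]:                    # earliest five arrivals
    print(f"arrival {t * 1000:6.2f} ms   amplitude {a:.5f}")
```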

2.3 Wave-based Techniques

Wave solvers can be classified into frequency-domain and time-domain approaches. The most common frequency-domain techniques include the finite element method (FEM) [32] and the boundary element method (BEM) [6]. BEM expresses the global acoustic field as the sum of elementary radiating fields from monopole and dipole sources placed on a uniform, sub-wavelength sampling of the scene's surface. Traditional BEM scales as the square of the surface area, but recent research on the fast multipole method for BEM (FMM-BEM) [10] has improved the complexity to linear in surface area by creating a hierarchical clustering of BEM monopoles and dipoles using an octree and approximating their interactions compactly using high-order multipole Green's functions. Offline FMM-BEM solutions are infeasible for interactive applications because of the large, dense set of monopole and dipole sources in the final solution that need to be stored and summed on the fly.

In the time domain, the most popular is the finite-difference time-domain (FDTD) method [27], which requires a discretization of the entire volume of the 3D scene. The compute and memory usage of these methods scales linearly with the volume of the scene. Faster methods like the pseudospectral time-domain (PSTD) method [16] and adaptive rectangular decomposition (ARD) [25] have been proposed and achieve good accuracy with a much coarser spatial discretization.

Interactive techniques: In recent years, we have seen increasing interest in developing interactive wave-simulation techniques for sound propagation in indoor and outdoor spaces. Among frequency-domain techniques, sound radiation from a single vibrating object in free space can be efficiently modeled using precomputed acoustic transfer [13]. These acoustic transfer functions approximate the radiation behavior of a complicated geometry by expressing it in terms of equivalent sources, which can be quickly evaluated at runtime to enable real-time performance. Tsingos et al. [34] solve the boundary integral formulation of the Helmholtz equation subject to the Kirchhoff approximation in real time. Among time-domain techniques, Raghuvanshi et al. [26] rely on a volumetric sampling of acoustic responses on a spatial grid and perceptual encoding based on the acoustic properties of indoor spaces. Recent work [28] has shown that FDTD simulations can run in real time on the GPU, but only for very small spaces that span a few meters across.

2.4 Directional sources

Real-world sound sources have a characteristic directivity that varies with frequency [21]. Source directivity has a significant effect on the propagation of sound in an environment [37]. Meyer et al. [21] measure the directivity of brass, woodwind, and string instruments in an anechoic chamber. Interactive geometric acoustics (GA) techniques can incorporate high-frequency source directivities at runtime [22]. These methods essentially involve enumerating sound propagation paths from the source to the listener. The directions of rays emitted from the source (received at the listener) can be used to apply attenuation to the corresponding propagation paths based on any arbitrary source (listener) directivity. However, geometric techniques are not accurate in the low-frequency range (e.g., 20-1000 Hz), as sound waves can diffract (bend) around obstacles and undergo interference and other wave effects. Interactive wave-based sound propagation techniques [26] can handle elementary directional sources such as monopoles, dipoles, quadrupoles, and their linear combinations. Other techniques have been proposed to incorporate measured directivities into wave-based techniques [11]. But the source directivity is modeled during the offline simulation stage, and its effects on the sound propagation get baked into the simulation results. Therefore, a source that rotates or has a time-varying directivity at runtime cannot be modeled by current interactive wave-based techniques.

2.5 Spatial Sound

The human auditory system obtains significant directional cues from the subtle differences in sound received by each ear, caused by the scattering of sound around the head [4]. These effects are represented using a head-related transfer function (HRTF). Measurements to compute HRTFs are performed in controlled environments, and the recorded data is available online [1]. Interactive GA techniques can incorporate high-frequency HRTF-based listener directivity at runtime. However, handling listener directivity effects at low frequencies arising from the wave nature of sound (e.g., diffraction, interference) remains a significant challenge for interactive GA techniques. Integrating HRTFs into wave-based techniques involves computing propagation directions using plane-wave decomposition. Prior plane-wave decomposition techniques either use spherical convolution [24] or solve a linear system [39], and are computationally expensive. Interactive wave-based techniques resort to simpler listener directivity models based on a spherical head and a cardioid function [26]. However, these simplified models are not accurate for sound localization and externalization, both of which are necessary for immersion in virtual environments [4].

3 Wave-based Sound Propagation in Frequency-Domain

In this section, we introduce a novel, frequency-domain technique based on equivalent sources to model wave-based sound propagation in large environments for interactive applications [19]. This technique can handle static sources and dynamic listeners at runtime. Next, we extend this technique to model dynamic sources for interactive applications. We propose a general framework to handle dynamic source directivity and compute spatial sound for any offline or interactive wave-based sound propagation technique in the frequency domain.

3.1 Large environments

Large, open scenes spanning hundreds of meters, which arise in many applications ranging from games to training or simulation systems, present a significant challenge for interactive, wave-based sound propagation techniques. State-of-the-art wave simulation methods can take hours of computation and gigabytes of memory for performing sound propagation in indoor scenes such as concert halls [27, 25]. For large, open scenes, it is challenging to run these techniques in real time. On the other hand, geometric (ray-based) acoustic techniques can provide real-time performance for such environments. However, geometric techniques are better suited for higher frequencies due to the inherent assumption of rectilinear propagation of sound waves. Therefore, accurately modeling diffraction and higher-order wave effects with these techniques remains a significant challenge, especially at low frequencies.

This work presents a novel approach for precomputed, wave-based sound propagation that is applicable to large, open scenes. It is based on equivalent sources, which are widely studied for radiation and scattering problems in acoustics and electromagnetics [7] and were more recently introduced to computer graphics [13]. The approach consists of two main stages: preprocessing and runtime. During preprocessing, the scene is decomposed into disjoint, well-separated rigid objects. The acoustic behavior of each object, taken independently, is characterized by its per-object transfer function, which maps an arbitrary incident field on the object to the resulting scattered field. We propose an equivalent source formulation to express this transfer function as a compact scattering matrix. Pairwise acoustic coupling between objects is then modeled by computing inter-object transfer functions between all pairs of objects, which map the outgoing scattered field of one object to the incoming field on another object. These transfer functions are represented compactly by using the same equivalent source framework to yield interaction matrices. Acoustic transfer between multiple objects can therefore be represented using chained multiplication of their scattering and interaction matrices. Finally, the acoustic response of the scene to a static source distribution is computed by solving a global linear system that accounts for all orders of inter-object wave propagation.

At runtime, a fast summation over all outgoing equivalent sources for all objects is performed at the listener location. The computed response is used for real-time sound rendering for a moving listener. Multiple moving sources, with a static listener, are handled by exploiting acoustic reciprocity. The runtime memory and computational requirements are proportional to the number of objects and their outgoing scattered field complexity (usually a few thousand equivalent sources per frequency for a few percent error), instead of the volume or surface area of the scene. Thus, this technique takes an object-centric approach to wave-based sound propagation; a sketch of the runtime summation appears after the list below. The key contributions of our work include:

• Object-based sound field decomposition using per-object and inter-object acoustic transfer functions, enabling real-time, wave-based sound propagation in large, open scenes.

• Compact per-object transfer using equivalent sources to model the scattering behavior of an object, mapping arbitrary incident fields to the resultant scattered fields.

• Compact analytical coupling of objects, achieved by expressing inter-object transfer functions in the same compact equivalent source basis as used for per-object transfer.

• A fast, memory-efficient runtime that enables real-time sound rendering while requiring only a few tens of megabytes of memory.
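The following is a minimal Python sketch of the runtime summation described above: at a given frequency, the pressure at the listener is a weighted sum of precomputed outgoing equivalent-source fields. The monopole-only basis, function names, and random data are simplifying assumptions; the actual technique uses multipole equivalent sources whose strengths come from the precomputed global solve.

```python
import numpy as np

def monopole(k, src_pos, x):
    """Free-field monopole e^{-jkr} / (4 pi r) at point x for each source."""
    r = np.linalg.norm(x - src_pos, axis=-1)
    return np.exp(-1j * k * r) / (4 * np.pi * r)

def listener_pressure(k, eq_positions, eq_strengths, listener):
    """Runtime step: sum all outgoing equivalent sources at the listener.

    eq_positions  -- (n, 3) equivalent-source positions (precomputed)
    eq_strengths  -- (n,) complex strengths at wavenumber k (precomputed)
    Names and the monopole-only basis are illustrative simplifications.
    """
    return np.dot(eq_strengths, monopole(k, eq_positions, listener))

# Hypothetical data: a few thousand equivalent sources for one frequency.
rng = np.random.default_rng(0)
pos = rng.uniform(-50.0, 50.0, size=(2000, 3))
amp = rng.normal(size=2000) + 1j * rng.normal(size=2000)
k = 2 * np.pi * 250.0 / 343.0
print(listener_pressure(k, pos, amp, np.array([1.0, 2.0, 1.7])))
```

For a moving source and a static listener, the same sum can be evaluated with the source and listener roles swapped, per acoustic reciprocity.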

This technique is well-suited for quick iterations while authoring scenes. Per-object transfer functions, which take a significant portion of the precomputation time of our method, are independent of the scene and can thus be stored in a lookup table. Therefore, adding, deleting, or moving a few objects in an existing scene has low precomputation overhead, linear in the number of objects. We have tested the technique on a variety of scenarios and integrated the system with Valve's Source™ game engine from Half-Life 2. The technique generates realistic acoustic effects and takes orders of magnitude less runtime memory compared to state-of-the-art wave solvers, enabling interactive performance.

3.2 Directional Sources and Spatial Sound

Most sound sources we come across in real life, ranging from human voices through speaker systems, machine noises, and musical instruments, are directional sources that have a specific directivity pattern [14, 21]. This directivity depends on the shape, size, and material properties of the sound source, as well as a complex interaction of the processes of vibration and sound radiation, resulting in varying directivity at different frequencies. Source directivity has a significant impact on the propagation of sound and the corresponding acoustics of the environment [37] that is noticeable in everyday life: a person talking towards/away from a listener, the positioning of different types of musical instruments in an orchestra [21], and good-sounding places (sweet spots) in front of the television in the living room. Analogous to directional sources, listeners also do not receive sound equally from all directions. The human auditory system obtains significant directional cues from the subtle differences in the sound received at the left and right ears, caused by the scattering of sound around the head. This listener directivity is encoded as the head-related transfer function (HRTF). Spatial sound based on listener directivity can be used to enhance a user's immersion in the virtual environment by providing binaural cues corresponding to the direction the sound is coming from, thereby enriching the experience [4].

Existing sound propagation techniques, broadly classified into geometric and wave-based techniques, have many limitations in handling dynamic source and listener directivity. Geometric techniques can easily handle high-frequency source and listener directivity for offline and interactive applications [9]. However, due to the inherent assumption in geometric techniques of rectilinear propagation, i.e., that sound waves travel as rays, modeling wave effects such as diffraction and interference at low frequencies remains a significant challenge. This becomes a limiting factor for low-frequency directional sources (e.g., human voices) and low-frequency listener directivity effects (e.g., diffraction around the head). Wave-based techniques can accurately perform sound propagation at low frequencies, but their computational complexity increases significantly for high frequencies. Current interactive wave-based techniques [13, 26, 19] have a high precomputation overhead and can only model source directivity during the offline computation stage. As a result, the source directivity gets baked (precomputed and stored) into the final solution, and it is not possible to modify the directivity pattern at runtime for interactive applications, e.g., a rotating siren or a person covering his/her mouth. Additionally, integrating listener directivity into wave-based techniques requires a plane-wave decomposition of the sound field [39]. Previous techniques for performing plane-wave decomposition [23, 24, 39] are computationally expensive and limited to offline applications. Recently, physically-based methods have been proposed for sound radiation from directional sources such as fluids [38] and thin shells [5]. However, appropriate sound propagation techniques are needed to generate environmental acoustic effects arising from the interaction of the radiated sound field with other objects in the scene.

In this work, we address the problem of incorporating dynamic source directivity and spatial sound in interactive, wave-based sound propagation techniques [18]. Given a scene and a source position, a set of pressure fields due to elementary spherical harmonic (SH) sources is precomputed using a frequency-domain, wave-based sound propagation technique. Next, these pressure fields are encoded in basis functions (e.g., multipoles) and stored for runtime use. Given the dynamic source directivity at runtime, an SH decomposition of the directivity is performed to compute the corresponding SH coefficients. The final pressure field is computed as a summation of the pressure fields due to the SH sources evaluated at the listener position, weighted by the appropriate SH coefficients. To compute spatial sound for wave-based techniques, an interactive plane-wave decomposition approach is proposed based on derivatives of the pressure field. Acoustic responses for both ears are computed at runtime by using this efficient plane-wave decomposition of the pressure field and the HRTF-based listener directivity. These binaural acoustic responses are convolved with the (dry) audio to compute the spatial sound at the listener position.
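The runtime directivity step lends itself to a compact sketch: project the current directivity pattern onto spherical harmonics, then form the SH-coefficient-weighted sum of the precomputed per-SH-source pressure fields. In the Python sketch below, the quadrature grid, the cardioid directivity, and the sh_fields lookup are hypothetical stand-ins for the paper's precomputed data.

```python
import numpy as np
from scipy.special import sph_harm

def sh_coefficients(directivity, order, n_theta=64, n_phi=128):
    """Project a directivity D(theta, phi) onto complex SH up to 'order'.

    Simple quadrature over a regular (theta, phi) grid:
    c_lm = integral of D * conj(Y_lm) * sin(theta). Illustrative numerics.
    """
    theta = (np.arange(n_theta) + 0.5) * np.pi / n_theta      # polar angle
    phi = np.arange(n_phi) * 2 * np.pi / n_phi                # azimuth
    T, P = np.meshgrid(theta, phi, indexing="ij")
    D = directivity(T, P)
    w = np.sin(T) * (np.pi / n_theta) * (2 * np.pi / n_phi)   # area weights
    coeffs = {}
    for l in range(order + 1):
        for m in range(-l, l + 1):
            Y = sph_harm(m, l, P, T)   # scipy order: (m, l, azimuth, polar)
            coeffs[(l, m)] = np.sum(D * np.conj(Y) * w)
    return coeffs

def pressure_at_listener(coeffs, sh_fields):
    """Weighted sum of precomputed fields; sh_fields is a hypothetical
    lookup (l, m) -> complex pressure at the listener per SH source."""
    return sum(c * sh_fields[lm] for lm, c in coeffs.items())

# Example: a cardioid-like directivity pointing along +z (an assumption).
cardioid = lambda theta, phi: 0.5 * (1.0 + np.cos(theta))
coeffs = sh_coefficients(cardioid, order=2)
```

Because only the SH coefficients change when the source rotates or its pattern varies, the precomputed fields never need to be recomputed at runtime.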

Main Results: The main contributions are:

• Dynamic, data-driven source directivity modifiable at runtime. The propagated sound fields due to SH sources are precomputed, stored, and used at runtime to compute the sound field due to a directional source.

• Efficient plane-wave decomposition based on pressure derivatives that enables dynamic HRTF-based listener directivity at runtime (see the sketch after this list). The formulation is applicable to interactive applications, supports listener head rotation, and allows the use of personalized HRTFs without recomputing the sound pressure fields.

• A general framework to integrate our source and listener directivities into any offline or interactive frequency-domain, wave-based propagation algorithm.

• A real-time, memory-efficient sound rendering system. We demonstrate realistic acoustic effects from directional sources and listeners in complex scenarios.
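The pressure-derivative idea can be illustrated in the single-plane-wave case: for P(x) = e^{−jk·x}, the gradient satisfies ∇P = −jkP, so the propagation direction can be read directly from first derivatives of the pressure. The Python snippet below shows only this simplest instance, with an assumed incident wave; the full decomposition handles general fields.

```python
import numpy as np

# Assumed incident plane wave at 400 Hz traveling along (0.6, 0.8, 0).
k_true = 2 * np.pi * 400.0 / 343.0 * np.array([0.6, 0.8, 0.0])

def plane_wave(x):
    """P(x) = e^{-j k.x} under the e^{+jwt} time convention."""
    return np.exp(-1j * np.dot(k_true, x))

x0 = np.array([1.0, 2.0, 3.0])
h = 1e-5
grad = np.array([(plane_wave(x0 + h * e) - plane_wave(x0 - h * e)) / (2 * h)
                 for e in np.eye(3)])

# For P = e^{-j k.x}: grad P = -j k P, hence k = j grad P / P (real-valued).
k_est = (1j * grad / plane_wave(x0)).real
print(k_est / np.linalg.norm(k_est))   # recovered propagation direction
```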

We have demonstrated that this directivity and spatial sound framework works for both offline and interactive wave-based sound propagation techniques by incorporating it into the state-of-the-art offline boundary element method [17] and the interactive equivalent source technique [19]. The runtime system has been integrated with Valve's Source™ game engine. Acoustic effects from both source and listener directivity are demonstrated in a variety of scenarios, such as people talking on the street, loudspeakers between buildings, a television in a living room, a helicopter in rocky outdoor terrain, a bell tower in a snow-covered town, a rotating siren, and musical instruments in an amphitheater. We also validate the results against ground-truth responses computed analytically using the offline Biot-Tolstoy-Medwin technique [30]. This method enables accurate wave-based sound propagation for dynamic source and listener directivities and can handle moving directional sources and a moving directional listener in interactive applications.

4 Wave-based Sound Propagation in Time-Domain

In this section, we present a fast and efficient wave-based sound propagation technique that solves the acoustic wave equation in the time domain, entirely on the GPU [20].

4.1 GPU-parallelization

The computational modeling and simulation of acoustic spaces is fundamental to many areas such as room acoustics, outdoor acoustics, and noise modeling. The goal of computational acoustic methods is to solve the acoustic wave equation in an efficient manner. One of the key challenges in solving the wave equation is the computational and memory requirements of an accurate solver. Some of the widely used techniques, such as the finite-difference method, are mostly limited to low-frequency sound propagation, as higher-frequency simulation requires PetaFLOP or ExaFLOP capabilities and terabytes of memory.

This work presents an efficient technique for the time-domain solution of the acoustic wave equation. It is based on the adaptive rectangular decomposition (ARD) of the scene and uses the parallel processing of modern graphics processors (GPUs) to achieve high computational throughput and performance. This technique remains accurate even on coarse meshes approaching the Nyquist limit. We demonstrate that by carefully mapping all the components of the algorithm to match the parallel processing capabilities of GPUs, a significant improvement in performance is gained while maintaining numerical accuracy. The key contributions of this work include:

• Exploiting two levels of parallelism exhibited by the underlying numerical technique.

• Avoiding the host-device data transfer bottleneck by parallelizing all the steps of the algorithm on the GPU.

• A computationally optimal decomposition that splits the scene into partitions whose sizes are powers of 2, achieving better performance.
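To illustrate the numerical core that is being parallelized, here is a minimal single-partition ARD update in Python: inside a rectangular air partition, the wave equation is diagonal in the cosine (DCT) basis, so every mode advances exactly, which is what permits the coarse grids mentioned above. The partition size, time step, and source are assumed, and the interface operators that couple neighboring partitions (as well as the GPU mapping itself) are omitted.

```python
import numpy as np
from scipy.fft import dctn, idctn

c, dx, dt = 343.0, 0.1, 1e-4
nx, ny, nz = 32, 32, 16                 # grid cells in this partition

i, j, k = np.meshgrid(np.arange(nx), np.arange(ny), np.arange(nz),
                      indexing="ij")
# Angular frequency of each cosine mode of the rigid-walled partition.
w = c * np.pi * np.sqrt((i / (nx * dx)) ** 2 +
                        (j / (ny * dx)) ** 2 +
                        (k / (nz * dx)) ** 2)
cos_wdt = np.cos(w * dt)

M_prev = np.zeros((nx, ny, nz))         # modal coefficients at t - dt
M_curr = np.zeros((nx, ny, nz))         # modal coefficients at t
for step in range(100):
    F = np.zeros((nx, ny, nz))
    F[nx // 2, ny // 2, nz // 2] = 1.0 if step == 0 else 0.0  # impulse source
    Fm = dctn(F, type=2, norm="ortho")  # forcing in the cosine basis
    # Exact modal update: M(t+dt) = 2 cos(w dt) M(t) - M(t-dt) + source term.
    src = np.where(w > 0,
                   2 * Fm / np.maximum(w, 1e-12) ** 2 * (1 - cos_wdt),
                   Fm * dt * dt)        # DC mode handled separately
    M_next = 2 * cos_wdt * M_curr - M_prev + src
    M_prev, M_curr = M_curr, M_next
pressure = idctn(M_curr, type=2, norm="ortho")   # back to the spatial grid
```

Each mode updates independently, and the DCTs map onto batched FFTs, which is why the method exposes the two levels of parallelism noted above.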

The main result is that utilizing the GPU architecture in combination with an efficient parallel technique allows numerical wave simulation in the medium-to-high frequency range, which was previously extremely slow on desktop computers. Running on current-generation GPUs, this technique can yield a speedup of up to 25 times over an optimized CPU-based technique. This GPU-based wave solver is more than three orders of magnitude faster than a high-order CPU-based finite-difference solver. The performance of this solver scales linearly with the number of GPU processors. Using this technique, a 1 s long wave simulation on a scene with an air volume of 7500 m³ can be performed up to 1650 Hz within 18 minutes, compared to around 5 hours for the corresponding CPU-based solver and up to three weeks for a high-order finite-difference time-domain solver on a desktop computer.

5 Conclusion

We have presented a novel technique based on the equivalent source formulation to handle large environments, spanning hundreds of meters, with a small memory footprint. It is three orders of magnitude more memory-efficient than state-of-the-art wave-based propagation techniques. We have also proposed a general framework to incorporate source directivity and compute spatial sound for any frequency-domain, wave-based propagation technique. Finally, we presented an efficient, GPU-based technique to solve the acoustic wave equation in the time domain. It enables wave simulation in the medium-to-high frequency range for large scenes on a desktop computer. These techniques have pushed the state-of-the-art in wave-based sound propagation for generating realistic sound effects in VR.

References

[1] V. Algazi, R. Duda, D. Thompson, and C. Avendano. The CIPIC HRTF database. In IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, pages 99–102, 2001.

[2] J. B. Allen and D. A. Berkley. Image method for efficiently simulating small-room acoustics. The Journal of the Acoustical Society of America, 65(4):943–950, 1979.

[3] L. Antani, A. Chandak, M. Taylor, and D. Manocha. Direct-to-indirect acoustic radiance transfer. IEEE Transactions on Visualization and Computer Graphics, 18(2):261–269, 2012.

[4] D. R. Begault. 3D Sound for Virtual Reality and Multimedia. Academic Press, 1994.

[5] J. N. Chadwick, S. S. An, and D. L. James. Harmonic shells: a practical nonlinear sound model for near-rigid thin shells. In ACM SIGGRAPH Asia 2009 Papers, New York, NY, USA, 2009. ACM.

[6] A. Cheng and D. Cheng. Heritage and early history of the boundary element method. Engineering Analysis with Boundary Elements, 29(3):268–302, Mar. 2005.

[7] A. Doicu, Y. A. Eremin, and T. Wriedt. Acoustic and Electromagnetic Scattering Analysis Using Discrete Sources. Academic Press, 1st edition, July 2000.

[8] T. Funkhouser, I. Carlbom, G. Elko, G. Pingali, M. Sondhi, and J. West. A beam tracing approach to acoustic modeling for interactive virtual environments. In ACM SIGGRAPH, pages 21–32, 1998.

[9] T. Funkhouser, N. Tsingos, and J.-M. Jot. Survey of methods for modeling sound propagation in interactive virtual environment systems. Presence and Teleoperation, 2003.

[10] N. A. Gumerov and R. Duraiswami. A broadband fast multipole accelerated boundary element method for the three-dimensional Helmholtz equation. The Journal of the Acoustical Society of America, 125(1):191–205, 2009.

[11] H. Hacihabiboglu, B. Gunel, and A. Kondoz. Time-domain simulation of directive sources in 3-D digital waveguide mesh-based acoustical models. IEEE Transactions on Audio, Speech, and Language Processing, 16(5):934–946, 2008.

[12] J. Huopaniemi, L. Savioja, and T. Takala. DIVA virtual audio reality system. In Proc. Int. Conf. Auditory Display (ICAD '96), pages 111–116, 1996.

[13] D. L. James, J. Barbic, and D. K. Pai. Precomputed acoustic transfer: output-sensitive, accurate sound generation for geometrically complex vibration sources. In ACM SIGGRAPH 2006 Papers, pages 987–995, New York, NY, USA, 2006. ACM.

[14] H. Jers. Directivity of singers. The Journal of the Acoustical Society of America, 118(3):2008–2008, 2005.

[15] T. Lentz, D. Schroder, M. Vorlander, and I. Assenmacher. Virtual reality system with integrated sound field simulation and reproduction. EURASIP Journal on Applied Signal Processing, 2007(1):187, 2007.

[16] Q. H. Liu. The PSTD algorithm: A time-domain method combining the pseudospectral technique and perfectly matched layers. The Journal of the Acoustical Society of America, 101(5):3182, 1997.

[17] Y. Liu. Fast Multipole Boundary Element Method: Theory and Applications in Engineering. Cambridge University Press, 2009.

[18] R. Mehra, L. Antani, S. Kim, and D. Manocha. Source and listener directivity for interactive wave-based sound propagation. IEEE Transactions on Visualization and Computer Graphics, 19(4):567–575, 2014.

[19] R. Mehra, N. Raghuvanshi, L. Antani, A. Chandak, S. Curtis, and D. Manocha. Wave-based sound propagation in large open scenes using an equivalent source formulation. ACM Transactions on Graphics, Apr. 2013.

[20] R. Mehra, N. Raghuvanshi, L. Savioja, M. C. Lin, and D. Manocha. An efficient GPU-based time domain solver for the acoustic wave equation. Applied Acoustics, 73(2):83–94, 2012.

[21] J. Meyer and U. Hansen. Acoustics and the Performance of Music, fifth edition. Springer, 2009.

[22] F. Otondo and J. H. Rindel. The influence of the directivity of musical instruments in a room. Acta Acustica, 90(6):1178–1184, 2004.

[23] M. Park and B. Rafaely. Sound-field analysis by plane-wave decomposition using spherical microphone array. The Journal of the Acoustical Society of America, 118(5):3094–3103, 2005.

[24] B. Rafaely and A. Avni. Interaural cross correlation in a sound field represented by spherical harmonics. The Journal of the Acoustical Society of America, 127(2):823–828, 2010.

[25] N. Raghuvanshi, R. Narain, and M. C. Lin. Efficient and accurate sound propagation using adaptive rectangular decomposition. IEEE Transactions on Visualization and Computer Graphics, 15(5):789–801, 2009.

[26] N. Raghuvanshi, J. Snyder, R. Mehra, M. C. Lin, and N. K. Govindaraju. Precomputed wave simulation for real-time sound propagation of dynamic sources in complex scenes. ACM Transactions on Graphics (SIGGRAPH 2010), 29(3), July 2010.

[27] S. Sakamoto, A. Ushiyama, and H. Nagatomo. Numerical analysis of sound propagation in rooms using finite difference time domain method. The Journal of the Acoustical Society of America, 120(5):3008, 2006.

[28] L. Savioja. Real-time 3D finite-difference time-domain simulation of mid-frequency room acoustics. In 13th International Conference on Digital Audio Effects (DAFx-10), Sept. 2010.

[29] S. Siltanen, T. Lokki, S. Kiminki, and L. Savioja. The room acoustic rendering equation. The Journal of the Acoustical Society of America, 122(3):1624–1635, Sept. 2007.

[30] U. P. Svensson, R. I. Fred, and J. Vanderkooy. An analytic secondary source model of edge diffraction impulse responses. The Journal of the Acoustical Society of America, 106:2331–2344, Nov. 1999.

[31] M. T. Taylor, A. Chandak, L. Antani, and D. Manocha. RESound: interactive sound rendering for dynamic virtual environments. In MM '09: ACM Multimedia, pages 271–280, New York, NY, USA, 2009. ACM.

[32] L. L. Thompson. A review of finite-element methods for time-harmonic acoustics. The Journal of the Acoustical Society of America, 119(3):1315–1330, 2006.

[33] N. Tsingos. Pre-computing geometry-based reverberation effects for games. In 35th AES Conference on Audio for Games, 2009.

[34] N. Tsingos, C. Dachsbacher, S. Lefebvre, and M. Dellepiane. Instant sound scattering. In Rendering Techniques (Proceedings of the Eurographics Symposium on Rendering), 2007.

[35] N. Tsingos, T. Funkhouser, A. Ngan, and I. Carlbom. Modeling acoustics in virtual environments using the uniform theory of diffraction. In Computer Graphics (SIGGRAPH 2001), August 2001.

[36] N. Tsingos and J. D. Gascuel. A general model for the simulation of room acoustics based on hierarchical radiosity. In ACM SIGGRAPH 97, New York, NY, USA, 1997. ACM.

[37] M. C. Vigeant. Investigations of incorporating source directivity into room acoustics computer models to improve auralizations. The Journal of the Acoustical Society of America, 124(5):2664–2664, 2008.

[38] C. Zheng and D. L. James. Harmonic fluids. In ACM SIGGRAPH 2009 Papers, pages 37:1–37:12, New York, NY, USA, 2009. ACM.

[39] D. N. Zotkin, R. Duraiswami, and N. A. Gumerov. Plane-wave decomposition of acoustical scenes via spherical and cylindrical microphone arrays. IEEE Transactions on Audio, Speech and Language Processing, 18, 2010.