Brent Cowan and Bill Kapralos · e.g., footsteps in a small room vs. footsteps outside in a large...

1

Brent Cowan and Bill Kapralos

Faculty of Business and Information Technology, University of Ontario Institute of Technology

2000 Simcoe Street North, Oshawa, Ontario, Canada. L1H 7K4.

2

Motivation (1): Importance of Real World Sounds

Sounds give detailed info of our surroundings

Determine direction and distance to objects

Warn of approaching dangers → particularly important in the “animal kingdom”, e.g., predators

Unlike vision, hearing is omni-directional

Can hear in complete darkness!

Can guide the more “finely tuned” visual system

UOIT Student Research Day – August 22 2008

Eases the burden of the visual system

3

Motivation (2): Importance of Real World Sounds (cont.)

We do not need to see a “roaring” lion to realize that we may be in a potentially dangerous situation

The lion’s roar is enough!

4

Motivation (3): Importance of Real World Sounds (cont.)

We do not need to see an “angry” dog to realize that we may be in a potentially dangerous situation

The dog’s bark is enough!

5

Motivation (4): Sound is an Essential Part of Any Immersive Environment (VR, Games, etc.)

Conveys basic information to the users

e.g., footsteps in a small room vs. footsteps outside in a large open field

Allows users to orient themselves

Increases situational awareness

Helps increase immersion and hence presence



Can enhance perception of poor video

Can provide a sense of ambience → mood and emotion

6

Motivation (5): Sound is an Essential Part of Any Immersive Environment (VR, Games, etc.) (cont.)

Although definitely downplayed, sound has actually been a key element of video games from the “early times”

Consider the following sample

Does it sound familiar?


7

Motivation (6): Sound is an Essential Part of Any Immersive Environment (VR, Games, etc.) (cont.)

Namco’s Pac-Man (1980)

“The world’s most popular arcade video game ever”

You can still recollect a “key” sound in this game → sound is more important than you might have thought!

8

Motivation (7): Spatial Sound Often Ignored in a VE

When present, typically:

Cues are poor → don’t always reflect natural spatial cues

“Far-field” acoustical model assumed → sound source at infinity, plane waves

Emphasis typically placed on visual senses

Graphics


Stereo vision, etc…

9

Overview (1): What is Auralization?

According to Kleiner et al.

The process of rendering audible, by physical or mathematical modeling, the sound field of a source in space in such a way as to simulate the binaural listening experience at a given position in the modeled space

Goal → recreate a particular listening environment, taking into account the acoustics of the environment (e.g., the “room acoustics”), and the characteristics of the listener

10

Overview (2): What is Auralization? (cont.)

Auralization can be realized by determining the binaural room impulse response (BRIR)

BRIR represents the response of a particular acoustical environment and human listener to sound energy and captures the room acoustics for a particular sound source and listener configuration


11

Overview (3): For Simplicity, Typically Decomposed Into Two Components

Room impulse response (RIR)

Represents the reflection (reverberation), diffraction, refraction, sound attenuation, and absorption properties of a particular room configuration

The environmental context of a listening room or the “room acoustics”

Head-related transfer function (HRTF)

Filtering of sound spectrum by interactions of sound with head, torso, and particularly pinna

12

Sound Localization (1): Head Related Transfer Function (HRTF)

Filtering of sound spectrum by interactions of sound with head, torso and particularly pinna

Pinna: Series of grooves and notches which accentuate or suppress mid & high frequency components in a position dependent manner

Each person’s pinna differs → filtering effects differ

13

Spatial Audio (1): Binaural Synthesis

Assume the HRTF and RIR can both be modeled by linear time-invariant (LTI) filters

Measure or model the HRTF and RIR → resulting transfer function can be used to filter a source sound

Combine the HRTF and RIR-filtered (processed) sounds via a post-processing operation

When presented to the listener the impression of the environment being synthesized is recreated
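The binaural synthesis steps above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes the RIR and the left/right head-related impulse responses (HRIRs, the time-domain form of the HRTF) are already available as short lists of filter coefficients. The names `convolve` and `binaural_synthesis` and the toy coefficient values are made up for the example. The slide combines the HRTF- and RIR-filtered sounds in a post-processing step; the sketch instead uses a simple cascade (room filter first, then one HRIR per ear), which is one possible arrangement for LTI filters.

```python
def convolve(signal, kernel):
    """Direct-form 1D convolution: y[n] = sum_k kernel[k] * signal[n - k]."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for n, s in enumerate(signal):
        for k, h in enumerate(kernel):
            out[n + k] += s * h
    return out


def binaural_synthesis(source, rir, hrir_left, hrir_right):
    """Filter a mono source with the room response, then with each ear's HRIR."""
    reverberant = convolve(source, rir)
    left = convolve(reverberant, hrir_left)
    right = convolve(reverberant, hrir_right)
    return left, right


# Toy data (hypothetical values, chosen only to keep the arithmetic easy to follow).
source = [1.0, 0.0, 0.0]   # unit impulse
rir = [1.0, 0.5]           # direct sound plus one weaker reflection
hrir_left = [1.0]          # left ear: signal passes unchanged
hrir_right = [0.0, 0.8]    # right ear: delayed by one sample and attenuated

left, right = binaural_synthesis(source, rir, hrir_left, hrir_right)
```

The two returned lists are the left and right channels that would be played over headphones; the delay and level difference between them are the kind of cues the HRTF contributes.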

14

Graphics Processing Unit (GPU) (1): Overview

A dedicated graphics rendering device for a personal computer, workstation, or game console

Very efficient at manipulating and displaying computer graphics, and their highly parallel structure makes them more effective than general-purpose CPUs for a range of complex algorithms

Modern GPUs use most of their power to do calculations related to 3D computer graphics

15

Graphics Processing Unit (GPU) (2): Overview (cont.)

Modern GPUs contain a programmable pipeline

User flexibility → programmer is free to exploit the inherent power of the GPU

Shader → GPU program written in one of many “shader languages”

Can exploit GPU power for non-computer graphics applications

General purpose GPU or GPGPU → many applications including computer vision, audio…

16

Goals of this Work (1): Application of the GPU to Spatial Audio

Take advantage of the tremendous computational power of the graphics processing unit

Develop a real-time, one-dimensional convolution method that utilizes the GPU and can be employed for the generation of spatial audio

Allow for the inclusion of plausible spatial audio in interactive virtual environments and games

Provide a comparison of general software-based convolution and GPU-based convolution

17

Method / Implementation (1): Implementation

OpenGL Shading Language

Executed on typical programmable graphics cards
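The slides do not show the shader itself, so the following is only a sketch of the general idea behind a GPU-friendly convolution, written in Python rather than the OpenGL Shading Language. It assumes a "gather" formulation in which every output sample is computed independently from the input and the filter coefficients, which is the kind of work a fragment shader could perform once per output element. The function names are hypothetical.

```python
def convolve_sample(signal, kernel, n):
    """Compute one output sample y[n]; it only reads inputs and shares no state."""
    acc = 0.0
    for k, h in enumerate(kernel):
        i = n - k
        if 0 <= i < len(signal):  # samples outside the input count as zero
            acc += h * signal[i]
    return acc


def convolve_gather(signal, kernel):
    """Evaluate every output sample; the calls are independent of one another,
    so parallel hardware could in principle run them all at the same time."""
    length = len(signal) + len(kernel) - 1
    return [convolve_sample(signal, kernel, n) for n in range(length)]


result = convolve_gather([1.0, 2.0, 3.0], [1.0, 1.0])
```

Because no output sample depends on another, the per-sample cost is set by the filter length rather than the signal length, which is what makes this form attractive for highly parallel processors.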


18

Results (1): Comparison

Running time comparison between software-based and GPU-based convolution

Input signal → “sine-wave” whose size varied from 5,000 to 60,000 samples in increments of 5,000

HRTF → obtained from the CIPIC HRTF dataset and consisted of 200 samples

All tests were performed on a Dell XPS 720 high-end gaming PC → Intel Core 2 6700 (2.66 GHz) with an NVIDIA GeForce 8800 GTX graphics card
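A software-only timing harness along these lines might look as follows. It is a sketch under stated assumptions, not the authors' benchmark: the 440 Hz tone, the 44.1 kHz sample rate, the placeholder 200-coefficient averaging filter (standing in for a CIPIC HRTF), and all function names are illustrative. No timings are reported here because they depend entirely on the machine.

```python
import math
import time


def sine_wave(num_samples, freq_hz=440.0, sample_rate=44100.0):
    """Generate a sine-wave test signal (frequency and rate are assumptions)."""
    return [math.sin(2.0 * math.pi * freq_hz * n / sample_rate)
            for n in range(num_samples)]


def convolve(signal, kernel):
    """Direct-form software convolution used as the CPU baseline."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for n, s in enumerate(signal):
        for k, h in enumerate(kernel):
            out[n + k] += s * h
    return out


def time_convolution(sizes, kernel):
    """Return a list of (input size, elapsed seconds), one entry per size."""
    timings = []
    for size in sizes:
        signal = sine_wave(size)
        start = time.perf_counter()
        convolve(signal, kernel)
        timings.append((size, time.perf_counter() - start))
    return timings


# Input sizes from the slide: 5,000 to 60,000 samples in steps of 5,000.
SIZES = list(range(5000, 60001, 5000))
# Placeholder filter with 200 coefficients, matching the HRTF length on the slide.
KERNEL = [1.0 / 200] * 200
```

Calling `time_convolution(SIZES, KERNEL)` runs the full sweep. Pure Python is slow, so a quick check such as `time_convolution([1000], KERNEL)` is a more practical first run.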

19

Results (2): Comparison (cont.)

Graphical summary


20

Results (3): Graphical Comparison

Higher-order bytes

Visually, there appears to be no difference between software-based and GPU-based convolution


21

Results (4): Graphical Comparison

Higher-order bytes

Visually, it is evident that artifacts (noise) are introduced to the lower-order bytes of the GPU-based convolution output
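One way to make this comparison concrete is to split each output sample into its higher-order and lower-order byte and count where the two outputs disagree. The slides do not state the sample format, how the bytes were extracted, or what causes the noise, so the sketch below simply assumes unsigned 16-bit samples. The function names and sample values are hypothetical and are not measured data.

```python
def split_bytes(sample):
    """Split an unsigned 16-bit sample into (higher-order, lower-order) bytes."""
    return (sample >> 8) & 0xFF, sample & 0xFF


def count_byte_mismatches(reference, candidate):
    """Count samples whose higher-order and lower-order bytes differ."""
    high_diff = 0
    low_diff = 0
    for ref, cand in zip(reference, candidate):
        ref_high, ref_low = split_bytes(ref)
        cand_high, cand_low = split_bytes(cand)
        if ref_high != cand_high:
            high_diff += 1
        if ref_low != cand_low:
            low_diff += 1
    return high_diff, low_diff


# Hypothetical outputs: the second list differs only by small low-order noise.
software_output = [0x1234, 0x7F10, 0x00A0]
gpu_output = [0x1235, 0x7F12, 0x00A0]

mismatches = count_byte_mismatches(software_output, gpu_output)
```

With these made-up values the higher-order bytes agree everywhere and only the lower-order bytes differ, which mirrors the pattern described on the slides.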


22

Conclusions (1): Summary

Development of a GPU-based convolution method using the OpenGL Shading Language

Real-time performance → constant running time of approximately 4 ms with a filter with 200 coefficients

Can be executed on general graphics cards that support programmable GPUs

Convolution is vital to the generation of spatial (3D) sound

This work demonstrates that real-time spatial sound is now possible!
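As a rough illustration of what "real-time" means here, the processing time can be compared with the duration of the audio being processed. The slides do not give a sample rate, so the 44.1 kHz used below is an assumption, and `real_time_factor` is a hypothetical helper. A value below 1 means the audio is processed faster than it plays back.

```python
def real_time_factor(processing_seconds, num_samples, sample_rate_hz):
    """Ratio of processing time to audio duration; below 1.0 is faster than real time."""
    audio_seconds = num_samples / sample_rate_hz
    return processing_seconds / audio_seconds


# 4 ms running time (from the slide) for the smallest tested input of 5,000 samples,
# at an assumed sample rate of 44.1 kHz.
factor = real_time_factor(0.004, 5000, 44100.0)
```

Since the reported running time is roughly constant across input sizes, the ratio would only shrink for the larger inputs under the same assumption.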

23

Conclusions (2): Future Work

Despite the real-time performance, the method does introduce artifacts (noise) to the resulting filtered signal

Only the lower-order bytes are affected

Hearing is a perceptual process

Will these artifacts have any perceptual consequences?

User tests must be conducted to examine what (if any) effect these artifacts have on the listener
