
Brent Cowan and Bill Kapralos

Faculty of Business and Information Technology, University of Ontario Institute of Technology

2000 Simcoe Street North, Oshawa, Ontario, Canada. L1H 7K4.

UOIT Student Research Day – August 22, 2008


Motivation (1): Importance of Real-World Sounds

Sounds give detailed information about our surroundings

Determine direction and distance to objects

Warn of approaching dangers → particularly important in the “animal kingdom”, e.g., predators

Unlike vision, hearing is omni-directional

Can hear in complete darkness!

Can guide the more “finely tuned” visual system


Eases the burden of the visual system


Motivation (2): Importance of Real-World Sounds (cont.)

We do not need to see a “roaring” lion to realize that we may be in a potentially dangerous situation

The lion’s roar is enough!


Motivation (3): Importance of Real-World Sounds (cont.)

We do not need to see an “angry” dog to realize that we may be in a potentially dangerous situation

The dog’s bark is enough!


Motivation (4): Sound is an Essential Part of Any Immersive Environment (VR, Games, etc.)

Conveys basic information to the users

e.g., footsteps in a small room vs. footsteps outside in a large open field

Allows users to orient themselves

Increases situational awareness

Helps increase immersion and hence presence


Can enhance the perceived quality of poor video

Can provide a sense of ambience → mood and emotion


Motivation (5): Sound is an Essential Part of Any Immersive Environment (VR, Games, etc.) (cont.)

Although definitely downplayed, sound has actually been a key element of video games from the “early times”

Consider the following sample

Does it sound familiar?



Motivation (6): Sound is an Essential Part of Any Immersive Environment (VR, Games, etc.) (cont.)

Namco’s Pac-Man (1980)

“The world’s most popular arcade video game ever”

You can still recollect a “key” sound in this game → sound is more important than you might have thought!


Motivation (7): Spatial Sound Often Ignored in a VE

When present, typically:

Cues are poor → don’t always reflect natural spatial cues

“Far-field” acoustical model assumed → sound source at infinity, plane waves

Emphasis typically placed on visual senses

Graphics


Stereo vision, etc…


Overview (1): What is Auralization?

According to Kleiner et al.

The process of rendering audible, by physical or mathematical modeling, the sound field of a source in space in such a way as to simulate the binaural listening experience at a given position in the modeled space

Goal → recreate a particular listening environment, taking into account the acoustics of the environment (e.g., the “room acoustics”) and the characteristics of the listener


Overview (2): What is Auralization? (cont.)

Auralization can be realized by determining the binaural room impulse response (BRIR)

The BRIR represents the response of a particular acoustical environment and human listener to sound energy; it captures the room acoustics for a particular sound source and listener configuration



Overview (3): For Simplicity, the BRIR is Typically Decomposed Into Two Components

Room impulse response (RIR)

Represents the reflection (reverberation), diffraction, refraction, sound attenuation, and absorption properties of a particular room configuration

The environmental context of a listening room or the “room acoustics”

Head-related transfer function (HRTF)

Filtering of the sound spectrum by interactions of sound with the head, torso, and particularly the pinna


Sound Localization (1): Head-Related Transfer Function (HRTF)

Filtering of the sound spectrum by interactions of sound with the head, torso, and particularly the pinna

Pinna: a series of grooves and notches that accentuate or suppress mid- and high-frequency components in a position-dependent manner

Each person’s pinna differs → filtering effects differ


Spatial Audio (1): Binaural Synthesis

Assume the HRTF and RIR can both be modeled by linear time-invariant (LTI) filters

Measure or model the HRTF and RIR → resulting transfer function can be used to filter a source sound

Combine the HRTF and RIR-filtered (processed) sounds via a post-processing operation

When presented to the listener, the impression of the environment being synthesized is recreated
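This pipeline is, at its core, a pair of one-dimensional convolutions. As a rough CPU-side sketch only (the filter data, lengths, and names below are illustrative placeholders, not the authors' implementation), the convolve-and-combine step could look like this:

```cpp
// Minimal sketch of binaural synthesis by time-domain convolution.
// The signals and filters here are placeholders; real data would come from
// a measured HRTF set (e.g., CIPIC) and a modeled or measured RIR.
#include <cstddef>
#include <vector>

// Direct-form FIR convolution: output length = signal.size() + filter.size() - 1.
std::vector<float> convolve(const std::vector<float>& signal,
                            const std::vector<float>& filter)
{
    std::vector<float> out(signal.size() + filter.size() - 1, 0.0f);
    for (std::size_t n = 0; n < signal.size(); ++n)
        for (std::size_t k = 0; k < filter.size(); ++k)
            out[n + k] += signal[n] * filter[k];
    return out;
}

int main()
{
    std::vector<float> dry   = {1.0f, 0.5f, 0.25f};  // mono source sound
    std::vector<float> rir   = {1.0f, 0.0f, 0.3f};   // room impulse response
    std::vector<float> hrirL = {0.9f, 0.1f};         // left-ear HRIR
    std::vector<float> hrirR = {0.6f, 0.4f};         // right-ear HRIR

    // Because both stages are assumed LTI, the RIR and HRTF filters can be
    // cascaded: apply the room response first, then each ear's HRIR.
    std::vector<float> wet   = convolve(dry, rir);
    std::vector<float> left  = convolve(wet, hrirL);
    std::vector<float> right = convolve(wet, hrirR);

    // left/right would then be interleaved and played over headphones.
    return 0;
}
```

A real-time system would normally use block-based or frequency-domain convolution rather than this naive loop; the point is only that both the RIR and the HRTF stages reduce to the same 1-D convolution operation.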


Graphics Processing Unit (GPU) (1): Overview

A dedicated graphics rendering device for a personal computer, workstation, or game console

Very efficient at manipulating and displaying computer graphics; their highly parallel structure makes them more effective than general-purpose CPUs for a range of complex algorithms

Modern GPUs use most of their power to do calculations related to 3D computer graphics


Graphics Processing Unit (GPU) (2): Overview (cont.)

Modern GPUs contain a programmable pipeline

User flexibility → programmer is free to exploit the inherent power of the GPU

Shader → GPU program written in one of many “shader languages”

Can exploit GPU power for non-computer graphics applications

General-purpose GPU (GPGPU) → many applications including computer vision, audio…


Goals of this Work (1): Application of the GPU to Spatial Audio

Take advantage of the tremendous computational power of the graphics processing unit

Develop a real-time, one-dimensional convolution method that utilizes the GPU and can be employed for the generation of spatial audio

Allow for the inclusion of plausible spatial audio in interactive virtual environments and games


Provide a comparison of general software-based convolution and GPU-based convolution


Method / Implementation (1): Implementation

OpenGL Shading Language

Executed on typical programmable graphics cards
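The authors' GLSL shader is not reproduced in this transcript. As a hedged sketch of the general mapping only: every output sample of a 1-D convolution is independent of every other, so each one can be computed by its own fragment, with the input signal and the filter coefficients supplied to the shader as textures. The C++ fragment below restates that per-sample work on the CPU (all names are illustrative):

```cpp
// CPU restatement of the per-fragment work in a texture-based 1-D convolution.
// On the GPU, signalTex and filterTex would be textures and outIndex would be
// derived from the fragment's coordinates; here they are plain arrays.
#include <cstddef>
#include <vector>

float convolveOneSample(const std::vector<float>& signalTex,  // input samples
                        const std::vector<float>& filterTex,  // e.g., a 200-tap HRIR
                        std::size_t outIndex)                  // which output sample
{
    float acc = 0.0f;
    for (std::size_t k = 0; k < filterTex.size(); ++k)
    {
        // Guard against reading outside the signal; a shader would handle
        // this with texture addressing or clamping.
        if (outIndex >= k && (outIndex - k) < signalTex.size())
            acc += signalTex[outIndex - k] * filterTex[k];
    }
    return acc;
}

int main()
{
    std::vector<float> signal(60000, 0.0f);  // placeholder input block
    std::vector<float> filter(200, 0.005f);  // placeholder 200-tap filter

    // Every iteration of this loop is independent of the others, which is
    // what makes the operation map naturally onto parallel fragments.
    std::vector<float> out(signal.size() + filter.size() - 1);
    for (std::size_t n = 0; n < out.size(); ++n)
        out[n] = convolveOneSample(signal, filter, n);
    return 0;
}
```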



Results (1): Comparison

Running time comparison between software-based and GPU-based convolution

Input signal → a sine wave whose size varied from 5,000 to 60,000 samples in increments of 5,000

HRTF → obtained from the CIPIC HRTF dataset and consisted of 200 samples

All tests were performed on a Dell XPS 720 high-end gaming PC → Intel Core 2 6700 (2.66 GHz) with an NVIDIA GeForce 8800 GTX graphics card
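Only the software-based half of this comparison can be sketched portably. The fragment below (illustrative, not the authors' test harness) shows how such a sweep over input sizes could be timed; the GPU path would additionally involve uploading the signal to a texture, running the shader, and reading the result back.

```cpp
// Hypothetical timing sweep for the software (CPU) convolution baseline:
// sine-wave inputs of 5,000 to 60,000 samples against a 200-tap filter.
#include <chrono>
#include <cmath>
#include <cstddef>
#include <cstdio>
#include <vector>

int main()
{
    const std::size_t filterLen = 200;             // e.g., one CIPIC HRIR
    std::vector<float> filter(filterLen, 0.005f);  // placeholder coefficients

    for (std::size_t n = 5000; n <= 60000; n += 5000)
    {
        // Generate an n-sample sine wave as the test input.
        std::vector<float> signal(n);
        for (std::size_t i = 0; i < n; ++i)
            signal[i] = static_cast<float>(std::sin(2.0 * 3.14159265 * 440.0 * i / 44100.0));

        std::vector<float> out(n + filterLen - 1, 0.0f);

        auto start = std::chrono::steady_clock::now();
        for (std::size_t i = 0; i < n; ++i)            // direct-form convolution
            for (std::size_t k = 0; k < filterLen; ++k)
                out[i + k] += signal[i] * filter[k];
        auto stop = std::chrono::steady_clock::now();

        double ms = std::chrono::duration<double, std::milli>(stop - start).count();
        std::printf("%zu samples: %.3f ms\n", n, ms);
    }
    return 0;
}
```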


Results (2): Comparison (cont.)

Graphical summary



Results (3): Graphical Comparison

Higher-order bytes

Visually, there appears to be no difference between software-based and GPU-based convolution



Results (4): Graphical Comparison (cont.)

Higher-order bytes

Visually, it is evident that artifacts (noise) are introduced to the lower-order bytes of the GPU-based convolution output
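Why would noise be confined to the lower-order bytes? One plausible explanation, offered here as an assumption rather than a finding stated on the slide, is the limited numeric precision of the GPU path: a small error perturbs the low byte of a 16-bit sample long before it changes the high byte. The snippet below, with made-up values, illustrates that split:

```cpp
// Illustration (with made-up values) of why a small numeric error shows up
// only in the lower-order byte of a 16-bit audio sample.
#include <cstdint>
#include <cstdio>

int main()
{
    int16_t exact   = 12345;  // sample from the software convolution
    int16_t fromGpu = 12343;  // same sample with a tiny (hypothetical) error

    // Split each sample into its higher- and lower-order bytes.
    int hiExact = (static_cast<uint16_t>(exact)   >> 8) & 0xFF;
    int loExact =  static_cast<uint16_t>(exact)         & 0xFF;
    int hiGpu   = (static_cast<uint16_t>(fromGpu) >> 8) & 0xFF;
    int loGpu   =  static_cast<uint16_t>(fromGpu)       & 0xFF;

    // The higher-order bytes match; only the lower-order bytes differ.
    std::printf("high: %d vs %d   low: %d vs %d\n", hiExact, hiGpu, loExact, loGpu);
    return 0;
}
```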



Conclusions (1): Summary

Development of a GPU-based convolution method using the OpenGL Shading Language

Real-time performance → constant running time of approximately 4 ms with a 200-coefficient filter

Can be executed on general graphics cards that support programmable GPUs

Convolution is vital to the generation of spatial (3D) sound

This work demonstrates that real-time spatial sound is now possible!


Conclusions (2): Future Work

Despite the real-time performance, the method does introduce artifacts (noise) to the resulting filtered signal

Only the lower-order bytes are affected

Hearing is a perceptual process

Will these artifacts have any perceptual consequences?

User tests must be conducted to examine what (if any) effect these artifacts have on the listener