Dynamic Aspects of the Cocktail Party Listening Problem Douglas S. Brungart Air Force Research...

Dynamic Aspects of the Cocktail Party Listening Problem

Douglas S. Brungart, Air Force Research Laboratory

2

Credits

AFOSR Sponsored Research

Team:

Brian Simpson

Alex Kordik

Rich McKinley

Mark Ericson

Collaborators:

Chris Darwin

Gerald Kidd

3

Introduction

1) Energetic and Informational Masking:

Speech in Noise vs Speech in Speech

2) Monaural speech segregation

3) Binaural and Dichotic speech segregation

4) Dynamic aspects of cocktail party problem

5) Audio-Visual cocktail party effects

4

Energetic Masking

In classic speech-on-noise masking, only one type of masking occurs: Energetic Masking

In Energetic Masking:

-The masking sound is more intense than the target in one or more critical bands

-Some portion of the target signal is inaudible at the periphery
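
The per-band condition above can be sketched as a simple level comparison. This is a minimal illustration, assuming the target and masker levels are already available per critical band in dB; the function name `masked_bands` and the example levels are hypothetical and do not come from the talk.

```python
def masked_bands(target_db, masker_db):
    """Return indices of bands where the masker level exceeds the target level.

    Both arguments are per-band levels in dB, one entry per critical band.
    This is a crude stand-in for "energetically masked"; real audibility
    also depends on thresholds and spread of masking.
    """
    if len(target_db) != len(masker_db):
        raise ValueError("target and masker must have the same number of bands")
    return [i for i, (t, m) in enumerate(zip(target_db, masker_db)) if m > t]


# Hypothetical band levels (dB) for four critical bands.
target = [60.0, 55.0, 50.0, 45.0]
masker = [50.0, 58.0, 49.0, 52.0]
print(masked_bands(target, masker))  # [1, 3]
```

Here the masker dominates bands 1 and 3, so under this simplified criterion those portions of the target would be treated as inaudible at the periphery.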

5

Energetic Masking: Articulation Theory

Energetic masking in speech was studied for years by Fletcher and others at Bell Labs

-Articulation Theory

-Articulation Index (AI)

Allows accurate prediction of intelligibility:

-For any phonetically balanced vocabulary

-For any continuous noise source

-Plus numerous correction factors

High-Amplitudes, Reverb, Peak-Clipping, etc.
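
As a rough illustration of how an AI-style prediction combines per-band information, the sketch below uses a commonly cited simplification: each band contributes in proportion to its signal-to-noise ratio over a 30 dB range (from -12 dB to +18 dB), weighted by a band-importance value. The equal weights, the function name `articulation_index`, and the example SNRs are assumptions for illustration; this is not Fletcher's full procedure and omits the correction factors listed above.

```python
def articulation_index(snr_db, weights=None):
    """Simplified Articulation Index from per-band SNRs in dB.

    Each band's audibility is (SNR + 12) / 30 clipped to [0, 1].
    weights are band-importance values that should sum to 1;
    equal weights are assumed when none are given.
    """
    n = len(snr_db)
    if weights is None:
        weights = [1.0 / n] * n
    if len(weights) != n:
        raise ValueError("need one weight per band")
    total = 0.0
    for snr, w in zip(snr_db, weights):
        audibility = min(max((snr + 12.0) / 30.0, 0.0), 1.0)
        total += w * audibility
    return total


# Hypothetical SNRs for four bands: fully audible, half audible, and two masked.
print(articulation_index([18.0, 3.0, -12.0, -20.0]))  # 0.375
```

An index near 1 corresponds to speech that is almost fully audible across bands, and an index near 0 to speech that is almost fully masked.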

6

Informational Masking

Energetic Masking also occurs in Speech-on-Speech masking

-Where signals overlap within a critical band

However, informational masking also occurs:

• Listeners hear two or more audible sounds, but can’t segregate them into separate messages

• Classic example: multi-tone complexes

- No energetic overlap in stimuli, but substantial masking is observed (Kidd, Neff)

7

Methods: The Coordinate Response Measure (CRM)

Data collected with the Coordinate Response Measure

- CRM originally developed by Moore & McKinley (1980)

- Format: “Ready (Call Sign) go to (Color) (Number) now.”

- Target is indicated by the call sign “Baron”

- Maskers are indicated by other call signs

- Complete CRM corpus is available (Bolia et al., 2001)

- 8 talkers in corpus (4 M, 4 F), 2048 phrases

- 8 Talkers x 4 Colors x 8 Numbers x 8 Call Signs

- Embedded call sign is ideal for multitalker studies

- Similar to many multichannel monitoring tasks
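
The corpus size follows directly from the phrase format, and can be checked by enumerating every combination. In this sketch only the call sign "Baron" and the counts (8 talkers, 8 call signs, 4 colors, 8 numbers) come from the slides; the other call signs, the color names, the number range 1 to 8, and the integer talker IDs are assumptions based on the published corpus description and should be verified against the corpus itself.

```python
from itertools import product

# Assumed vocabulary; only "Baron" is named in the slides.
CALL_SIGNS = ["Arrow", "Baron", "Charlie", "Eagle", "Hopper", "Laker", "Ringo", "Tiger"]
COLORS = ["blue", "green", "red", "white"]
NUMBERS = list(range(1, 9))
TALKERS = list(range(8))  # hypothetical talker IDs, 4 male and 4 female in the corpus


def crm_phrase(call_sign, color, number):
    """Build the text of one CRM phrase."""
    return f"Ready {call_sign} go to {color} {number} now."


# One entry per (talker, phrase) pair.
corpus = [
    (talker, crm_phrase(call_sign, color, number))
    for talker, call_sign, color, number in product(TALKERS, CALL_SIGNS, COLORS, NUMBERS)
]

print(len(corpus))                      # 2048
print(crm_phrase("Baron", "blue", 5))   # Ready Baron go to blue 5 now.
```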

8

Methods: The Coordinate Response Measure

Listeners respond by selecting the appropriate colored digit with the computer mouse

9

Methods: Pros and Cons of CRM

Advantages of CRM:

Rapid data collection: training and scoring

Sentences are reusable

Embedded call sign to designate target

- does not require a priori designation

Disadvantages of CRM:

Limited vocabulary

- partially offset by lack of context

- not phonetically balanced

Not “conversationally” realistic

CRM emphasizes “speech on speech” masking

11


CRM emphasizes “informational” masking

12

Two-Talker Diotic Listening: Results

TM=Mod. Noise Masker

TN=Cont. Noise Masker

TD=Diff. Sex Masker

TS=Same Sex Masker

TT=Same Talker Masker

13

Two-Talker Diotic Listening: Error Distribution

Most errors match the color and number spoken by the masking talker.

This is indicative of informational masking
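
The error analysis described here can be sketched as a three-way classification of each response: correct, matching a masker's color and number, or something else. The function name `classify_response`, the tuple layout, and the four example trials are hypothetical and made up for illustration; they are not data from the experiments.

```python
from collections import Counter


def classify_response(response, target, maskers):
    """Classify a (color, number) response against the target and masker phrases.

    Returns "correct" if it matches the target, "masker" if it matches the
    color and number of any masking talker, and "other" otherwise.
    """
    if response == target:
        return "correct"
    if response in maskers:
        return "masker"
    return "other"


# Made-up trials: (response, target, list of masker color-number pairs).
trials = [
    (("red", 3), ("red", 3), [("blue", 5)]),
    (("blue", 5), ("red", 3), [("blue", 5)]),
    (("green", 1), ("white", 8), [("green", 1)]),
    (("red", 2), ("white", 8), [("green", 1)]),
]

counts = Counter(classify_response(r, t, m) for r, t, m in trials)
print(counts["correct"], counts["masker"], counts["other"])  # 1 2 1
```

A high share of "masker" errors relative to "other" errors is the pattern the slide attributes to informational masking: the masker's words were heard but assigned to the wrong talker.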

14

Three-Talker Diotic Listening: Results

T=Target Talker

M=Mod. Noise Masker

D=Diff. Sex Masker

S=Same Sex Masker

T=Same Talker Masker

15

Four-Talker Diotic Listening: Results

T=Target Talker

M=Mod. Noise Masker

D=Diff. Sex Masker

S=Same Sex Masker

T=Same Talker Masker

16

3-4 Talker Listening: Results

17

Dichotic Listening: Introduction

To this point, all stimuli have been diotic

• Spatial separation is known to play a role

- Cherry’s “Cocktail Party Problem”

• Dichotic masking is pure informational masking

- No contralateral energetic masking occurs

• Previous results have suggested:

- Almost perfect segregation across ears

- Cherry, Broadbent, Treisman, Kidd, Neff, etc.

18

Dichotic Listening: Procedure

The dichotic procedure was similar to the earlier procedure, except:

1) Talkers were known a priori

- 1 male, 1 female target talker

2) Two talkers (T and M) were presented in the right ear

3) A masking signal was presented in the left ear

19

Dichotic Listening: Results

With 2 talkers in the right ear:

- Noise in the left ear doesn’t interfere (even when loud)

- Speech in the left ear interferes substantially (even when quiet)

- Reversed speech in the left ear interferes, but only when the target in the right ear is lower than the masker in the right ear

20

Binaural Listening: Spatial Separation in Azimuth

From the classic “cocktail party effect”

Spatial separation improves segregation

Diotic vs. 45° separation, same-sex talkers

21

Binaural Listening: Spatial Separation in Distance

22


With natural better-ear SNR cues, both speech and noise benefit from separation in distance

23


With normalization, speech still benefits from separation in distance, but noise does not

24

Dynamic Aspects of Multitalker Listening

Most cocktail-party listening experiments assume either:

1) Target talker is known (“Selective Attention”), or

2) Target talker is unknown (“Divided Attention”)

Real world listening falls in between these extremes

- Attention focused primarily on one talker

- Other talkers monitored for “important” info

How do listeners adapt to conversational dynamics?

25

Dynamic Cocktail Party Effects: Multitalker Transition Probability

Experiment: 3-Talker Condition

1) Standard CRM task

2) 2, 3, or 4 Spatially Separated Same-Sex Talkers

- Close or Far separation for 2 and 3 talkers

3) 5 Transition Probabilities (0-1)

4) 3 Talker Configurations

- Talkers selected randomly

- Each location assigned a talker

- Target talker follows target location

5) Total of 106,200 Trials

- Balanced by Target Talker and Target Location
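
The slides do not define "transition probability" precisely. The sketch below assumes it is the per-trial probability that the target moves to a different location, so that 0 means the target never moves and 1 means it moves on every trial. The function name `target_locations`, its parameters, and the example values are hypothetical.

```python
import random


def target_locations(n_trials, n_locations, p_transition, seed=None):
    """Generate a sequence of target locations, one per trial.

    On each trial after the first, the target moves to a different,
    randomly chosen location with probability p_transition; otherwise
    it stays where it was. Requires n_trials >= 1 and n_locations >= 2.
    """
    rng = random.Random(seed)
    location = rng.randrange(n_locations)
    sequence = [location]
    for _ in range(n_trials - 1):
        if rng.random() < p_transition:
            others = [loc for loc in range(n_locations) if loc != location]
            location = rng.choice(others)
        sequence.append(location)
    return sequence


# Example: 20 trials, 3 talker locations, hypothetical transition probability.
print(target_locations(20, 3, 0.25, seed=1))
```

Under this assumption, a low transition probability approximates a selective-attention task with a stable target location, while a high one forces the listener to keep relocating the target.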

26

Dynamic Cocktail Party Effects: Multitalker Transition Probability

Overall Performance Improves Gradually After Transitions
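
One way to look at recovery after a transition is to group trials by how many trials have passed since the target last changed location and compute the proportion correct in each group. This is a sketch of that bookkeeping only; the function name `accuracy_since_transition` and the short example sequences are hypothetical and are not the analysis or data from the experiment.

```python
from collections import defaultdict


def accuracy_since_transition(locations, correct):
    """Proportion correct as a function of trials since the last transition.

    locations: target location on each trial.
    correct: True/False for each trial.
    Trials before the first transition are skipped, since their distance
    from a transition is undefined. Key 0 is the transition trial itself.
    """
    hits = defaultdict(int)
    counts = defaultdict(int)
    since = None
    for i, (loc, ok) in enumerate(zip(locations, correct)):
        if i > 0 and loc != locations[i - 1]:
            since = 0
        elif since is not None:
            since += 1
        if since is not None:
            counts[since] += 1
            hits[since] += int(ok)
    return {k: hits[k] / counts[k] for k in sorted(counts)}


# Made-up example: two transitions, accuracy recovering after each.
locations = [0, 0, 1, 1, 1, 2, 2, 2]
correct = [True, True, False, True, True, False, False, True]
print(accuracy_since_transition(locations, correct))  # {0: 0.0, 1: 0.5, 2: 1.0}
```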

27

Conclusions

1) Speech-on-Speech ≠ Speech-in-Noise

- Deployment of Auditory Attention is Important

- Signal “similarity” is a major factor

- Spatial separation is particularly beneficial

2) Multitalker Listening is a Dynamic Process

- Listeners adapt to source location changes over 5-8 trials

- Listeners learn new situations quickly (10 trials)

- Listeners adopt optimal listening strategies
