The Cocktail Party Effect. An inclusive vision of conversational interactions.
Isabella Loddo, Università Iuav di Venezia, NTT Data Italia
Dario Martini, Università degli Studi della Repubblica di San Marino
Myopia
Diversity
Different People
Different Devices
Interaction Model
Sullivan & Igoe, 2004
Model
Reality
Impairment
Microsoft, 2016
Captchas
Testing
Benchmarking Reaction
Testing + Feedback
Assistive devices
“For most of us, technology makes things easier. For a person with disability, it makes things possible.”
Judy Heumann
Assistive Technologies
Comparison
Braille: bulky, fragile, expensive
Screen readers: embedded, consistent, cheap, higher adoption
Talking Machines
[Timeline: talking machines, with markers at 1889, 1927 and 1939, labelled "recording" and "synthesis"]
Speech Synthesis
Formant speech synthesis: no database required, artificial sounding, more intelligible than human speech
Concatenative speech synthesis: large database, natural sounding, as intelligible as human speech
Expression
Prosody
phrasing
pitch
loudness
tempo
rhythm
Liu, 2006
Hewlett et al, 2006
talking, shouting
Oliveira, 2012
enhance storytelling
express own speech style
Dlugan, 2012
Hammen et al, 1994
Intelligibility
Trouvain, 2007
[Chart labelled "Average": comprehension (75 to 100) against rate (2 to 18 syllables / second), for concatenative and formant synthesis]
Blind
Speech Rates
Asakawa et al, 2003
[Chart: speech rate in wpm (0 to 500), with 50% and 80% values for sighted and blind listeners, and the software avg. rate and software peak rate marked; annotations: 1.6 X, 2.8 X]
Relevant Scanning
On the web, scanning tasks are more frequent than linear readings.
[Figure values: 75%, 50%, 15%]
Mobility
Mobile devices have become the main usage scenario.
[Chart: mobile screen reader usage (%), 0 to 100, 2009 to 2016]
Situated listening
Users need to distinguish relevant information in real environments.
Cocktail party effect
1953
Guerreiro et al, 2015
Concurrent voices
[Chart: comprehension (0 to 100) against information bandwidth (1 X to 6 X), for 1 voice, 2 voices and 3 voices]
Concurrent voices
information bandwidth is increased equally in sighted and blind subjects
the maximum increase also depends on age
voice differences are preferred, but not essential
concurrent voices are very promising for web searches, news, assisted navigation and multi-touch simulations
Guerreiro et al, 2016
From Reading to Conversation
Word Error Rate: -18% / year, thanks to artificial intelligence and cloud computing
[Chart: word error rate (0 to 25), 2008 to 2017, with the introduction of deep learning and the human level marked]
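The slide tracks Word Error Rate without defining it. As a reminder of the usual definition, WER is the number of word substitutions, deletions and insertions needed to turn the recognised text into the reference, divided by the number of words in the reference. The sketch below is illustrative only: the function name `word_error_rate` and the sample sentences are assumptions made for this example, not material from the talk.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn i reference words into nothing
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one substituted word and one dropped word
# against a six-word reference.
example_wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
```

Because the denominator is the reference length, a hypothesis with many inserted words can score above 1.0.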
Mutual Recognition
“Normal people, when they think about speech recognition, they want the whole thing. They want recognition, they want understanding and they want an action to be taken.”
Hsiao-Wuen Hon, Microsoft Research
Meaning
Mental models
screen reader: tool-like
vs
voice assistant: human-like
Qvarfordt, 2004
Black Mirror, 2013
Human-like
Human-like?
Affective Interaction
2013
Conclusions
1. Screen Readers are the most popular assistive technology
2. Naturalness has improved, but blind users still prefer formant synthesis
3. Screen Readers are slow in relevant scanning tasks
4. Screen Readers can be improved by applying the Cocktail Party Effect
5. Voice spatiality and formant synthesis can improve mobile experience
Scenario
Doctor
Crossing
Tickets
Parking
Vegetables
Bank
Thank you
Questions?
ISABELLA LODDO, Università Iuav di Venezia, NTT DATA Italia
DARIO MARTINI, Università degli Studi della Repubblica di San Marino