Speech Quality Assessment for Wideband Communication Scenarios
H. W. Gierlich, S. Völl, F. Kettler (HEAD acoustics GmbH)
P. Jax (IND, RWTH Aachen)
Workshop on Wideband Speech Quality in Terminals and Networks
Outline
• General aspects of speech quality in wideband systems
• Subjective evaluations
  - Conversational tests
  - Speech intelligibility
  - Background noise transmission
  - Echo tests
• Summary
[Figure: speech quality, considered in the talking situation, the listening situation and the conversational situation]
Speech Quality Parameters
• From the user's perspective
Auditory parameters contributing to speech quality:
(Speech) sound quality
Quality of background noise transmission
Delay and echo
Double talk capability
Switching and echosingle talk/double talk
Loudness
(System) noise
[Table: narrow band vs. wide band comparison of the parameters above. Wideband entries recoverable from the slide: "different quality perception" (for several parameters, some marked with "?"), "loudness (WB)" and "double talk capability"]
Roadmap for the development of objective measurements
1. Conversational tests → parameter identification (qualitative)
2. Listening-only tests (LOT) → quantitative judgement
3. Development of objective measurement methods to reproduce the results of the LOT
→ Quality evaluation of wideband systems without subjective tests
Conversational Tests
• Purpose: identification of parameters characterizing the communicational quality in wideband systems
• Test conditions:
  - Expert tests
  - "Kandinsky" test
  - 4 wideband codecs under test
  - 3 conditions for each codec: "normal" conversation, with music in office room, with babble in office room
  - "Free answering"
Setup and test procedure
• Setup:
[Setup diagram: HFT 1 in a "normal office" with a PC for background noise (BGN), connected to HFT 2 in a sound-proof cabinet. Labels in the diagram: EC & CC, EC off, G.722, codec A, B, C]
• Test procedure:
  - Different codecs included
  - One echo canceller for all tests
Results
• Sorting the comments in categories: speech sound quality, echo behavior, quality of background noise, others (e.g. noise, clipping)
• Example: speech sound quality

  codec     comments
  G.722     sounds naturally, high dynamic, ...
  AMR-WB    sounds rough, naturally, distorted, ...
  AMR-WB2   high dynamic, hollow, clank, ...
  BWE       rattles, crackles, blunt sound, ...
Conclusion
• Relevant parameters to be studied further:
  - Sound of speech
  - Echo: level, masking, intelligibility
  - Quality of background noise transmission
  - Noise
  - Double talk
  - Switching/clipping
• Design of listening-only tests concerning:
  - Speech intelligibility: narrow band vs. wide band
  - Quality of transmitted background noise
  - Annoyance of echo
Speech intelligibility test
• Sensitive test: logatom test (consonant - vowel - consonant)
• Informal test: 3 x 12 test persons, 29 logatoms
• Test persons note the "word" they understood
Recording & Listening
• Recording:
[Setup diagram: "normal office" and sound-proof cabinet, connected via PCM / AMR-WB / ISDN; PC]
• Listening: test persons listen to the artificial head recordings
[Figure: % correct logatoms (0 to 100) per codec: ISDN, PCM 16 bit, AMR 12.65 kbit/s]
Results
• Increased intelligibility for wideband codecs
14 % ≅ 4 logatoms
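The conversion between the percentage and the logatom count follows from the 29 logatoms used in the test. A minimal sketch of that arithmetic, where the function name is only illustrative:

```python
def logatoms_from_percentage(percent_points: float, n_logatoms: int = 29) -> float:
    """Convert a difference in percent correct into a number of logatoms.

    n_logatoms defaults to the 29 logatoms used in the informal test.
    """
    return percent_points / 100.0 * n_logatoms


gain = logatoms_from_percentage(14)
# 14 % of 29 logatoms is roughly 4 logatoms
print(f"{gain:.2f} logatoms, i.e. about {round(gain)}")
```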
Background noise assessment: music
• Background noise: additional information about the talker's environment
• Tests with:
  - Untrained persons: assessment on a 5-point MOS scale
  - Experts: assessment on a 5-point MOS scale and giving reasons why
• 16 different codecs under test
Background noise tests
• Recording listening samples:
[Setup diagram: HFT 1 in a "normal office" with a PC for background noise (BGN); HFT 2 in a sound-proof cabinet; codec or filter; PC]
• Listening: test persons listen to artificial head recordings
• ACR scale: excellent - good - fair - poor - bad
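As an illustration of how ratings on the 5-point ACR scale are typically condensed into a MOS value, the following sketch computes the mean and an approximate 95 % confidence interval. The ratings are hypothetical and not taken from the tests, and the normal approximation (z = 1.96) is an assumption, not the evaluation method used by the authors.

```python
from math import sqrt
from statistics import mean, stdev

# ACR categories as listed on the slide, mapped to scores 5..1
ACR_LABELS = {5: "excellent", 4: "good", 3: "fair", 2: "poor", 1: "bad"}


def mos_with_ci(ratings, z=1.96):
    """Return (MOS, half-width of an approximate 95 % confidence interval)."""
    if not ratings or any(r not in ACR_LABELS for r in ratings):
        raise ValueError("ratings must be non-empty integers from 1 to 5")
    m = mean(ratings)
    # Normal approximation; with a single rating no interval can be estimated
    half_width = z * stdev(ratings) / sqrt(len(ratings)) if len(ratings) > 1 else 0.0
    return m, half_width


# Hypothetical ratings of one codec by 12 listeners (NOT data from the tests)
example_ratings = [4, 5, 4, 3, 4, 4, 5, 3, 4, 4, 3, 4]
mos, ci = mos_with_ci(example_ratings)
print(f"MOS = {mos:.2f} +/- {ci:.2f}")
```

Two codecs whose intervals do not overlap would be candidates for belonging to different quality levels, although a proper significance test would be needed to confirm this.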
Quality of transmitted background music
[Figure: MOS-LQS (1 to 5) per codec. Codecs on the x-axis: PCM, G722, MP3, G.722.1, G.722-HP, ISDN, G.722-NB, BWE-U, AMR8, G.722-TP, BWE-O, BWE-U+O, AMR-2, IND0, AMR2CN, AMR-0]
Results
• 3 quality levels with significantly different MOS values:
  - Wideband, good "intelligibility" of music
  - Narrowband, good "intelligibility" of music
  - Mostly wideband, bad "intelligibility" of music
Echo annoyance test
• Using hands-free telephones → echo disturbances are a dominant problem
• Investigation of the annoying aspects of echo using wideband links: influence of echo sound, influence of echo level, influence of codec, ...
• Mean one-way transmission time constant for all listening samples: 170 ms
Echo annoyance tests
• Recording:
[Setup diagram: HFT 1 in a "normal office" (EC off, CC off); HFT 2 in a sound-proof cabinet (EC on, CC on); codec or filter; PC; echo]
• Listening: test persons listen to the artificial head recordings → direct speech + echo
Tests & assessment
• Tests with:
  - Untrained persons: assessment on a 5-point MOS scale
  - Experts: assessment on a 5-point MOS scale and giving reasons why
• DCR scale:
  5 - echo is inaudible
  4 - echo is audible, but not annoying
  3 - echo is slightly annoying
  2 - echo is annoying
  1 - echo is very annoying
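To look at how votes spread over the DCR categories rather than only at their mean, a vote count per category can be sketched as follows. The votes are hypothetical and the helper name is illustrative; only the scale wording comes from the slide.

```python
from collections import Counter

# DCR categories as given on the slide
DCR_SCALE = {
    5: "echo is inaudible",
    4: "echo is audible, but not annoying",
    3: "echo is slightly annoying",
    2: "echo is annoying",
    1: "echo is very annoying",
}


def vote_distribution(votes):
    """Count votes per DCR category, ordered from score 5 down to score 1."""
    if any(v not in DCR_SCALE for v in votes):
        raise ValueError("votes must be integers from 1 to 5")
    counts = Counter(votes)
    return {DCR_SCALE[s]: counts.get(s, 0) for s in sorted(DCR_SCALE, reverse=True)}


# Hypothetical votes for one listening sample (NOT data from the tests)
hypothetical_votes = [4, 3, 3, 2, 4, 5, 3, 2]
for label, n in vote_distribution(hypothetical_votes).items():
    print(f"{n:2d}  {label}")
```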
Echo levels
• TCLw acc. ITU-T P.79
• Note: hands-free on both sides, SLR = 7 dB, RLR = 5 dB (including HFT correction of 14 dB) => TELR(max) = 39 dB
→ Investigation of codec and echo level:
  TCLw = 27 dB → "low" echo level
  TCLw = 21 dB → "medium" echo level
  TCLw = 13 dB → "high" echo level
[Setup diagram: HFT 1 in a "normal office" (EC off, CC off); echo; RLR]
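The quoted TELR(max) of 39 dB is consistent with adding the loudness ratings to the terminal coupling loss: 7 dB + 5 dB + 27 dB = 39 dB. The sketch below applies the same sum to all three echo levels. It assumes TELR = SLR + RLR + TCLw; the values for the medium and high echo levels are derived under that assumption and are not stated on the slide.

```python
SLR_DB = 7.0   # send loudness rating from the slide
RLR_DB = 5.0   # receive loudness rating from the slide (incl. HFT correction)

# TCLw values of the three echo levels used in the test
TCLW_DB = {"low": 27.0, "medium": 21.0, "high": 13.0}


def telr(tclw_db: float, slr_db: float = SLR_DB, rlr_db: float = RLR_DB) -> float:
    """Talker echo loudness rating, assuming TELR = SLR + RLR + TCLw."""
    return slr_db + rlr_db + tclw_db


for name, tclw in TCLW_DB.items():
    print(f"{name:>6} echo level: TCLw = {tclw:.0f} dB -> TELR = {telr(tclw):.0f} dB")
```

A lower TELR means a louder echo, so the "high" echo level corresponds to the smallest TELR.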
Results
• Comparison of annoyance by echo level (experts)
→ Differences for echoes with the same echo level
→ Echo masked by direct speech
Echo annoyance
[Figure: MOS-LQS (1 to 5) per codec for high, medium and low echo level. Codecs: PCM, G.722.1, MP3, G722, AMR 12 kbit/s, AMR WB2 12 kbit/s, BWE, ISDN]
Influence of the bandwidth
• Filtering of the echo signal
• Speech modulated noise (two examples)
• Level adjustment to TCLw → "medium" echo level
[Figure: filter frequency responses, level in dB (-30 to 30) over f/Hz (10 to 10000). Curves: wideband; wideband f > 300 Hz; wideband f < 3400 Hz; narrow band; narrow band lv. adj.; low freq. increased; low freq. decreased; high freq. increased; high freq. strongly increased]
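The slides do not say how the speech modulated noise was generated. One common way to obtain such a signal is to impose the envelope of the speech on white noise and then adjust the level. The sketch below does this with a simple one-pole envelope follower; the time constant, the RMS-based level matching and the synthetic stand-in signal are all assumptions made for illustration.

```python
import math
import random


def envelope(signal, fs, tau_s=0.01):
    """Smoothed magnitude envelope (one-pole low-pass; tau_s is an assumption)."""
    alpha = math.exp(-1.0 / (tau_s * fs))
    env, state = [], 0.0
    for x in signal:
        state = alpha * state + (1.0 - alpha) * abs(x)
        env.append(state)
    return env


def rms(signal):
    """Root mean square level of a signal."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))


def speech_modulated_noise(signal, fs, seed=0):
    """White Gaussian noise shaped by the envelope of `signal`, matched in RMS."""
    rng = random.Random(seed)
    noise = [e * rng.gauss(0.0, 1.0) for e in envelope(signal, fs)]
    gain = rms(signal) / rms(noise)  # level adjustment to the input level
    return [gain * n for n in noise]


fs = 16000  # wideband sampling rate in Hz
# Synthetic stand-in for speech: a 200 Hz tone with a slow syllable-like modulation
stand_in = [
    math.sin(2 * math.pi * 4 * n / fs) ** 2 * math.sin(2 * math.pi * 200 * n / fs)
    for n in range(fs)
]
echo = speech_modulated_noise(stand_in, fs)
print(f"input RMS = {rms(stand_in):.4f}, modulated noise RMS = {rms(echo):.4f}")
```

The result keeps the temporal structure of the input but carries no intelligible content. In an actual test the level would be set relative to the direct speech according to the chosen TCLw rather than matched to the input.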
echo annoyance: dependency on bandwidth
[Figure: MOS-LQS (1 to 5) per variant for trained and untrained test persons. Variants: wide-band; wide-band f > 300 Hz; wide-band f < 3400 Hz; narrow-band; narrow-band lv. adj.; low freq. increased; low freq. decreased; high freq. increased; high freq. strongly increased; echo = modulated with noise; echo = noise]
Results
significant differences between experts and untrained test persons
Results echo annoyance
• High frequencies → very annoying
• Wide-band and low freq. → slightly annoying
• Experts more critical than untrained test persons
• Speech modulated noisy echo:
  - Experts: very annoying → no advantages
  - Untrained: felt insecure
Summary (1)
• Background noise transmission - relevant aspects:
  - Bandwidth
  - "Intelligibility" / brightness / low distortion (small difference to the original)
• Echo annoyance - relevant aspects:
  - Level
  - Masking properties
  - Distortion and frequency characteristics
• Additional parameters to be investigated subjectively:
  - Noise
  - Switching/clipping
  - Double talk behavior