SOLODUCHA TELEKOMINNOVLAB final.ppt - ETSI · 2012. 11. 27. · Title: Microsoft PowerPoint -...
Life is for sharing.
Recent activities on speech quality assessment
Michal Soloducha, Janto Skowronek, prof. Alexander Raake, prof. Sebastian Möller
Assessment of IP-based Applications
TU Berlin, Telekom Innovation Laboratories
ETSI TC STQ Workshop, Vienna, 27th Nov. 2012
Overview
1. Wideband E-Model
2. Evaluation of full-reference models in a test network
3. Multiparty conferencing quality
1. Wideband E-model
P.564 (conformance testing)
G.107 (NB E-model)
G.107.1 (WB E-model)
ITU-T G.107 (E-model):
Non-intrusive
Parameter-based
[Diagram: Source video signal (SRC), Transmission-System, Bitstream / Parameters, Model, Estimated quality index, Subjective quality-rating]
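WB E-model (ITU-T Rec. G.107.1)
Wideband scale extension & framework
$R = R_{o,WB} - I_{s,WB} - I_{d,WB} - I_{e,eff,WB} + A$
$R_{o,WB}$: SNR-related base quality
$I_{s,WB}$: simultaneous impairments
$I_{d,WB}$: delayed impairments
$I_{e,eff,WB}$: codec (incl. packet loss)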
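The additive structure of the wideband rating, read here as R = Ro,WB − Is,WB − Id,WB − Ie,eff,WB + A, can be sketched in Python. The function name and the example values are illustrative assumptions, not values taken from G.107.1.

```python
def wb_transmission_rating(ro_wb, is_wb, id_wb, ie_eff_wb, a=0.0):
    """Combine the wideband impairment terms into a transmission rating R.

    R = Ro,WB - Is,WB - Id,WB - Ie,eff,WB + A
    All arguments are on the R scale; `a` is the advantage factor A.
    """
    return ro_wb - is_wb - id_wb - ie_eff_wb + a


# Made-up example values, chosen only to show the arithmetic.
r = wb_transmission_rating(ro_wb=100.0, is_wb=2.0, id_wb=5.0, ie_eff_wb=10.0)
print(r)
```

Because the terms are simply added, each impairment can be estimated separately and combined afterwards.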
ITU-T Rec. G.107 (1999 – …)
Tool for network-planning
NB (300-3400 Hz):
WB (50-7000 Hz):
WB E-model (ITU-T G.107.1)
Included in ITU-T G.107.1:
Noise
Codecs
Packet loss
Missing aspect: User interfaces
Example: Electro-acoustic interfaces
In addition: Noise & echo cancelling
Noise – NB vs. WB E-Model
Two listening tests
24 subjects
6 hidden anchors
Variables
Send Loudness Rating (SLR)
Circuit noise Nc
Noise floor Nfor
$R_{o,NB} = 15 - 1.5 \cdot (SLR + N_o)$
Noise floor Nfor
Ambient noise at send side (Ps)
$R_{o,WB} = 20 - 1.5 \cdot (SLR + N_o)$
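A minimal sketch of the two base-quality formulas, reading the slide as Ro = 15 − 1.5·(SLR + No) for NB and Ro = 20 − 1.5·(SLR + No) for WB. The function names and the example values for SLR and No are assumptions made for illustration only.

```python
def ro_nb(slr, no):
    """NB base quality: Ro = 15 - 1.5 * (SLR + No)."""
    return 15.0 - 1.5 * (slr + no)


def ro_wb(slr, no):
    """WB base quality as read from the slide: Ro = 20 - 1.5 * (SLR + No)."""
    return 20.0 - 1.5 * (slr + no)


# Hypothetical inputs: SLR in dB, overall noise level No in dB.
slr, no = 8.0, -60.0
print(ro_nb(slr, no), ro_wb(slr, no))
```

With this reading, the two versions differ only by a constant offset of 5 on the R scale for the same SLR and No.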
Bandwidth Model (not included in E-Model)
[Raake, 2006]
$z_{bw}$ … Bandwidth [Bark]
$f_c$ … Center-frequency [Hz]
Equivalent rectangular bandwidth
$z_{bw} = \frac{\mathrm{area}\left(20\log_{10}|H(e^{j\Omega})| + K\right)}{\max\left(20\log_{10}|H(e^{j\Omega})| + K\right)}$
$z_l = z_G - z_{bw}/2, \quad z_u = z_G + z_{bw}/2$
$f_j = g(z_j)$
$f_c = \sqrt{f_l \cdot f_u}$
(Zwicker & Fastl 1999)
(Raake 2006)
[Figure: $20\log_{10}(|H|)$ over the Bark scale, with $z_l$, $z_G$ and $z_u$ marked]
Packet loss: NB vs. WB E-Model
Ie: Equipment impairment factor (codec specific)
Ppl: Packet-loss rate [%]
Bpl: Packet-loss robustness (codec/PLC specific)
Effective Equipment Impairment Factor, NB E-Model:
$I_{e,eff} = I_e + (95 - I_e) \cdot \frac{Ppl}{Ppl + Bpl}$
ITU-T G.113 App. I
WB E-Model: Least-square curve-fitting of test data with
$I_{e,wb,eff} = I_{e,wb} + (95 - I_{e,x}) \cdot \frac{Ppl}{Ppl + Bpl}$
where:
$I_{e,x} = I_e$ if NB codec, $I_{e,x} = I_{e,wb}$ if WB codec
ITU-T G.113 App. VI
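The packet-loss formulas can be sketched as follows, assuming the NB form Ie,eff = Ie + (95 − Ie)·Ppl/(Ppl + Bpl) and the WB form Ie,wb,eff = Ie,wb + (95 − Ie,x)·Ppl/(Ppl + Bpl). The function names and all numeric inputs are hypothetical; real Ie and Bpl values come from the ITU-T G.113 appendices.

```python
def ie_eff_nb(ie, ppl, bpl):
    """NB effective equipment impairment: Ie + (95 - Ie) * Ppl / (Ppl + Bpl)."""
    return ie + (95.0 - ie) * ppl / (ppl + bpl)


def ie_eff_wb(ie_wb, ie_x, ppl, bpl):
    """WB variant: Ie,wb + (95 - Ie,x) * Ppl / (Ppl + Bpl).

    ie_x is Ie for an NB codec and Ie,wb for a WB codec.
    """
    return ie_wb + (95.0 - ie_x) * ppl / (ppl + bpl)


# Hypothetical codec values (not taken from G.113): Ie = 0, Bpl = 4, Ppl = 1 %.
print(ie_eff_nb(ie=0.0, ppl=1.0, bpl=4.0))
# Hypothetical NB codec rated on the WB scale: Ie,wb = 36, Ie,x = Ie = 0.
print(ie_eff_wb(ie_wb=36.0, ie_x=0.0, ppl=1.0, bpl=4.0))
```

At Ppl = 0 both functions return the codec-only impairment. A larger Bpl flattens the increase with packet loss.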
2. Evaluation of full-reference models in a test network
Full Reference models:
ITU-T P.862: PESQ
ITU-T P.863: POLQA
[Diagram: Source signal (SRC), Reference, Transmission-System, Degraded signal, Model, Estimated quality index, Subjective quality-rating]
Setup configuration
Clients
PJSUA - open-source command line VoIP client with PLC, EC and VAD algorithms
Codecs:
G.711
G.722
QoS cases:
with QoS - using DTAG VoIP service
best effort – no VoIP traffic prioritisation
Traffic load (on the 1 Gbit/s link):
UDP load: 0, 200, 400 [Mbps]
TCP load: 0, 250, 450 [Mbps]
Measurements in test network – language dependency
Configuration: PLC, EC, VAD enabled
NB – G.711
WB – G.722
[Figure: results per condition; packet-loss labels: 0.50%, 0.00%, 0.57%, 0.00%]
Measurements in test network – language dependency
Configuration: PLC, EC, VAD enabled
NB – G.711
WB – G.722
[Figure: results per condition; packet-loss labels: 0.50%, 0.00%, 0.57%, 0.00%]
3. Multiparty conferencing quality - Test Setup
12 Conditions
Number of interlocutors (reflecting Communication Complexity)
#IL = 2, 3, 4, 6
Audio reproduction method (reflecting Technical System Capability) via headphones
SndRepr =
1. Narrowband – non-spatial = 0.3-3.4 kHz, diotic
2. Fullband – non-spatial = 0-20 kHz, diotic
3. Fullband – Spatial audio with headtracking = 0-20 kHz, dichotic
[ SoundScapeRenderer, Geier2008 ]
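The 12 conditions follow from crossing the four #IL values with the three SndRepr settings. A short sketch of that enumeration is shown here; the variable names and the shortened labels are illustrative choices.

```python
from itertools import product

# Number of interlocutors (#IL) and sound reproduction methods (SndRepr).
interlocutors = (2, 3, 4, 6)
sound_reproduction = (
    "NB non-spatial (0.3-3.4 kHz, diotic)",
    "FB non-spatial (0-20 kHz, diotic)",
    "FB spatial with headtracking (0-20 kHz, dichotic)",
)

# Full crossing of both factors: 4 x 3 = 12 test conditions.
conditions = list(product(interlocutors, sound_reproduction))

for il, snd in conditions:
    print(f"#IL={il}, SndRepr={snd}")
```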
[Skowronek]
Analysis 5: SndRepr ⇒ Overall Quality of Experience
Higher Technical System Capability
⇒ higher Overall Quality
ANOVA: significant for all measures
PostHoc: for almost all pairs
Coding of variables:
High values = high quality
[Skowronek]
Analysis 6: #IL ⇒ Overall Quality of Experience
ANOVA: significant for all measures
PostHoc: for OvQual: 2-6
for Satisfac: 2-3, 2-4, 2-6
for Pleasant: 2-3, 2-4, 2-6
for Accept: 2-4, 2-6
Coding of variables:
High values = high quality
Higher Communication Complexity
⇒ lower Overall Quality
[Skowronek]
Summary
Technical System Capability (Audio Reproduction Method): Spatial audio
Communication Complexity (Number of Interlocutors): 2-party vs. multiparty
Speech Communication Quality
Cognitive Load
Overall Quality of Experience
[Skowronek]
Thank you for your attention!
Backup.
Noise – handling by NB E-model G.107
Circuit & ambient noise levels referred to “0 dBr-point”
Noise levels for different classes of relevant noise sources
Nc ≡ power sum of all circuit noise sources
Nos, Nor ≡ transformed ambient noise levels Ps & Pr (send & receive side) ETR 250
Nfo = Nfor + RLR, i.e. transformed “noise floor” Nfor ≡ subscriber line noise
Attenuation of sound pressure at send & receive side
Send, Receive & Overall Loudness Ratings (SLR, RLR, & OLR)
Overall noise level
$N_o = 10 \log_{10}\left(10^{N_c/10} + 10^{N_{os}/10} + 10^{N_{or}/10} + 10^{N_{fo}/10}\right)$
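The overall noise level is a power sum of the four noise components, No = 10·log10(10^(Nc/10) + 10^(Nos/10) + 10^(Nor/10) + 10^(Nfo/10)), with Nfo = Nfor + RLR. The sketch below assumes all levels are given in dB relative to the same reference point. The function names and the example levels are illustrative assumptions.

```python
import math


def noise_floor_at_reference(nfor, rlr):
    """Nfo = Nfor + RLR (transformed noise floor)."""
    return nfor + rlr


def overall_noise_level(nc, nos, nor, nfo):
    """Power sum of circuit noise, transformed ambient noise and noise floor."""
    power_sum = sum(10.0 ** (level / 10.0) for level in (nc, nos, nor, nfo))
    return 10.0 * math.log10(power_sum)


# Hypothetical levels in dB, chosen only to show the arithmetic.
nfo = noise_floor_at_reference(nfor=-64.0, rlr=2.0)
print(overall_noise_level(nc=-70.0, nos=-75.0, nor=-80.0, nfo=nfo))
```

Because the levels are summed as powers, the loudest component dominates the result. Four equal components raise the level by about 6 dB over a single one.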
Codecs & bandwidth impairment
Assumption
[Block diagram: x(k), linear subsystem h(k), non-linear subsystem, s(k), n(k), summing node, y(k)]
Residual impairment Ires
Impairment due to non-linear part
$I_{res} = I_{e,WB} - I_{bw}$
$H(e^{j\Omega}) = \frac{\Phi_{xy}(e^{j\Omega})}{\Phi_{xx}(e^{j\Omega})}$, with $\Omega = 2\pi \frac{f}{f_S}$
Tabulated Impairment Factors
Codecs: Residual impairment & tandeming
Residual impairment: $I_{res} = I_{e,WB} - I_{bw}$
Codec tandems: $I_{e,WB,total} = I_{bw} + I_{res,1} + I_{res,2}$
Codec tandems & add. linear distortion: $I_{e,WB,total} = I_{bw} + \sum_i I_{res,i}$
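A small sketch of the tandeming rule, reading the slide as Ires = Ie,WB − Ibw per codec and Ie,WB,total = Ibw + Σ Ires,i for the chain. The function names and the numeric values are hypothetical, not tabulated impairment factors.

```python
def residual_impairment(ie_wb, ibw):
    """Ires = Ie,WB - Ibw: the part of a codec's impairment not due to bandwidth."""
    return ie_wb - ibw


def tandem_impairment(ibw, residuals):
    """Ie,WB,total = Ibw + sum of the residual impairments of all codecs in the chain."""
    return ibw + sum(residuals)


# Hypothetical values: one bandwidth impairment for the whole chain, two codecs.
ibw = 10.0
ires_1 = residual_impairment(ie_wb=15.0, ibw=ibw)
ires_2 = residual_impairment(ie_wb=17.0, ibw=ibw)
print(tandem_impairment(ibw, [ires_1, ires_2]))
```

The bandwidth impairment is counted once for the chain, while the residual impairments of the individual codecs add up.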
Measurements in test network – language dependency
NB – G.711
WB – G.722
[Figure: results per condition; packet-loss labels: 0.85%, 0.00%, 0.95%, 0.00%]
Measurements in test network – language dependency
NB – G.711
WB – G.722
[Figure: results per condition; packet-loss labels: 0.00%, 0.00%, 0.00%, 0.00%]
Multiparty conferencing quality
System or Situational Aspects:
Technical System Capability (Audio Reproduction Method)
Communication Complexity (Number of Interlocutors)
Assessment Results:
Speech Communication Quality
Cognitive Load
Overall Quality of Experience
[Diagram: the links between aspects and results are marked "????"]
[Skowronek]
Subject pool & Analysis method
Subject pool:
10 Female, 15 Male, Age 23 - 43
All experienced with multi-party telephone conversations
Analysis method:
Error-bar plots to visualize directions of effects, grouped along SndRepr and #IL
One-way repeated-measures ANOVAs: 12 measures as function of SndRepr and #IL
PostHoc tests (estimated marginal means) for pairwise comparisons
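As an illustration of the one-way repeated-measures ANOVA named above, the following sketch computes the F statistic from a subjects-by-conditions table using the standard sums-of-squares decomposition. It is not the authors' analysis script. The function name and the toy ratings are made up for the example, and no p-value or post-hoc test is computed.

```python
def rm_anova_f(scores):
    """One-way repeated-measures ANOVA F statistic.

    scores[i][j] is the rating of subject i under condition j.
    Returns (F, df_condition, df_error).
    """
    n = len(scores)       # number of subjects
    k = len(scores[0])    # number of conditions
    grand = sum(sum(row) for row in scores) / (n * k)
    cond_means = [sum(row[j] for row in scores) / n for j in range(k)]
    subj_means = [sum(row) / k for row in scores]

    ss_cond = n * sum((m - grand) ** 2 for m in cond_means)
    ss_subj = k * sum((m - grand) ** 2 for m in subj_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    # Variation left after removing condition and subject effects.
    ss_error = ss_total - ss_cond - ss_subj

    df_cond = k - 1
    df_error = (k - 1) * (n - 1)
    f_value = (ss_cond / df_cond) / (ss_error / df_error)
    return f_value, df_cond, df_error


# Toy data: 3 subjects rated under 3 conditions (invented, not from the study).
toy = [[1, 2, 4], [2, 3, 4], [3, 5, 5]]
print(rm_anova_f(toy))
```

Removing the between-subject variation from the error term is what distinguishes the repeated-measures design from an ordinary one-way ANOVA.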
[Skowronek]