SOLODUCHA TELEKOMINNOVLAB final.ppt - ETSI · 2012. 11. 27. · Title: Microsoft PowerPoint -...


Life is for sharing.

Recent activities on speech quality assessment

Michal Soloducha, Janto Skowronek, prof. Alexander Raake, prof. Sebastian Möller

Assessment of IP-based Applications

TU Berlin, Telekom Innovation Laboratories

ETSI TC STQ Workshop, Vienna, 27th Nov. 2012

Overview

1. Wideband E-Model

2. Evaluation of full-reference models in a test network

3. Multiparty conferencing quality


1. Wideband E-model

P.564 (conformance testing)

G.107 (NB E-model)

G.107.1 (WB E-model)

ITU-T G.107 (E-model): non-intrusive, parameter-based model

[Figure: block diagram. Labels: Transmission-System, Source video signal (SRC), Subjective quality-rating, Bitstream / Parameters, Model, Estimated quality index]

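WB E-model (ITU-T Rec. G.107.1)

Wideband scale extension & framework

$R = R_{o,WB} - I_{s,WB} - I_{d,WB} - I_{e,eff,WB} + A$

$R_{o,WB}$: SNR-related base quality

$I_{s,WB}$: simultaneous impairments

$I_{d,WB}$: delayed impairments

$I_{e,eff,WB}$: codec (incl. packet loss)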
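The additive structure of the rating can be sketched in a few lines. This is a minimal illustration, not the G.107.1 reference implementation: the function name `wb_rating` and all numeric inputs are assumptions chosen only to show how the terms combine.

```python
def wb_rating(ro_wb, is_wb, id_wb, ie_eff_wb, advantage=0.0):
    """Additive wideband rating: R = Ro - Is - Id - Ie,eff + A."""
    return ro_wb - is_wb - id_wb - ie_eff_wb + advantage


# Hypothetical impairment values, for illustration only.
r = wb_rating(ro_wb=120.0, is_wb=2.0, id_wb=5.0, ie_eff_wb=13.0)
print(r)
```

Each impairment factor lowers the base quality independently, which is what makes the model usable as a planning tool: one term can be changed without recomputing the others.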


ITU-T Rec. G.107 (1999 – …)

Tool for network-planning

NB (300-3400 Hz):

WB (50-7000 Hz):

WB E-model (ITU-T G.107.1)

Included in ITU-T G.107.1:

Noise

Codecs

Packet loss

Missing aspect: User interfaces

Example: Electro-acoustic interfaces

In addition: Noise & echo cancelling


Noise – NB vs. WB E-Model

Two listening tests

24 subjects

6 hidden anchors

Variables

Send Loudness Rating (SLR)

Circuit noise Nc

Noise floor Nfor

Ambient noise at send side (Ps)

$R_{o,NB} = 15 - 1.5 \cdot (SLR + N_o)$

$R_{o,WB} = 20 - 1.5 \cdot (SLR + N_o)$
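The two base-quality formulas differ only in their constant term, which a short sketch makes explicit. The function name `base_quality` and the input values are assumptions for illustration; the constants 15 and 20 are taken as read from the slide's NB and WB formulas.

```python
def base_quality(slr, no, wideband=False):
    """SNR-related base quality: Ro = c - 1.5 * (SLR + No).

    c = 15 for the NB E-model and c = 20 for the WB variant,
    as read from the two formulas on the slide.
    """
    offset = 20.0 if wideband else 15.0
    return offset - 1.5 * (slr + no)


# Illustrative inputs (assumed, not taken from the slides).
slr, no = 8.0, -61.0
ro_nb = base_quality(slr, no)
ro_wb = base_quality(slr, no, wideband=True)
print(ro_nb, ro_wb)
```

For the same loudness rating and noise level, the WB base quality sits a constant 5 points above the NB one under this reading.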


Bandwidth Model (not included in E-Model)

[Raake, 2006]

… Bandwidth [Bark]

… Center-frequency [Hz]

Equivalent rectangular bandwidth

(Zwicker & Fastl 1999)

(Raake 2006)

[Equations: the bandwidth $z_{bw}$ is defined from the level response $20 \log_{10}|H(e^{j\Omega})|$ (its area relative to its maximum); the band edges are $z_l = z_G - z_{bw}/2$ and $z_u = z_G + z_{bw}/2$; the center frequency $f_c$ is computed from $f_l$ and $f_u$.]

Packet loss – NB vs. WB E-Model

Ie: Equipment impairment factor (codec specific)

Ppl: Packet-loss rate [%]

Bpl: Packet-loss robustness (codec/PLC specific)

Effective Equipment Impairment Factor, NB E-Model (ITU-T G.113 App. I):

$I_{e,eff} = I_e + (95 - I_e) \cdot \frac{Ppl}{Ppl + Bpl}$

WB E-Model: least-squares curve fitting of test data with

$I_{e,wb,eff} = I_{e,wb} + (95 - I_{e,x}) \cdot \frac{Ppl}{Ppl + Bpl}$

where $I_{e,x} = I_e$ for an NB codec and $I_{e,x} = I_{e,wb}$ for a WB codec

ITU-T G.113 App. VI
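The NB and WB packet-loss formulas share one shape, so they can be sketched with a single helper. This is a minimal sketch under stated assumptions: the function names are hypothetical, and the Ie, Bpl and Ppl values are invented for illustration rather than taken from ITU-T G.113.

```python
def effective_impairment(ie_base, ie_x, ppl, bpl):
    """Ie,eff = ie_base + (95 - ie_x) * Ppl / (Ppl + Bpl)."""
    return ie_base + (95.0 - ie_x) * ppl / (ppl + bpl)


def ie_eff_nb(ie, ppl, bpl):
    """NB E-model: base value and slope reference are both Ie."""
    return effective_impairment(ie, ie, ppl, bpl)


def ie_wb_eff(ie_wb, ppl, bpl, ie_nb=None):
    """WB scale: Ie,x is Ie for an NB codec (pass ie_nb), else Ie,wb."""
    ie_x = ie_wb if ie_nb is None else ie_nb
    return effective_impairment(ie_wb, ie_x, ppl, bpl)


# Hypothetical codec values, for illustration only.
print(ie_eff_nb(ie=10.0, ppl=10.0, bpl=10.0))                 # NB codec, NB scale
print(ie_wb_eff(ie_wb=45.0, ppl=10.0, bpl=10.0, ie_nb=10.0))  # NB codec, WB scale
print(ie_wb_eff(ie_wb=15.0, ppl=10.0, bpl=10.0))              # WB codec, WB scale
```

With no packet loss (Ppl = 0) every variant reduces to the codec's own impairment factor; a larger Bpl flattens the growth with loss rate.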


2. Evaluation of full-reference models in a test network

Full Reference models:

ITU-T P.862: PESQ

ITU-T P.863: POLQA

[Figure: block diagram. Labels: Transmission-System, Source signal (SRC), Degraded signal, Subjective quality-rating, Reference, Model, Estimated quality index]

Setup configuration

Clients

PJSUA - open-source command line VoIP client with PLC, EC and VAD algorithms

Codecs:

G.711

G.722

QoS cases:

with QoS - using DTAG VoIP service

best effort – no VoIP traffic prioritisation

Traffic load (on the 1 Gbit/s link):

UDP load: 0, 200, 400 [Mbps]

TCP load: 0, 250, 450 [Mbps]


Measurements in test network – language dependency

Configuration:

PLC, EC, VAD enabled

NB – G.711

WB – G.722

Packet loss: 0.50 %, 0.00 %, 0.57 %, 0.00 %

[Figures: result plots on two consecutive slides with identical captions]

3. Multiparty conferencing quality - Test Setup

12 Conditions

Number of interlocutors (reflecting Communication Complexity)

#IL = 2, 3, 4, 6

Audio reproduction method (reflecting Technical System Capability) via headphones

SndRepr =

1. Narrowband – non-spatial = 0.3-3.4 kHz, diotic

2. Fullband – non-spatial = 0-20 kHz, diotic

3. Fullband – Spatial audio with headtracking = 0-20 kHz, dichotic

[ SoundScapeRenderer, Geier2008 ]
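The count of 12 conditions matches a full crossing of the four #IL levels with the three SndRepr levels. The sketch below enumerates that grid; it assumes a full factorial design, which the slide's count suggests but does not state, and the variable names are illustrative.

```python
from itertools import product

# Factor levels as listed on the slide.
interlocutors = [2, 3, 4, 6]
sound_reproduction = [
    "Narrowband, non-spatial (0.3-3.4 kHz, diotic)",
    "Fullband, non-spatial (0-20 kHz, diotic)",
    "Fullband, spatial audio with headtracking (0-20 kHz, dichotic)",
]

# Assumed full factorial crossing of both factors.
conditions = list(product(interlocutors, sound_reproduction))

for number, (il, snd) in enumerate(conditions, start=1):
    print(f"{number:2d}: #IL = {il}, SndRepr = {snd}")
```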


[Skowronek]

Analysis 5: SndRepr ⇒ Overall Quality of Experience

Higher Technical System Capability ⇒ higher Overall Quality

ANOVA: significant for all measures

PostHoc: for almost all pairs

Coding of variables: High values = high quality

[Skowronek]

Analysis 6: #IL ⇒ Overall Quality of Experience

ANOVA: significant for all measures

PostHoc: for OvQual: 2-6

for Satisfac: 2-3, 2-4, 2-6

for Pleasant: 2-3, 2-4, 2-6

for Accept: 2-4, 2-6

Coding of variables: High values = high quality

Higher Communication Complexity ⇒ lower Overall Quality

[Skowronek]

Summary

[Figure: Technical System Capability (Audio Reproduction Method) and Communication Complexity (Number of Interlocutors) related to Speech Communication Quality, Cognitive Load, and Overall Quality of Experience. Labels: Spatial audio; 2-party vs. multiparty]

[Skowronek]

Thank you for your attention!


Backup.


Noise – handling by NB E-model G.107

Circuit & ambient noise levels referred to “0 dBr-point”

Noise levels for different classes of relevant noise sources

Nc ≡ power sum of all circuit noise sources

Nos, Nor ≡ transformed ambient noise levels Ps & Pr (send & receive side) ETR 250

Nfo = Nfor + RLR, i.e. transformed “noise floor” Nfor ≡ subscriber line noise

Attenuation of sound pressure at send & receive side

Send, Receive & Overall Loudness Ratings (SLR, RLR, & OLR)

Overall noise level

$N_o = 10 \log_{10}\left(10^{N_c/10} + 10^{N_{os}/10} + 10^{N_{or}/10} + 10^{N_{fo}/10}\right)$
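The overall noise level is a power sum of the four noise sources in the decibel domain, as the following sketch shows. The function name and the input levels are assumptions for illustration; the transformed ambient noise levels Nos and Nor are taken as given inputs, since their derivation from Ps and Pr is not shown on the slide.

```python
import math


def overall_noise_level(nc, nos, nor, nfor, rlr):
    """Power sum of all noise sources, referred to the 0 dBr point."""
    nfo = nfor + rlr  # transformed noise floor: Nfo = Nfor + RLR
    total_power = sum(10 ** (level / 10) for level in (nc, nos, nor, nfo))
    return 10 * math.log10(total_power)


# Illustrative levels (assumed): four sources of equal power.
no = overall_noise_level(nc=-70.0, nos=-70.0, nor=-70.0, nfor=-72.0, rlr=2.0)
print(round(no, 2))
```

Four equally strong sources raise the total by 10·log10(4), about 6 dB, above any single source; when one source dominates, No is close to that source's level.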


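Codecs & bandwidth impairment

Assumption

[Figure: signal model with a linear subsystem h(k) and a non-linear subsystem; signals x(k), s(k), n(k), y(k)]

Residual impairment Ires

Impairment due to non-linear part

$I_{res} = I_{e,WB} - I_{bw}$

$H(e^{j\Omega}) = \frac{\Phi_{xy}(e^{j\Omega})}{\Phi_{xx}(e^{j\Omega})}$, with $\Omega = 2\pi \frac{f}{f_S}$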
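The transfer-function estimate H = Φxy / Φxx can be sketched with a plain DFT. This is a minimal sketch under simplifying assumptions that are not from the slides: a single signal block, a noise-free linear subsystem, and circular convolution, so that the estimate reproduces the true response exactly. All function names and the two-tap impulse response are hypothetical.

```python
import cmath
import math
import random


def dft(signal):
    """Naive O(N^2) discrete Fourier transform, adequate for short blocks."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def circular_filter(x, h):
    """Circularly convolve x with the impulse response h (linear subsystem)."""
    n = len(x)
    return [sum(h[m] * x[(t - m) % n] for m in range(len(h))) for t in range(n)]


def estimate_transfer_function(x, y):
    """H = Phi_xy / Phi_xx per frequency bin, from input x and output y."""
    spec_x, spec_y = dft(x), dft(y)
    phi_xy = [xk.conjugate() * yk for xk, yk in zip(spec_x, spec_y)]
    phi_xx = [abs(xk) ** 2 for xk in spec_x]
    return [pxy / pxx for pxy, pxx in zip(phi_xy, phi_xx)]


rng = random.Random(0)
x = [rng.uniform(-1.0, 1.0) for _ in range(64)]  # test input
h = [0.5, 0.5]                                   # assumed two-tap averaging filter
y = circular_filter(x, h)
H = estimate_transfer_function(x, y)
print(abs(H[0]), abs(H[16]))
```

In practice the spectra would be averaged over many blocks so that uncorrelated noise n(k) and the non-linear part drop out of Φxy; only the linear subsystem remains in the estimate.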

Tabulated Impairment Factors

Codecs: Residual impairment & tandeming

Residual impairment: $I_{res} = I_{e,WB} - I_{bw}$

Codec tandems: $I_{e,WB,total} = I_{bw} + I_{res,1} + I_{res,2}$

Codec tandems & add. linear distortion: $I_{e,WB,total} = I_{bw} + \sum_i I_{res,i}$
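The tandeming rule adds the residual impairments of the individual codecs to a single bandwidth impairment. The sketch below illustrates this; the function names and all numeric values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def residual_impairment(ie_wb, ibw):
    """Ires = Ie,WB - Ibw: the part of a codec's impairment not due to bandwidth."""
    return ie_wb - ibw


def total_impairment(ibw, residuals):
    """Ie,WB,total = Ibw + sum of the residual impairments of all codecs."""
    return ibw + sum(residuals)


# Illustrative values (assumed): two codecs in tandem sharing one bandwidth impairment.
ibw = 10.0
codec_ie_wb = [25.0, 18.0]
residuals = [residual_impairment(ie, ibw) for ie in codec_ie_wb]
print(residuals)
print(total_impairment(ibw, residuals))
```

The bandwidth impairment is counted once, so a tandem of two codecs costs less than the plain sum of their tabulated Ie,WB values.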

Measurements in test network – language dependency

NB – G.711

WB – G.722

Packet loss: 0.85 %, 0.00 %, 0.95 %, 0.00 %

[Figure: result plot]

Measurements in test network – language dependency

NB – G.711

WB – G.722

Packet loss: 0.00 %, 0.00 %, 0.00 %, 0.00 %

[Figure: result plot]

Multiparty conferencing quality

[Figure: System or Situational Aspects (Technical System Capability (Audio Reproduction Method); Communication Complexity (Number of Interlocutors)) and Assessment Results (Speech Communication Quality; Cognitive Load; Overall Quality of Experience), with the relations between them marked "?"]

[Skowronek]

Subject pool & Analysis method

Subject pool:

10 Female, 15 Male, Age 23 - 43

All experienced with multi-party telephone conversations

Analysis method:

Errorbar plots to visualize directions of effects, grouped along SndRepr and #IL

One-way repeated-measures ANOVAs: 12 measures as function of SndRepr and #IL

PostHoc tests (estimated marginal means) for pairwise comparisons
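The F statistic of a one-way repeated-measures ANOVA can be computed by partitioning the total sum of squares into condition, subject, and error parts. The sketch below uses invented ratings, not data from the study, and stops at the F value: the p-value (which needs the F distribution) and any sphericity correction are left out.

```python
def rm_anova_f(scores):
    """F statistic of a one-way repeated-measures ANOVA.

    scores[s][c] is the rating given by subject s under condition c.
    Returns (F, df_conditions, df_error).
    """
    n = len(scores)     # number of subjects
    k = len(scores[0])  # number of conditions
    grand_mean = sum(map(sum, scores)) / (n * k)
    cond_means = [sum(row[c] for row in scores) / n for c in range(k)]
    subj_means = [sum(row) / k for row in scores]

    ss_total = sum((x - grand_mean) ** 2 for row in scores for x in row)
    ss_cond = n * sum((m - grand_mean) ** 2 for m in cond_means)
    ss_subj = k * sum((m - grand_mean) ** 2 for m in subj_means)
    ss_error = ss_total - ss_cond - ss_subj  # subject variance is removed

    df_cond, df_error = k - 1, (k - 1) * (n - 1)
    f_value = (ss_cond / df_cond) / (ss_error / df_error)
    return f_value, df_cond, df_error


# Hypothetical ratings: 3 subjects x 3 conditions.
ratings = [[1, 2, 3], [2, 3, 4], [3, 4, 8]]
f_value, df_cond, df_error = rm_anova_f(ratings)
print(f_value, df_cond, df_error)
```

Removing the between-subject variance from the error term is what distinguishes the repeated-measures design from a between-subjects ANOVA: every subject rates every condition, so stable differences between subjects do not count against the effect.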


[Skowronek]