Live Music Performances over High-Speed IP Networks

Stefan Karapetkov

Director, Emerging Technologies

TERENA Networking Conference

Bruges, Belgium, May 20, 2008

Agenda

Manhattan School of Music

Audio-Video Networks

Audio Technology
  Voice-specific Codec Functions
  Adjustments for Live Music Mode

Video Technology

Transmission Technology

Live Music Mode Demo

MSM Testimonial

Audio-Video Networks Today

Video Endpoints

Conference Servers

Call Control, Management & Scheduling

Video Recording, Streaming & Content Management

Security & NAT/FW Traversal

IM/Presence and IP-PBX Integration

H.323 Architecture

Components: User Database, Gatekeeper, Terminal A, Terminal B, IP Network

Call setup sequence:
  1) H.225 SETUP
  2) H.225 SETUP
  3) H.225 CONNECT
  4) H.225 CONNECT
  5) H.245 CAPS, MS
  6) H.245 CAPS, MS
  7) H.245 CAPS, MS
  8) H.245 CAPS, MS
  9) H.245 OLC
  10) H.245 OLC

Media: RTP/RTCP Stream

SIP Architecture

Components: SIP Redirect Server, User Database, SIP Proxy, Registrar, SIP User Agent A, SIP User Agent B, IP Network

Call setup sequence:
  1) INVITE userB@home.com
  2) 302 Moved Temporarily
  3) INVITE [email protected]
  4) INVITE [email protected]
  5) 200 OK
  6) 200 OK
  7) ACK
  8) ACK

Media: RTP/RTCP Stream
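The redirect step in the SIP call flow can be illustrated with a small simulation. This is only a sketch under stated assumptions: the location table, the address `sip:userB@visited.example`, and the function names are hypothetical, and the proxy hop from the slide is collapsed, so each INVITE, 200 OK and ACK appears once instead of twice.

```python
# Hypothetical location table, standing in for the redirect server's user database.
LOCATIONS = {"sip:userB@home.com": "sip:userB@visited.example"}

def lookup_redirect(uri):
    """Return the new contact for a moved user, or None if the user is at this URI."""
    return LOCATIONS.get(uri)

def place_call(uri, max_redirects=3):
    """Return the list of SIP messages exchanged while following redirects."""
    trace = []
    for _ in range(max_redirects + 1):
        trace.append(f"INVITE {uri}")
        new_uri = lookup_redirect(uri)
        if new_uri is None:
            # No redirect: assume User Agent B answers at this address.
            trace += ["200 OK", "ACK"]
            return trace
        trace.append("302 Moved Temporarily")
        uri = new_uri  # retry the INVITE at the contact from the 302 response
    raise RuntimeError("too many redirects")

for message in place_call("sip:userB@home.com"):
    print(message)
```

Once the ACK is delivered, the RTP/RTCP media stream flows directly between the two user agents.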

Audio and Video Compression

Concert Site Remote Site

Advanced Audio Compression Technology

[Chart: Audio Fidelity (Narrowband, Wideband, Super Wide) versus Data Bit Rate (4 kbps, 64 kbps, 128 kbps) for G.711, G.728, G.729A, AMR-NB, G.722, G.722.1, G.722.2, G.722.1C, Siren 14 stereo, and Siren 22 stereo]

Siren™22 Stereo Codec Highlights

Siren™22
  Optimized for low latency: 40 ms
  Frequency band: 22 kHz
  Stereo
  Low complexity: 15 MIPS
  Low bit rate: max. 128 kbps

MP3
  High latency: 54-81 ms
  Frequency band: 18 kHz
  Stereo
  High complexity: 100 MIPS
  Optimized for storage: bit rates > 128 kbps

Siren™22 on the Road to Standardization

ITU-T G.719 full-band codec approved in May 2008
  Based on Polycom Siren™22 and Ericsson’s advanced audio
  G.719 number for higher visibility

ITU-T cited the strong and increasing demand for audio coding providing the full human auditory bandwidth
  Conferencing systems are increasingly used for more elaborate presentations, often including music and sound effects
  In today’s multimedia presentations, playback of audio and video from DVDs and PCs is becoming a common practice
  New Telepresence systems provide High Definition video and audio quality to the user, and require high-quality media delivery to create the immersive experience

Extending the quality of remote meetings helps reduce travel, which in turn reduces greenhouse gas emissions and limits climate change.

Automatic Gain Control (AGC)

Activated by speech and music

Ignores white noise: e.g. if a fan is running nearby, AGC will not ramp up the gain based on the fan noise

AGC destroys the natural dynamic range
  If the music is loud, AGC decreases the volume
  If the music is quiet, AGC increases the volume

Therefore, AGC must be completely disabled in a codec used for live music
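The effect of AGC on dynamic range can be shown with a toy example. This is a minimal sketch, not the codec's actual AGC: it assumes a block-based gain that pulls every block to the same RMS level, whereas real AGCs use attack and release time constants. The function names, amplitudes and target level are hypothetical.

```python
import math

def rms(block):
    """Root-mean-square level of a block of samples."""
    return math.sqrt(sum(x * x for x in block) / len(block))

def simple_agc(blocks, target_rms=0.1):
    """Toy AGC: scale each block so that its RMS level equals target_rms."""
    out = []
    for block in blocks:
        level = rms(block)
        gain = target_rms / level if level > 0 else 1.0
        out.append([x * gain for x in block])
    return out

def tone(amplitude, n=480, freq=440.0, rate=48000):
    """A short sine tone with the given peak amplitude."""
    return [amplitude * math.sin(2 * math.pi * freq * i / rate) for i in range(n)]

def level_difference_db(loud, quiet):
    return 20 * math.log10(rms(loud) / rms(quiet))

quiet_passage, loud_passage = tone(0.01), tone(0.5)
before_db = level_difference_db(loud_passage, quiet_passage)

processed = simple_agc([quiet_passage, loud_passage])
after_db = level_difference_db(processed[1], processed[0])

print(f"dynamic range before AGC: {before_db:.1f} dB")
print(f"dynamic range after AGC:  {after_db:.1f} dB")
```

The quiet and loud passages start about 34 dB apart and leave the toy AGC at the same level, which is exactly the loss of musical dynamics described above.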

Stereo Acoustic Echo Cancellation (AEC)

50-22,000 Hz operating range

Adaptive filter length of 260 ms
  This is the maximum echo delay that the canceller can compensate for
  It covers the room response, which includes many audio wave reflections

No learning sequence needed
  Algorithm trains quickly on speech
  No need to send out white noise to train it

Stereo echo canceller identifies the multiple echo paths of the stereo loudspeakers

Adapts to a moved microphone within two words of speech
  Moving the microphone changes the echo path, and the adaptive filter has to learn the new path
  Echo comes back for a short time (1-2 words); then the canceller adjusts
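An adaptive echo canceller of this kind can be sketched with a textbook normalized LMS (NLMS) filter. This is an illustration only, not the product's algorithm: it is mono rather than stereo, the 8-tap room response is made up, and the 48 kHz sampling rate used for the tap count is an assumption.

```python
import math
import random

def taps_for(filter_ms, sample_rate_hz):
    """Number of filter taps needed to span filter_ms of echo tail."""
    return round(filter_ms * sample_rate_hz / 1000)

def nlms_cancel(far_end, mic, num_taps, mu=0.5, eps=1e-8):
    """Subtract the estimated echo of far_end from mic; return the residual signal."""
    weights = [0.0] * num_taps
    history = [0.0] * num_taps          # most recent far-end samples, newest first
    residual = []
    for x, d in zip(far_end, mic):
        history = [x] + history[:-1]
        estimate = sum(w * h for w, h in zip(weights, history))
        error = d - estimate            # what is left after echo removal
        norm = sum(h * h for h in history) + eps
        weights = [w + mu * error * h / norm for w, h in zip(weights, history)]
        residual.append(error)
    return residual

def energy(samples):
    return sum(s * s for s in samples)

random.seed(0)
room = [0.6, 0.0, -0.3, 0.2, 0.0, 0.1, -0.05, 0.02]   # hypothetical echo path
far_end = [random.uniform(-1.0, 1.0) for _ in range(2000)]
mic = [sum(room[k] * far_end[n - k] for k in range(len(room)) if n - k >= 0)
       for n in range(len(far_end))]

residual = nlms_cancel(far_end, mic, num_taps=len(room))
erle_db = 10 * math.log10(energy(mic[-200:]) / (energy(residual[-200:]) + 1e-30))

print("taps for 260 ms at an assumed 48 kHz:", taps_for(260, 48000))
print(f"echo reduction over the last 200 samples: {erle_db:.0f} dB")
```

If the `room` list changed mid-run, the residual would rise briefly and then fall again as the filter re-converged, which corresponds to the 1-2 words of returning echo after a microphone is moved.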

Stereo AEC in Live Music Mode (LMM)

Standard AEC leads to audio artifacts: low notes can be cut

The main complaint from MSM is that a sustained note (e.g. with the piano's sustain pedal pressed) cannot be heard all the way to its end; it should remain audible even when it is just 1 dB over the noise floor

AEC settings in LMM prevent very quiet musical sounds from being cut out

LMM assumes a quiet environment without background noise

We lowered the thresholds for signal detection, making detection more aggressive
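A short calculation shows why the detection threshold matters for a decaying note. It is only a sketch: the 1 dB margin comes from the slide, while the start level, decay rate, noise floor and the 12 dB "standard" margin are hypothetical values chosen for illustration.

```python
def audible_seconds(start_db, decay_db_per_s, noise_floor_db, margin_db):
    """Seconds until a note decaying at a constant dB rate falls below the gate."""
    gate_db = noise_floor_db + margin_db   # signal below this level is treated as noise
    return max(start_db - gate_db, 0.0) / decay_db_per_s

# Hypothetical piano note: starts at -20 dBFS, decays 10 dB/s, noise floor at -70 dBFS.
standard = audible_seconds(-20.0, 10.0, -70.0, margin_db=12.0)  # assumed standard margin
lmm = audible_seconds(-20.0, 10.0, -70.0, margin_db=1.0)        # 1 dB over the noise floor

print(f"standard threshold: note passes for {standard:.1f} s")
print(f"LMM threshold:      note passes for {lmm:.1f} s")
```

With these numbers the lower threshold keeps the tail of the note for about a second longer, at the cost of passing more background noise, which is why a quiet room is assumed.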

Installed Audio

Definition: rack-mounted systems that process all the audio in a conference room or large meeting room

[Diagram labels: Microphones, Speakers, Video System, DVD, Telephony, SoundStructure]

Interworking: Installed Audio & Video Endpoints

SoundStructure adds 8/12/16 additional inputs/outputs

Digital connectivity with Polycom Video Endpoints
  Fully digital audio for better quality
  Bi-directional stereo between SoundStructure and HDX
  Full 22 kHz stereo AEC compatible with Siren 22 audio codec
  Shared mute and volume control
  Auto-discovery between the devices: automatic configuration

SoundStructure

HDX

Advanced Video Technology: High Definition

[Chart: Quality versus Bandwidth (384 kbps, 512 kbps, 1 Mbps, 6 Mbps) for the resolutions 352x288, 704x480 (480p), and 1280x720 (720p)]

Advanced Video Technology: Camera Control

Resolution 1280x720p

50/60 fps

Aspect ratio 16:9

Pan +/- 100°

Tilt +20° to -30°

12x optical zoom

Advanced Video Technology: Far End Camera Control (FECC)

In H.323, FECC uses H.281 (binary data) over H.224 (frames)

RFC 4573, MIME Type Registration for RTP Payload Format for H.224

Advanced Video Technology: Multiple Streams

‘Live’ Stream

‘Presentation’ Stream

ITU-T Recommendation H.239

RFC 4796, SDP Content Attribute

RFC 4574, The SDP Label Attribute

RFC 3388, Grouping of Media Lines in SDP

RFC 4582, The Binary Floor Control Protocol (BFCP)

RFC 4583, SDP Format for BFCP Streams

draft-even-xcon-pnc-01, Role Mgmt & Multiple Streams
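On the SIP side, the two streams can be told apart in SDP with the `content` attribute (RFC 4796) and the `label` attribute (RFC 4574). The sketch below builds only the media sections of an offer. Mapping the 'Live' stream to `main` and the 'Presentation' stream to `slides` is an assumption, as are the ports, payload types, the H.264 codec and the function names.

```python
def video_section(port, payload_type, content, label):
    """SDP lines for one video stream, tagged with its role and a label."""
    return [
        f"m=video {port} RTP/AVP {payload_type}",
        f"a=rtpmap:{payload_type} H264/90000",
        f"a=content:{content}",   # RFC 4796: role of the stream
        f"a=label:{label}",       # RFC 4574: handle for referring to the stream
    ]

def two_stream_media():
    """Media sections for a 'Live' stream and a 'Presentation' stream."""
    lines = (video_section(49170, 96, "main", "1")
             + video_section(49172, 97, "slides", "2"))
    return "\r\n".join(lines) + "\r\n"

print(two_stream_media())
```

A complete offer would also carry the session-level lines (`v=`, `o=`, `s=`, `c=`, `t=`) and, where floor control is used, a BFCP media section as described in RFC 4583.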

Transmission Technology

SIP Domain

H.323 Domain

Audio Precedence in Codec Negotiation

Audio

Video

High priority

Bandwidth (kbps)   Standard Setting            LMM Setting
> 1024             Siren22 Stereo, 128 kbps    Siren22 Stereo, 128 kbps
768 - 1024         Siren22 Stereo, 96 kbps     Siren22 Stereo, 128 kbps
512 - 768          Siren22 Stereo, 96 kbps     Siren22 Stereo, 128 kbps
384 - 512          Siren22 Stereo, 96 kbps     Siren22 Stereo, 128 kbps
256 - 384          Siren14 Stereo, 48 kbps     Siren14 Stereo, 48 kbps
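The table can be read as a simple lookup from call bandwidth to audio codec. The sketch below is one possible encoding. The function name is hypothetical, and so is the handling of the boundaries: a rate of exactly 384 kbps is assumed to fall in the 384 - 512 row, exactly 1024 kbps in the 768 - 1024 row, and rates below 256 kbps are not covered by the table.

```python
def select_audio(call_rate_kbps, live_music_mode=False):
    """Return (codec, audio bit rate in kbps) for a call rate, or None if not in the table."""
    if call_rate_kbps > 1024:
        return ("Siren22 Stereo", 128)
    if call_rate_kbps >= 384:
        # LMM keeps the full 128 kbps for audio; the standard setting drops to 96 kbps.
        return ("Siren22 Stereo", 128 if live_music_mode else 96)
    if call_rate_kbps >= 256:
        return ("Siren14 Stereo", 48)
    return None  # the slide gives no entry below 256 kbps

for rate in (2048, 768, 384, 300):
    print(rate, select_audio(rate), select_audio(rate, live_music_mode=True))
```

In the range from 384 to 1024 kbps, LMM gives audio the extra 32 kbps and video gets correspondingly less of the call bandwidth.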

Keeping Quality Up in Transmission

Video Error Concealment (PVEC)

IP Network

Lost Packet Recovery (LPR)

Video

Audio Video

Audio Video

LPR Definitions

LPR is a new method of error concealment for packet-based networks, based on Forward Error Correction (FEC)

LPR constantly adjusts the video bit rate to reduce the amount of loss in a packet-based network
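The idea behind forward error correction can be shown with the simplest textbook scheme: one XOR parity packet per group, which lets the receiver rebuild any single lost packet without retransmission. This is a generic illustration, not the coding that LPR actually uses. The function names are hypothetical, and equal-length packets are assumed.

```python
def xor_bytes(a, b):
    """Byte-wise XOR of two equal-length packets."""
    return bytes(x ^ y for x, y in zip(a, b))

def make_parity(packets):
    """Recovery packet: XOR of all media packets in the group."""
    parity = bytes(len(packets[0]))
    for packet in packets:
        parity = xor_bytes(parity, packet)
    return parity

def recover(received, parity):
    """Rebuild the single missing packet (marked None) from the rest plus parity."""
    missing = [i for i, packet in enumerate(received) if packet is None]
    if len(missing) != 1:
        raise ValueError("XOR parity can rebuild exactly one lost packet per group")
    rebuilt = parity
    for packet in received:
        if packet is not None:
            rebuilt = xor_bytes(rebuilt, packet)
    return missing[0], rebuilt

group = [b"AAAA", b"BBBB", b"CCCC"]
parity = make_parity(group)
index, packet = recover([group[0], None, group[2]], parity)
print(index, packet)
```

The recovery packet costs extra bandwidth, so under FEC the video bit rate has to drop to make room for it.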

Lost Packet Recovery (LPR)

[Block diagram labels: Video Encoder, LPR Packetizer, LPR Recovery Packet Generator, Encryption, RTP Sender, LPR DBA Mode Decision, RTCP, RTP Reordering Buffer, LPR Recovery, LPR Regeneration, Decryption, Video Decoder]

LPR DBA Example

[Chart: video bit rate as a percentage of full bandwidth, stepping at intervals of X ms (one interval is marked Y ms) through a down-speeding phase and an up-speeding phase. Values shown: 100%, 74%, 58%, 64%, 70%, 77%, 72%. Annotations: Full Bandwidth; Packet loss 25%, FEC on, bit rate drop 26%; Packet loss 15%, bit rate drop 16%; No packet loss, FEC off, bit rate increase e.g. 10%; Packet loss 4%, FEC on, bit rate drop 5%]
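The numbers on this chart are consistent with a simple rule: under packet loss, drop the bit rate by the loss percentage plus one percentage point; with no loss, raise it by about 10%. That rule is only inferred from the annotated values, not a documented LPR algorithm. The function name and the rounding are assumptions.

```python
def next_rate(rate_pct, loss_pct, up_factor=1.10):
    """One adjustment step for the bit rate, in percent of full bandwidth."""
    if loss_pct > 0:
        # Down speeding: e.g. 25% loss -> drop of 26 percentage points.
        return max(rate_pct - (loss_pct + 1), 0)
    # Up speeding: no loss, so probe for more bandwidth (e.g. +10%).
    return min(round(rate_pct * up_factor), 100)

rate = 100
trace = []
for loss in (25, 15, 0, 0, 0, 4):   # packet loss measured in successive intervals
    rate = next_rate(rate, loss)
    trace.append(rate)

print(trace)
```

Starting from full bandwidth, this reproduces the sequence 74, 58, 64, 70, 77, 72 that appears on the chart.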

Technology Summary

Flexible Networking – H.323 and SIP

Advanced Audio Technologies
  Audio Compression
  Automatic Gain Control (AGC)
  Automatic Noise Suppression (ANS) and Noise Fill
  Stereo Acoustic Echo Cancellation (AEC)

Advanced Video Technology
  High Definition
  Camera Control
  Multiple Streams

Advanced Transmission Technology
  Lost Packet Recovery (LPR)

Live Music Mode Demo

[email protected]