
Realtime Multimedia Transport using Multiple Paths

D I S S E R T A T I O N

for the Degree of

Doctor of Philosophy (Electrical Engineering)

Shiwen Mao

January 2004


REALTIME MULTIMEDIA TRANSPORT USING MULTIPLE PATHS

D I S S E R T A T I O N

Submitted in Partial Fulfillment

of the Requirements for the

Degree of

DOCTOR OF PHILOSOPHY (Electrical Engineering)

at the

POLYTECHNIC UNIVERSITY

by

Shiwen Mao

January 2004

Approved:

Department Head

Date

Copy No.


Approved by the Guidance Committee:

Major: Electrical Engineering

Shivendra S. Panwar, Professor of Electrical and Computer Engineering

Date

Yao Wang, Professor of Electrical and Computer Engineering

Date

David J. Goodman, Professor of Electrical and Computer Engineering

Date

Minor: Computer Science

Keith W. Ross, Professor of Computer and Information Science

Date


Microfilm or other copies of this dissertation are obtainable from:

UMI Dissertations Publishing

Bell & Howell Information and Learning

300 North Zeeb Road

P.O. Box 1346

Ann Arbor, Michigan 48106-1346


VITA

Shiwen Mao received the B.S. degree in Electrical Engineering and the B.E.

degree in Enterprise Management from Tsinghua University, Beijing, P.R. China in

1994. He received the M.S. degree in Electrical Engineering from Tsinghua University

in 1997 and the M.S. degree in System Engineering from Polytechnic University,

Brooklyn, NY, in 2000. He is currently working toward the Ph.D. degree in Electrical

Engineering at the Polytechnic University.

He was a research assistant at the State Key Lab on Microwave and Digital Communications, Beijing, P. R. China, from 1994 to 1995, working on SDH/SONET systems and ASIC design. He was a Research Member of the IBM China Research Lab, Beijing, P. R. China, from 1997 to 1998, performing research on Java performance and web testing. In the summer of 2001, he was a research intern at Avaya Labs Research, Holmdel, NJ. His research interests include multimedia transport in

the Internet and wireless networks, performance of wireless ad hoc or sensor networks,

queueing theory, and performance analysis.

Mr. Mao is a student member of IEEE, a student member of SIAM, and a

member of Tau Beta Pi.


To my wife Yihan, my son Eric, my parents, and my parents-in-law

for their love and support.


ACKNOWLEDGEMENT

First, I would like to thank my thesis advisor, Prof. Shivendra S. Panwar,

who led me into this exciting area of networking research and worked with me closely

in the last five years, for his invaluable support and guidance throughout the course

of this dissertation. His extensive knowledge, enlightening direction, and continuous

encouragement made my thesis work smooth, positive, and enjoyable.

I am also indebted to other members of my dissertation committee. Prof.

Wang has closely guided my research during most of this thesis work. I have also benefited greatly from constructive discussions with Prof. Ross and Prof. Goodman, and from following their excellent research work.

I want to acknowledge my fellow students and friends at Poly, including Dr.

George D. Lapiotis, Dr. Rajashi Roy, Prof. Thomas Y. Hou, Dr. Jeong-Tae Song,

Dr. Chaiwat Oottamakorn, Prof. Roberto Rojas-Cessa, Dr. Shunan Lin, Yuetang

Deng, Jeff Tao, Tao Li, Xuan Zheng, Yanming Shen, Pei Liu, Rakesh Kumar, Dr. Liji

Wu, Dr. Dennis Bushmitch, Sathya Narayanan, and many others. I thank them for

the discussions, cooperation, and assistance during these years. In addition, I want

to thank Mrs. Panwar for her hospitality; I really enjoyed her Thanksgiving parties. I would like to thank Prof. Malathi Veeraraghavan and Dr. John Zhao for their support

and inspiration in maintaining the EL537 lab and teaching the EL537 course, which

made the teaching experience as enjoyable as the research work. I am also very

grateful to Prof. Robert Boorstyn for teaching me queueing theory, Dr. Mark Karol

for kindly being my mentor in the summer of 2001, and Prof. Bhaskar Sengupta for

constructive comments on my GPS analysis work.

Finally, I want to thank my dear wife Yihan Li, my son Eric, my parents,

my parents-in-law, and my uncle Denial. Without their care and support, the achievements of this thesis would not have been possible.

This work was supported by the National Science Foundation under Grant ANI

0081375, the New York State Center for Advanced Technology in Telecommunications

(CATT) and the Wireless Internet Center for Advanced Technology (WICAT) at

Polytechnic University, Brooklyn, NY, USA.


AN ABSTRACT

REALTIME MULTIMEDIA TRANSPORT USING MULTIPLE PATHS

by

Shiwen Mao

Advisor: Shivendra S. Panwar

Submitted in Partial Fulfillment of the Requirements

for the Degree of Doctor of Philosophy (Electrical Engineering)

January 2004

Realtime multimedia transport usually has stringent bandwidth, delay, and

loss requirements, which are not supported in the current best-effort Internet. Video packets may be dropped due to congestion in the network, or due to the frequent link failures typical of ad hoc networks. However, the mesh topology of these networks implies the

existence of multiple paths between two nodes. Multipath transport provides an extra

degree of freedom in designing error resilient video coding and transport schemes.

We propose to combine multistream coding with multipath transport for

video transport over ad hoc networks. We studied the performance of three schemes,

two proposed in our work, and one chosen from previous work, via extensive simu-

lations using Markov channel models and OPNET. We also implemented an ad

hoc multipath video streaming testbed to further validate the advantages of these

schemes. The results show that great improvement in video quality can be achieved

over the standard schemes with limited additional cost.

We also present an analytical framework on optimal traffic partitioning for

multipath transport. We formulated a constrained optimization problem using deter-

ministic network calculus theory, and derived its closed form solution. Compared with


previous work, our scheme is more realistic and easier to implement. Depending

on the system parameters, we can either achieve a minimum end-to-end delay equal

to the maximum fixed delay of the paths, or equalize the delays of all the paths.

We design a new protocol, called the Multi-flow Realtime Transport Pro-

tocol (MRTP), to support realtime multimedia transport using multiple paths.

MRTP is a natural extension of the RTP/RTCP protocol for multiple paths, and is

complementary to SCTP in that it supports multimedia services. We present two

performance studies to demonstrate the benefits of using MRTP.

Finally, we study a fundamental scheduling problem using the Generalized

Processor Sharing (GPS) discipline. We derived the effective bandwidth of a class,

and designed an admission control test. We also presented a tight service bound,

resulting in higher bandwidth utilization than previous work, and extended

previous work on Matrix Analytic Methods for stochastic flows to GPS analysis.


Contents

List of Figures xii

List of Tables xvi

1 Introduction ... 1
1.1 Realtime Multimedia Transport ... 1
1.2 The General Architecture ... 3
1.2.1 The General Architecture ... 3
1.2.2 Multistream Video Coding ... 4
1.2.3 Multipath Transport ... 6
1.3 Key Contributions ... 8
1.4 Dissertation Outline ... 10

2 Multipath Video Transport over Ad Hoc Networks ... 12
2.1 Motivation ... 12
2.2 Related Work ... 15
2.3 Proposed Video Transport Schemes ... 17
2.3.1 Feedback Based Reference Picture Selection ... 18
2.3.2 Layered Coding with Selective ARQ ... 20
2.3.3 Multiple Description Motion Compensation ... 22
2.3.4 Comparison and Discussion ... 23
2.4 Performance Study using Markov Models ... 24
2.4.1 The Video Codec Implementations and Parameters ... 25
2.4.2 Modeling of Ad Hoc Routes using Markov Models ... 26
2.4.3 Simulation Results using Markov Channel Models ... 27
2.5 Performance Study using OPNET Models ... 31
2.5.1 Multipath Routing using Dynamic Source Routing ... 32
2.5.2 OPNET Simulation Setting ... 33
2.5.3 Simulation Results using OPNET Models ... 34
2.6 An Ad Hoc Multipath Video Testbed ... 43
2.6.1 The Setup of the Testbed ... 43
2.6.2 Experimental Results ... 44
2.7 Summary ... 47


3 Optimal Traffic Partitioning ... 49
3.1 Introduction ... 49
3.2 Problem Formulation ... 51
3.3 Optimal Partitioning with Two Paths ... 54
3.3.1 Optimal Partitioning Based on the Busy Period Bound ... 54
3.3.2 The Optimal Partitioning with FCFS Queues ... 58
3.4 The Optimal Partitioning with Multiple Paths ... 60
3.5 Practical Implications ... 67
3.5.1 Optimal Path Selection ... 67
3.5.2 Enforcing the Optimal Partition ... 67
3.6 Numerical Results ... 69
3.7 Related Work ... 72
3.8 Summary ... 77
Appendix A: Proof of Theorem 1 ... 78
Appendix B: Proof of Theorem 2 ... 85

4 The Multi-flow Realtime Transport Protocol ... 88
4.1 Motivation ... 88
4.2 Background and Related Work ... 91
4.2.1 Traffic Partitioning ... 91
4.2.2 Multi-stream Coding and Multipath Transport ... 93
4.3 The Multiflow Realtime Transport Protocol ... 94
4.3.1 MRTP Overview ... 94
4.3.2 Definitions ... 95
4.3.3 Packet Formats ... 98
4.3.4 The Operations of MRTP/MRTCP ... 106
4.3.5 Usage Scenarios ... 109
4.4 MRTP Performance Studies ... 112
4.4.1 The Impact of Traffic Partitioning ... 112
4.4.2 Video Transport over Ad Hoc Networks ... 116
4.5 Summary ... 120

5 Analyzing a Generalized Processor Sharing System ... 123
5.1 Introduction ... 123
5.2 The System Model and Problem Statement ... 128
5.3 Preliminaries ... 129
5.3.1 A Fluid Queue with MMFP Sources ... 129
5.3.2 Output Characterization ... 131
5.3.3 The LZT Service Bound ... 132
5.3.4 The Chernoff-Dominant Eigenvalue Approximation ... 133
5.4 The Effective Bandwidth of a MMFP Class ... 134
5.4.1 Transforming the Decoupled System ... 134


5.4.2 The Effective Bandwidth of a MMFP Class ... 135
5.4.3 Numerical Investigations ... 137
5.5 A Tighter Service Bound ... 140
5.5.1 The LMP Bound ... 140
5.5.2 Numerical Investigations ... 143
5.6 Matrix Analytic Methods for GPS Analysis ... 144
5.6.1 Matrix Analytical Methods for Fluid Flow Analysis ... 145
5.6.2 Computing the Rate Matrix ... 147
5.6.3 Caudal Characteristics of Fluid Queues ... 148
5.6.4 Numerical Investigations ... 149
5.7 Summary ... 153

6 Summary and Future Work ... 155
6.1 Summary ... 155
6.2 Future Work ... 157

Bibliography ... 159

List of Publications ... 177

Acronyms ... 180


List of Figures

1.1 The general architecture of using multiple paths for video transport. ... 4

2.1 Illustration of the RPS scheme. The arrow associated with each frame indicates the reference used in coding that frame. ... 19
2.2 A two-path layered video transmission model with end-to-end ARQ for BL packets. ... 21
2.3 Illustration of the MDMC encoder. ... 23
2.4 Average PSNRs of the three schemes with asymmetric paths: Path 1's loss rate is fixed at 12%, and path 2's loss rate varies from 0.1% to 10%. ... 28
2.5 Average PSNRs of the three schemes with asymmetric paths: Path 1's loss rate is twice that of path 2. ... 29
2.6 Average PSNRs of the three schemes with symmetric paths: The mean burst length is fixed at 4 packets, while the loss rate varies. ... 30
2.7 Average PSNRs of the three schemes with symmetric paths: The loss rates are fixed at 10%, while the mean burst length varies. ... 31
2.8 The MDSR route updating algorithm. ... 33
2.9 Simulation results of 16 nodes moving in a 600m × 600m region at a speed of 10m/s. Plotted are the traces of two routes to the video sink maintained by the video source during the simulation. ... 35
2.10 A zoom-in plot of Fig. 2.9. ... 36
2.11 The PSNRs of the received frames with a MDMC codec using two paths. 16 nodes move in a 600m × 600m region at a speed of 10m/s. Plotted on the right y axis are the lost packets per frame. The MDSR algorithm is used for route updates. The measured average loss rates of the two substreams are: (3.0%, 3.1%). ... 37
2.12 The PSNRs of the received frames with a MDMC codec using a single path. 16 nodes move in a 600m × 600m region at a speed of 10m/s. Plotted on the right y axis are the lost packets per frame. The path is updated using the NIST DSR model and both substreams are sent on the path using an interleaving interval of 2 frames. The measured average loss rates of the two substreams are: (6.3%, 6.4%). ... 38


2.13 Zoomed lost packet per frame traces of two substreams: (a) The MPT case (corresponding to Fig. 2.11); (b) The SPT case (corresponding to Fig. 2.12). ... 39
2.14 The PSNRs of the received frames with a MDMC codec. 16 nodes in a 600m × 600m region. Plotted on the right y axis are the lost packets per frame. The nodes are stationary. ... 40
2.15 Loss characteristics vs. mobile speed for both MPT and SPT OPNET simulations: (a) Average packet loss rate; (b) Average error burst length. ... 41
2.16 The average PSNR vs. node speed for the MDMC scheme from the OPNET simulations. ... 42
2.17 Experiment scenarios for the testbed: (a) Line-of-sight; (b) Behind the walls. ... 43
2.18 A screenshot of the testbed GUI during a MDMC experiment. ... 48

3.1 A traffic partitioning model with two paths. ... 52
3.2 A deterministic traffic partitioning scheme. ... 52
3.3 Determining the end-to-end delay D_l. ... 56
3.4 Illustration of a tighter delay bound. ... 59
3.5 Three regions determined by the system parameters. ... 60
3.6 A traffic partitioning model with N paths. ... 61
3.7 Problem P(N, σ): The case of σ ≤ σ_th^N. ... 63
3.8 Problem P(N, σ): The case of σ > σ_th^N. ... 63
3.9 Problem P(N−1, σ): The case of σ ≤ σ_th^(N−1). ... 65
3.10 Problem P(N−1, σ): The case of σ > σ_th^(N−1). ... 65
3.11 Computing the optimal partition. ... 66
3.12 Implementation of the optimal traffic partitioning scheme. ... 68
3.13 The minimum end-to-end delay, D_l^*: two paths, f1 = 1, f2 = 3, c1 = 2, c2 = 1. ... 70
3.14 The optimal burst assignment for path 1, σ_1^*: two paths, f1 = 1, f2 = 3, c1 = 2, c2 = 1. ... 71
3.15 The minimum end-to-end delay, D_l^*: two paths, f1 = 1, f2 = 3, c1 = 2, c2 = 1. ... 72
3.16 The difference between the two minimum end-to-end delays: two paths, f1 = 1, f2 = 3, c1 = 2, c2 = 1. ... 73
3.17 The minimum end-to-end delay D_l^*: five paths, f1 = 1, f2 = 2, f3 = 3, f4 = 4, f5 = 5, c1 = 1, c2 = 1.5, c3 = 2, c4 = 2.5, c5 = 3. ... 74
3.18 The minimum end-to-end delay D_l^*: five paths, f1 = 1, f2 = 2, f3 = 3, f4 = 4, f5 = 5, c1 = 3, c2 = 2.5, c3 = 2, c4 = 1.5, c5 = 1. ... 75
3.19 The optimal burst assignment for path 1, σ_1^*: five paths, f1 = 1, f2 = 2, f3 = 3, f4 = 4, f5 = 5, c1 = 1, c2 = 1.5, c3 = 2, c4 = 2.5, c5 = 3. ... 76
3.20 The optimal burst assignment for path 5, σ_5^*: five paths, f1 = 1, f2 = 2, f3 = 3, f4 = 4, f5 = 5, c1 = 1, c2 = 1.5, c3 = 2, c4 = 2.5, c5 = 3. ... 77


3.21 The optimal rate assignment for path 1, ρ_1^*: five paths, f1 = 1, f2 = 2, f3 = 3, f4 = 4, f5 = 5, c1 = 1, c2 = 1.5, c3 = 2, c4 = 2.5, c5 = 3. ... 78
3.22 The optimal rate assignment for path 5, ρ_5^*: five paths, f1 = 1, f2 = 2, f3 = 3, f4 = 4, f5 = 5, c1 = 1, c2 = 1.5, c3 = 2, c4 = 2.5, c5 = 3. ... 79
3.23 The optimal burst assignment σ_1^*: five paths, f1 = 1, f2 = 2, f3 = 3, f4 = 4, f5 = 5, c1 = 3, c2 = 2.5, c3 = 2, c4 = 1.5, c5 = 1. ... 80
3.24 The optimal rate assignment ρ_1^*: five paths, f1 = 1, f2 = 2, f3 = 3, f4 = 4, f5 = 5, c1 = 3, c2 = 2.5, c3 = 2, c4 = 1.5, c5 = 1. ... 81
3.25 The highest index of the paths used: five paths, f1 = 1, f2 = 2, f3 = 3, f4 = 4, f5 = 5, c1 = 1, c2 = 1.5, c3 = 2, c4 = 2.5, c5 = 3. ... 82
3.26 The highest index of the paths used: five paths, f1 = 1, f2 = 2, f3 = 3, f4 = 4, f5 = 5, c1 = 3, c2 = 2.5, c3 = 2, c4 = 1.5, c5 = 1. ... 83
3.27 Different {σ, ρ} assignments give the same delay bound d. ... 84
3.28 The delay curves with different system parameters. ... 85
3.29 The D_1^min curve and its relationship with f2. ... 86
3.30 The delay curves with different system parameters and the tighter delay bound. ... 86

4.1 A video thinning example. ... 92
4.2 A video striping example. ... 93
4.3 A usage scenario of MRTP. ... 95
4.4 Positioning of MRTP/MRTCP in the TCP/IP protocol stack. ... 95
4.5 The MRTP data packet format. ... 99
4.6 The MRTP Sender Report format. ... 101
4.7 The MRTP Hello Session message format. ... 103
4.8 The MRTP Bye Session message format. ... 104
4.9 The MRTP extension header format. ... 104
4.10 A daisy-chain of MRTP extension headers. ... 105
4.11 The operation of MRTP/MRTCP. ... 106
4.12 Another two usage scenarios of MRTP, in addition to the one in Fig. 4.3. ... 110
4.13 The performance analysis model for Section 4.4.1. ... 112
4.14 Variance V(S, m) with different aggregation level m. ... 115
4.15 Variance V(S, m) with different number of flows S. ... 115
4.16 Buffer overflow probability of a queue fed by 100 video flows, with S = 1, 2, and 4, respectively. ... 117
4.17 Performance improvement ratio Γ as a function of the thinning parameter S for different buffer sizes. ... 117
4.18 The occupancies of the resequencing buffers of two flows at the receiver node. ... 119
4.19 Comparison of MRTP and RTP: all nodes are stationary during the simulation. ... 121
4.20 Comparison of MRTP and RTP: all nodes move at a speed of 6 m/s. ... 121


5.1 A network access point, where admission control is performed and the user traffic A(t) is policed to conform to the traffic specification. ... 124
5.2 The system model. ... 129
5.3 The LZT bound model. ... 132
5.4 An equivalent model to that in Fig. 5.3. ... 135
5.5 Tail distributions of the three classes, each with 10 on-off sources and c = 15.1. ... 139
5.6 Tail distributions of the three classes, each with 30 on-off sources and c = 41.4. ... 140
5.7 Tail distributions of the three classes, each with 100 on-off sources and c = 120.1. ... 141
5.8 The queue decomposition technique. ... 142
5.9 Comparative results of the LMP bound, the LZT bound, and simulations in the case of 6 classes. The tail distributions of the logical queues are plotted. ... 143
5.10 The admissible region using a segregated bandwidth allocation. ... 145
5.11 Gains in number of admissible class 3 sources using the LMP bound over the segregated system. ... 146
5.12 Tail distributions of a 3-queue GPS system. ... 151
5.13 Tail distributions of a 3-queue GPS system, where class 1 has two video sources and classes 2 and 3 have 20 voice sources each. ... 152
5.14 Caudal characteristics of the class 1 queue in a three-queue GPS system versus system load and with different GPS weights. ... 153


List of Tables

2.1 Comparison of the Three Schemes ... 25
2.2 Average PSNRs of Decoded Frames: MDMC Testbed Experiments and Markov Simulations ... 45
2.3 Average PSNRs of Decoded Frames: LC with ARQ Testbed Experiments and Markov Simulations ... 46

3.1 Definition of the Variables Used in the Analysis ... 55

5.1 An Admission Control Test Based on (5.18) ... 137
5.2 Source Parameters Used in Figures 5.5, 5.6, and 5.7 ... 138
5.3 GPS Weights of the Classes in Figures 5.5, 5.6, and 5.7 ... 138
5.4 Slopes of G(x) in Fig. 5.5 ... 138
5.5 Slope of G(x) for Class 3 and the Time Used to Compute the Tails in Figures 5.5, 5.6, and 5.7 ... 139
5.6 Source Parameters Used in Fig. 5.9 ... 143
5.7 Source Parameters Used in Fig. 5.10 and Fig. 5.11 ... 144
5.8 On-off Source Parameters Used in Figures 5.12, 5.13, and 5.14 ... 150
5.9 Video Source Parameters Used in Fig. 5.13 ... 152


Chapter 1

Introduction

1.1 Realtime Multimedia Transport

Recent advances in computing technology, compression technology, high volume, high

bandwidth storage devices, and high-speed networking have made it feasible to pro-

vide realtime multimedia services over the Internet. In such services, multimedia data

is displayed continuously at the receiver side, which requires the network transport

to deliver the multimedia data in a timely fashion. Interactive video or stored video

form the predominant part of today’s realtime multimedia data. We will focus on

video transport in this dissertation.

Due to its realtime nature, video transport usually has stringent bandwidth,

delay, and loss requirements. Even though some packet loss is generally tolerable, the

quality of reconstructed video or audio will be impaired and errors will propagate to

consecutive frames because of the dependency introduced among frames belonging

to one group of pictures at the encoder [4]. However, the current best-effort Inter-

net does not offer any quality of service (QoS) guarantees for video transport. The

Transmission Control Protocol (TCP) is mainly designed for reliable data traffic. It

is not suitable for realtime multimedia data because

• The delay and jitter caused by TCP retransmissions may be intolerable.

• TCP slow-start and congestion avoidance may not be suitable for realtime mul-

timedia transport.


• TCP does not support multicast.

Thus the User Datagram Protocol (UDP) is typically used in almost all realtime

multimedia applications. UDP only extends the best-effort, host-to-host IP service

to the process-to-process level. When congestion occurs, an unlimited number of

UDP datagrams may be dropped since UDP is non-adaptive. Realtime multimedia

applications must implement rate control and error control to cope with network

congestion.

With the recent advances in wireless technologies, wireless networks are

becoming a significant part of today’s access networks. Ad hoc networks are wireless

mobile networks without an infrastructure. Since no pre-installed base stations are

required, ad hoc networks can be deployed quickly at conventions, disaster recovery

areas, and battlefields. When deployed, mobile nodes cooperate with each other to

find routes and relay packets for each other. To support video service in the Internet, it must also be provided over wireless networks (e.g., ad hoc networks), since end users may be connected through such networks.

It is a great challenge to provide video services in ad hoc networks. A wireless

link usually has a high transmission error rate because of shadowing, fading, path loss,

and interference from other transmitting users. An end-to-end path found in ad hoc

networks has an even higher error rate since it is the concatenation of multiple wireless

links. Moreover, user mobility makes the network topology constantly change. In

addition to user mobility, ad hoc networks need to reconfigure themselves when users

join and leave the network. In ad hoc networks, an end-to-end route may only exist

for a short period of time. The frequent link failures and route changes cause packet

losses and reduce the received video quality.

Therefore, there are mainly two types of transmission losses in the network:

packet losses caused by congestion and buffer overflow, and packet losses caused by

link failures. The first type of packet loss is dominant in wireline networks, while

the second type of packet loss is more frequent in ad hoc networks. To provide

an acceptable received video quality, there should be effective error control to reduce

these two types of packet losses to a certain level. Traditional error control techniques,


including Forward Error Correction (FEC) and Automatic Repeat Request (ARQ),

must be adapted to take frequent congestion and link failures into consideration.

One common feature of the wireline and wireless ad hoc networks is that

both have a mesh topology, which implies the existence of multiple paths between two

nodes. If we use multiple paths for a video session, the video stream can be divided

into multiple substreams and each substream is sent on one of the paths. Thus the

traffic is more evenly distributed in the network and congestion is less likely to occur.

The packet loss due to congestion can therefore be greatly reduced. Furthermore, if

these paths are disjoint, the losses experienced by the substreams would be relatively

independent. Therefore, better error resilience can be achieved when traffic dispersion

is performed appropriately and with effective error control for the substreams. Indeed,

multipath transport (MPT) provides an extra degree of freedom in designing video

coding and error control schemes.

1.2 The General Architecture

1.2.1 The General Architecture

The general architecture of realtime multimedia transport using multiple paths is

given in Fig.1.1. In the architecture, a multipath routing layer sets up K paths

between the source and destination, each with a set of QoS parameters in terms of

bandwidth, delay, and loss probabilities. The transport layer continuously monitors

path QoS parameters and returns such information to the sender. Based on the

path quality information, the encoder generates M substreams. The traffic allocator

disperses packets from the substreams among the K paths. On the receiver side,

packets arriving from all the paths are put into a resequencing buffer where they are

reassembled into M substreams after a preset playout delay. Some or all the packets

assigned to a path may be lost or overdue1. Limited retransmission of lost packets

may or may not be invoked, depending on the encoding scheme and the end-to-end

delay constraint. The decoder will attempt to reconstruct a video sequence from the received substreams.

1 Overdue packets are regarded as lost.

Figure 1.1: The general architecture of using multiple paths for video transport. [The block diagram shows the video source and multi-stream encoder feeding a traffic allocator at the sender; K paths (Path 1 through Path K) through the network, set up by the multipath routing layer; and a resequencing buffer, multi-stream decoder, and display at the receiver, with end-to-end feedback returned to the sender.]

There are four essential components in this architecture, i.e., multistream

coding, multipath routing, traffic partitioning, and resequencing. A key to the success

of the proposed system is the close interaction between these components, which

entails careful cross-layer design. We will highlight this interaction in the following

discussion.
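To make the interaction between these components concrete, the following minimal Python sketch shows one way the sender-side traffic allocator and the receiver-side resequencing buffer of Fig. 1.1 could be organized. The class names, the least-loss dispatch policy, and the feedback interface are illustrative assumptions, not the mechanisms specified later in this dissertation.

    import heapq

    class TrafficAllocator:
        """Assigns packets from the M substreams to the K paths (illustrative sketch)."""
        def __init__(self, num_paths):
            self.loss_estimates = [0.0] * num_paths   # refreshed from end-to-end feedback

        def update_feedback(self, path_id, loss_rate):
            self.loss_estimates[path_id] = loss_rate

        def assign(self, packet):
            # Toy policy: send the packet on the path currently estimated to be least lossy.
            return min(range(len(self.loss_estimates)), key=lambda p: self.loss_estimates[p])

    class ResequencingBuffer:
        """Reorders packets arriving from all paths before the playout deadline."""
        def __init__(self):
            self.heap = []                             # (sequence number, packet)

        def push(self, seq, packet):
            heapq.heappush(self.heap, (seq, packet))

        def release_up_to(self, deadline_seq):
            """Release packets in order up to the playout point; later gaps count as losses."""
            released = []
            while self.heap and self.heap[0][0] <= deadline_seq:
                released.append(heapq.heappop(self.heap))
            return released

In the schemes studied in Chapter 2, the dispatch rule is typically tied to the multistream coder (e.g., one substream per path) rather than chosen packet by packet as in this toy policy.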

1.2.2 Multistream Video Coding

For MPT to be helpful for sending compressed video, the video coder must be carefully

designed to generate substreams so that the loss in one substream does not adversely

affect the decoding of other substreams. However, this relative independence between

the substreams should not be obtained at the expense of a significant decrease in

coding efficiency. Therefore, the multistream encoder should strive to achieve a good

trade-off between coding efficiency and error resilience. In addition, one must consider

what is feasible in terms of transport layer error control, when designing the source

coder.

Obviously, one way to generate multiple substreams is to use a standard

video codec and split the resulting bitstream into multiple substreams. An intelligent


splitting scheme is needed to split the bit stream at the boundary of independently

decodable units. Otherwise a lost substream will make the received ones from other

paths useless. A simple way to accomplish this is to send the frames to the paths in

a round robin manner, e.g., all odd frames are sent to path 1 and all even frames are

sent to path 2. In order to completely avoid the dependency between sub-streams,

the frames sent on one path should be predictively coded with respect to the frames

on the same path only. This method is in fact an option available in the H.263+

standard (Video Redundancy Coding (VRC)) [12]. However, compared to predicting

a frame from its immediate neighbor, VRC requires significantly higher bit rates.

Also, although this method can prevent the loss in one path from affecting frames

in the other path, error propagation still exists within frames in the same path. In

Chapter 2, we introduce a feedback based reference picture selection method, which

circumvents these two problems of VRC.
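As a concrete illustration of the round-robin splitting just described, the short sketch below assigns frames alternately to two paths and records, for each frame, the most recent frame on the same path as its prediction reference. It is only a schematic of the dependency structure; the actual VRC and RPS mechanisms are richer than this.

    def split_round_robin(num_frames, num_paths=2):
        """Assign frames to paths round-robin; predict only from frames on the same path."""
        assignment, reference = {}, {}
        last_on_path = [None] * num_paths
        for n in range(num_frames):
            path = n % num_paths
            assignment[n] = path                  # e.g., even frames on path 0, odd frames on path 1
            reference[n] = last_on_path[path]     # None for the first frame sent on a path
            last_on_path[path] = n
        return assignment, reference

    # With two paths, frame 4 is predicted from frame 2, so losing frame 3 (carried on the
    # other path) does not break the prediction chain of frames 0, 2, 4, ...
    assignment, reference = split_round_robin(6)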

Another natural way of generating multiple streams is by using layered video

coding, which is very useful in coping with the heterogeneity of user access rates,

in network link capacities, and in link reliability. A layered coder encodes video

into several layers. The base layer (BL), which includes the crucial information in

the video frames, guarantees a basic display quality. Each enhancement layer (EL)

correctly received improves the video quality. But without the BL, video frames

cannot be reconstructed. Usually, EL packets may be dropped at a congested node

to protect BL packets, and BL packets are better protected with FEC or ARQ [13].

When combined with MPT, it is desirable to transmit the BL substream on the

“best” route. The source may sort the paths according to their loss characteristics,

inferred from QoS feedback (e.g., receiver reports in the Realtime Transport Protocol

(RTP) [14]). Alternatively, the multipath routing layer may organize the route cache

according to some performance metrics (number of hops, mean loss rate in the last

time window, etc.). In Chapter 2, we consider an approach that protects the base-

layer by retransmitting lost BL packets on the path carrying the EL packets.
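A minimal sketch of this path-selection step is given below, assuming per-path loss-rate estimates are available from receiver feedback; the function name and the loss-rate metric are illustrative, and the retransmission behaviour is only indicated in a comment.

    def assign_layers(paths, loss_rate):
        """Send the base layer on the best path and the enhancement layer(s) elsewhere.

        paths     -- list of path identifiers
        loss_rate -- dict mapping each path to its estimated loss rate from QoS feedback
        """
        ranked = sorted(paths, key=lambda p: loss_rate[p])   # lowest estimated loss first
        bl_path = ranked[0]
        el_paths = ranked[1:] if len(ranked) > 1 else [bl_path]
        return bl_path, el_paths

    bl_path, el_paths = assign_layers(["path1", "path2"], {"path1": 0.12, "path2": 0.03})
    # bl_path == "path2"; lost BL packets would then be retransmitted on the EL path,
    # as in the LC with ARQ scheme of Chapter 2.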

Instead of generating substreams that are unequal in their importance, Mul-

tiple Description Coding (MDC) generates multiple equally important streams, each

giving a low but acceptable quality. A high-quality reconstruction is decodable from


all bit streams together, while a lower, but still acceptable quality reconstruction is

achievable if only one stream is received. The correlation among the substreams in-

troduced at the encoder makes it possible to partially recover lost information of one

substream, using information carried in other correctly received substreams. How-

ever, such a correlation limits the achievable coding efficiency, as compared to a

conventional coder designed to maximize it. An excellent review of the theoretical

bounds and proposed MDC algorithms can be found in [15]. In designing a MCP-

based multiple description (MD) video codec, a key challenge is how to control the

mismatch between the reference frames used in the encoder and those used in the

decoder caused by transmission errors. Among several MD video coding schemes

proposed so far [9][12][16][17], we chose the MDMC method (section 2.3.3), because

it outperformed other MD video coding methods in our previous studies [9]. With

MDC, the transport layer design can be simpler than with layered coding. Because all

the descriptions are equally important, the transport layer does not need to protect

one stream more than another. Also, because each description alone can provide a

low but acceptable quality, no retransmission is required, making MDC more suitable

for applications with stringent delay requirements.
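The receiver-side behaviour implied by these properties can be summarized per frame as follows; the mode names are generic placeholders for the central and side decoders of a two-description MD codec.

    def choose_decoding_mode(desc1_received, desc2_received):
        """Pick a decoding mode for one frame of a two-description MD stream (schematic)."""
        if desc1_received and desc2_received:
            return "central"   # both descriptions arrived: high-quality reconstruction
        if desc1_received or desc2_received:
            return "side"      # one description arrived: lower but acceptable quality
        return "conceal"       # both lost: conceal from previously decoded frames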

1.2.3 Multipath Transport

MPT has been studied in the past in wireline networks for (i) increased aggregate

capacity, (ii) better load balancing, and (iii) path redundancy for failure recovery

[19]-[21]. The research effort in this area can be roughly divided into the following

two categories:

1. Multi-path Routing, which focuses on finding multiple routes for a source-

destination pair, and on how to select a maximally disjoint set of routes from

the multiple routes found [22]-[25];

2. Traffic Dispersion, which focuses on how to allocate traffic to multiple end-to-

end routes [26][27]. Generally traffic dispersion can be performed with different

granularities. Ref. [28] is an excellent survey on this topic.


The particular communication environment of wireless ad hoc networks

makes MPT very appealing. In ad hoc networks, (i) Individual links may not have

adequate capacity to support a high bandwidth service2; (ii) A high loss rate is typ-

ical; and (iii) Links are unreliable. MPT can provide larger aggregate bandwidth

and load balancing for video applications. In addition, the path diversity inherent

in MPT can provide better error resilience performance. Furthermore, many of the

ad hoc network routing protocols, e.g., DSR [29], AODV [30], and ZRP [31], are

able to return multiple paths in response to a route query. Multipath routing can be

implemented by extending these protocols with limited additional complexity.

There are many issues that should be addressed in supporting MPT in wire-

line or ad hoc networks. First, the benefit of MPT is maximized by using a set of

maximally disjoint paths. Shared links (or nearby links in ad hoc networks) could

make the loss processes of the substreams correlated, which reduces the benefit of

using MPT [32]. Algorithms for finding disjoint paths are presented in [22][23]. Sec-

ond, finding and maintaining multiple paths requires higher complexity and may

cause additional overhead on traffic load (e.g., more route replies received). However,

caching multiple routes to any destination allows prompt reaction to route changes.

If a backup path is found in the cache, there is no need to send new route queries.

Rerouting delay and routing overhead may be reduced in this case. These problems

should be addressed carefully in designing multi-path transport protocols to balance

the benefits and costs. Third, a problem inherent in MPT is the additional delay

and complexity in packet resequencing. Previous work shows that resequencing delay

and buffer requirement are moderate if the traffic allocator in Fig.1.1 is carefully de-

signed [33][34]. We will present an analytical framework on the traffic allocator and

resequencing buffer in Chapter 3.

2 Although in some cases the nominal bandwidth of a wireless link is comparable to that of a wireline link, the available bandwidth may vary with signal strength, as in IEEE 802.11b. In addition, capacity lost due to protocol overhead in ad hoc networks is much higher than that in wireline networks (e.g., RTS, CTS, and ACK packets, and the 30-byte frame header in IEEE 802.11b).
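As an illustration of the first issue above, the sketch below greedily filters a route cache down to a set of node-disjoint routes, preferring shorter ones. This is a generic disjointness filter, not the MDSR route update algorithm described in Chapter 2.

    def select_disjoint_routes(routes, max_routes=2):
        """Greedily pick up to max_routes routes with no shared intermediate (relay) nodes.

        routes -- list of routes, each a list of node ids [src, ..., dst].
        """
        selected, used_relays = [], set()
        for route in sorted(routes, key=len):          # prefer shorter routes
            relays = set(route[1:-1])                  # exclude the common source and destination
            if relays & used_relays:
                continue                               # shares a relay with an already chosen route
            selected.append(route)
            used_relays |= relays
            if len(selected) == max_routes:
                break
        return selected

    # Example: the third route is rejected because it reuses relay node 2.
    print(select_disjoint_routes([[1, 2, 3, 9], [1, 4, 5, 9], [1, 2, 6, 9]], max_routes=3))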


1.3 Key Contributions

In this dissertation, we address the problem of enabling multimedia transport using

multiple paths. The key contributions are as follows.

First, we investigated supporting video service over wireless mobile ad hoc

networks. This is a very difficult problem because ad hoc paths are ephemeral and

video quality is susceptible to transmission losses. We proposed two novel video

coding and transport schemes based on reference picture selection and layered coding,

respectively, and chose a MDC-based scheme from previous work in our studies.

These three schemes are representative and all based on the Motion Compensated

Prediction (MCP) technique used in all the modern video coding standards. We

demonstrated methods to adapt these schemes to MPT, and studied the performance

of the three proposed schemes using Markov model simulations and OPNET

simulations. Our results show that the use of multiple paths provides a powerful

means of combating transmission errors. In addition, multistream coding provides a

novel means of traffic partitioning, where redundancy can be introduced in an efficient

and well controlled manner.

To our knowledge, this is the first work on video transport over wireless

mobile ad hoc networks using multiple paths, which presents both simulation and

experimental results in a self-contained manner. Our results show that if a feedback

channel is available, the standard H.263 coder with its Reference Picture Selection

(RPS) option can work quite well, and that if delay caused by one retransmission is

acceptable, then layered coding is more suitable. Multiple description coding is the

choice when a feedback channel is not feasible (e.g., due to large end-to-end delay) or

when the loss rates on the paths are not too high.

We extended the Dynamic Source Routing (DSR) protocol for wireless mo-

bile ad hoc networks to multiple path routing. The new protocol, called the Multi-

path Dynamic Source Routing (MDSR) protocol, can effectively maintain two or more

shortest, and maximally node-disjoint paths for a video session. MDSR has a lower

routing delay than previous work [23] since it uses a greedy route update algorithm.

We implemented MDSR using OPNET. In addition, we also implemented MDSR on


the Microsoft Windows platform with the help of two undergraduate students. These

implementations allow us to study the performance of the proposed schemes under a

realistic setting.

To further validate the feasibility, as well as to demonstrate the benefits, of

using these proposed schemes, we implemented an ad hoc multipath video streaming

testbed and performed extensive experiments. The testbed results show that video

transport over ad hoc networks is viable for both the LC with ARQ and MDMC

schemes in the settings we examined. So far as we know, the testbed we developed is

the first testbed that combines multistream video coding and MPT for video transport

in ad hoc networks.

Second, we presented an analytical framework for the traffic allocator and

the resequencing buffer. We formulated the optimal traffic partitioning problem as

a constrained optimization problem using deterministic network calculus theory, and

derived its closed form solution. Compared with previous work, our scheme is easier

to implement and enforce. We derived the closed-form solution for the multiple-path case, which also makes the path selection problem easier than with previous methods. Our results show that depending on the parameters of the paths and the

source flow, we can either achieve a minimum end-to-end delay equal to the maximum

fixed delay among all the paths, or equalize the delay of all the paths, by using the

optimal traffic partitioning. The resequencing buffer is also minimized.
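A back-of-the-envelope illustration of the two regimes follows, using the standard network-calculus delay bound for a (σ, ρ)-constrained flow served at a constant rate c behind a fixed delay f. This is a simplified stand-in for the exact formulation and optimal solution of Chapter 3; the burst/rate split in the example is arbitrary.

    def path_delay_bound(sigma_i, rho_i, f_i, c_i):
        """Standard delay bound for a (sigma_i, rho_i)-constrained subflow sent over a path
        modeled as a fixed delay f_i followed by a constant-rate server c_i (requires rho_i <= c_i)."""
        assert rho_i <= c_i, "assigned rate must not exceed the path service rate"
        return f_i + sigma_i / c_i

    # Two-path example with the path parameters used in the Chapter 3 numerical results
    # (f1 = 1, f2 = 3, c1 = 2, c2 = 1); the (sigma, rho) split below is illustrative only.
    d1 = path_delay_bound(sigma_i=4.0, rho_i=1.0, f_i=1.0, c_i=2.0)   # 3.0
    d2 = path_delay_bound(sigma_i=1.0, rho_i=0.5, f_i=3.0, c_i=1.0)   # 4.0
    end_to_end_bound = max(d1, d2)
    # Depending on the source burst and rate, the optimal split either pins the bound at the
    # largest fixed delay (here f2 = 3) or equalizes the per-path bounds.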

Third, we designed a new application layer protocol to support multimedia

transport using multiple paths. The proposed protocol, called the Multi-flow Real-

time Transport Protocol (MRTP), is a natural extension of the RTP/RTCP protocol

to multiple paths, and is complementary to Stream Control Transmission Protocol

(SCTP) in that it supports multimedia services. We present two performance stud-

ies of the proposed protocol. First we studied the effect of traffic partitioning on the

queueing performance of the multimedia flows, using the Bahadur-Rao asymptotics from large deviation theory. The results show that traffic partitioning can effectively reduce the short-term autocorrelations of the flows, thus improving their queueing per-

formance. We also compared MRTP with RTP by simulating a wireless mobile ad

hoc network with a video session using OPNET. MRTP outperforms RTP in all the


cases we examined.

Fourth, we studied a fundamental scheduling problem using the Generalized

Processor Sharing (GPS) discipline. If there is QoS support inside the network, the

proposed architecture of multimedia transport using multiple paths can have an even

better performance. We analyzed a network node modeled as a multiple class GPS

system, where each class is modeled as a Markov Modulated Fluid Process (MMFP).

We derived the effective bandwidth of a MMFP class in the GPS system, and designed

an admission control test based on the analysis. We also presented a tight service

bound, resulting in a more accurate analysis of the tail distribution of the classes and

higher bandwidth utilization than the previous work [100]. Finally, we extended the

previous work on Matrix Analytic Methods for stochastic flows to GPS analysis.
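In its simplest (segregated) form, an effective-bandwidth admission test reduces to a per-class capacity check like the sketch below, with arbitrary illustrative numbers; the function signature is an assumption, and the effective-bandwidth expression itself (Eq. (5.18) of Chapter 5) is not reproduced here. The LZT and LMP bounds studied in Chapter 5 improve on this conservative baseline by letting a class exploit capacity left unused by the other classes.

    def segregated_admission_test(class_effective_bw, gps_weights, capacity):
        """Conservative per-class admission check for a GPS server (schematic).

        class_effective_bw -- dict: class id -> aggregate effective bandwidth of its admitted sources
        gps_weights        -- dict: class id -> GPS weight phi_k (assumed normalized to sum to 1)
        capacity           -- total service rate C of the GPS server

        Each class is admitted only if its aggregate effective bandwidth fits within its
        guaranteed share phi_k * C; statistical multiplexing across classes is ignored.
        """
        return all(class_effective_bw[k] <= gps_weights[k] * capacity
                   for k in class_effective_bw)

    admissible = segregated_admission_test({1: 5.2, 2: 7.9}, {1: 0.4, 2: 0.6}, capacity=15.1)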

1.4 Dissertation Outline

The general architecture for realtime multimedia transport using multiple paths is

presented in this chapter. The rest of the dissertation is organized as follows.

We present the study on enabling video transport over wireless mobile ad

hoc networks in Chapter 2. We choose three representative video coding schemes (all

based on the MCP technique used in all the modern video coding standards), and

show how to adapt these schemes to MPT. We study the performance of the three

proposed schemes using Markov model and OPNET simulations. To further validate

the feasibility, as well as to demonstrate the benefits of using these proposed schemes,

we implement an ad hoc multipath video streaming testbed, and report the extensive

experiments performed.

In Chapter 3, we present an analytical framework on the remaining two key

components in the proposed architecture, i.e., the traffic allocator and the resequenc-

ing buffer. We model each path as a combination of a queue with a constant service

rate and a fixed delay line. We also assume the multimedia flow is regulated by a

leaky bucket, and is partitioned with a deterministic splitting scheme. With this

framework, we formulate a constrained optimization problem and derive its closed

form solution. Our results apply to the multiple paths cases, and provide an easy


means for path selection.

In Chapter 4, we present a new application layer protocol to support the

multimedia transport using multiple paths. In addition to the motivation and detailed

specifications, we also present two performance studies of the proposed protocol,

which illustrate the benefits of using the proposed protocol.

In Chapter 5, we analyze a network node (e.g., a router or a switch) using

the GPS scheduling discipline to provide QoS guarantees for multimedia flows. We

model the node as a multiple class GPS system, where each class is modeled as a

MMFP. Our results can be used in admission control, by system designers for buffer

sizing, and by network administrators for setting network node parameters.

We present our conclusions and future research directions in Chapter 6.


Chapter 2

Multipath Video Transport over Ad Hoc Networks

2.1 Motivation

Ad hoc networks are multi-hop wireless networks without a pre-installed infrastruc-

ture. They can be deployed instantly in situations where infrastructure is unavailable

(e.g., disaster recovery), or where infrastructure is difficult to install (e.g., battle-

fields). This technology is maturing as a means to provide ubiquitous untethered communication.

With the increase both in the bandwidth of wireless channels and in the computing

power of mobile devices, it is expected that video service will be offered over ad hoc

networks in the near future.

Ad hoc networks pose a great challenge to video transport. There is no

fixed infrastructure and the topology is frequently changing due to node mobility.

Therefore, links are continuously established and broken. The availability and qual-

ity of a link further fluctuates due to channel fading and interference from other

transmitting users. In addition, an end-to-end path consists of a number of wire-

less links. Thus transmission loss in ad hoc networks is more frequent than that

in wireless networks with single hop wireless paths connecting nodes to the wireline

infrastructure. The most popular Media Access Control (MAC) scheme, the Carrier

Sense Multiple Access with Collision Avoidance (CSMA/CA) scheme [1], is designed for

best-effort data. It provides no hard guarantees for a session’s bandwidth and delay.


Although bandwidth reservation is possible with MAC schemes based on Time Divi-

sion Multiple Access (TDMA) or Code Division Multiple Access (CDMA), practical

implementations of these schemes are non-trivial because of the synchronization or

code assignment problems when node mobility is allowed [2].

Video transport typically requires stringent bandwidth and delay guaran-

tees. However, it is very hard to maintain an end-to-end route which is both stable

and has enough bandwidth in an ad hoc network. Furthermore, compressed video

is susceptible to transmission errors. For example, a single bit error often causes

a loss of synchronization when Variable Length Coding (VLC) is used. Moreover,

the motion compensated prediction (MCP) technique is widely used in modern video

coding standards. In MCP, a frame is first predicted from a previous coded frame

(called reference picture) and then the prediction error is encoded and transmitted.

Although MCP achieves high coding efficiency by exploiting the temporal correlation

between adjacent frames, it makes the reconstruction of a frame depend on the suc-

cessful reconstruction of its reference picture. Without effective error protection and

concealment, a lost packet in a frame can cause not only error within this frame, but

also errors in many following frames, even when all the following frames are correctly

received [3].
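A toy simulation makes this propagation effect concrete: with MCP, a frame decodes cleanly only if it was received and its reference was decoded cleanly, so a single loss impairs every following frame until the next intra-coded refresh. The bookkeeping below is purely illustrative.

    def impaired_frames(num_frames, lost_frames, intra_period):
        """Toy model of MCP error propagation: return the set of visibly impaired frames."""
        impaired = set()
        for n in range(num_frames):
            is_intra = (n % intra_period == 0)             # periodic intra refresh
            received = n not in lost_frames
            reference_ok = is_intra or (n - 1) not in impaired
            if not (received and reference_ok):
                impaired.add(n)
        return impaired

    # Losing only frame 3 impairs frames 3 through 9, until the intra frame at frame 10.
    print(impaired_frames(num_frames=12, lost_frames={3}, intra_period=10))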

Given the error-prone nature of ad hoc network paths and the susceptibility

of compressed video to transmission errors, effective error control is needed. Tradi-

tional techniques, including Forward Error Correction (FEC) and Automatic Repeat

Request (ARQ), must be adapted to take into consideration the delay constraint and

the error propagation problem [4]. In ad hoc networks, wireless links break down the

traditional concept of topology, which is not constrained by physical cable connec-

tions anymore. Although user mobility makes links volatile, it provides variability of

topology. On the one hand, a link may break when nodes move away from each other.

On the other hand, it is possible to quickly find new routes formed in a new topology.

Furthermore, the mesh topology of ad hoc networks implies the existence of multiple

routes between two nodes. Given multiple paths, a video stream can be divided into

multiple substreams and each substream is sent on one of the paths. If these paths are

disjoint, the losses experienced by the substreams would be relatively independent.


Therefore, better error resilience can be achieved when traffic dispersion is performed

appropriately and with effective error control for the substreams. In a manner simi-

lar to multi-antenna diversity that improves the capacity of wireless networks, path

diversity can also be exploited to improve the capacity of ad hoc networks. Indeed,

multipath transport (MPT) provides an extra degree of freedom in designing video

coding and transport schemes.

In this chapter, we propose three MCP-based video transport techniques

for mobile ad hoc networks. These schemes take advantage of path diversity to

achieve better performance. Compared to the coding methods considered in our

previous work [5], these techniques can achieve significantly higher coding efficiency

and are compliant with the H.26x and MPEG series standards (possibly with simple

modifications). The techniques that we have examined include:

1. A feedback based reference picture selection scheme (RPS) [6];

2. A layered coding (LC) with selective ARQ scheme (LC with ARQ) [7];

3. A multiple description motion compensation coding scheme (MDMC) [9].

We studied the performance of these three schemes via a top-down approach.

First we used a popular Markov link model [6][10], where lower layer detail is embodied

in the bursty errors generated. This simple model enables us to examine the system

performance over a wide range of packet loss rates and loss patterns. Next, lower layer

details, including user mobility, multipath routing, and the MAC layer are taken into

account in the OPNET simulations [11], which provide a more realistic view of the

impact of these factors on the system performance. Furthermore, we implemented an

ad hoc video streaming testbed using notebook computers with IEEE 802.11b cards.

This further validates the viability and performance advantages of these schemes.

The results of our experiments show that video transport is viable in ad hoc networks

given careful cross-layer design. Combining multistream coding with MPT improves

video quality, as compared to traditional schemes where a single path is used. Each

of these three techniques is best suited for a particular environment, depending on


the availability of feedback channels, the end-to-end delay constraint, and the error

characteristics of the paths.

The rest of the chapter is organized as follows. In Section 2.2, we discuss

related work and the contribution of this chapter. Next, three multistream coding

and MPT schemes are discussed in Section 2.3. Sections 2.4 and 2.5 present the

performance study of these schemes using Markov models and OPNET Modeler, re-

spectively. Our experimental results with an ad hoc network video streaming testbed

are reported in Section 2.6. Section 2.7 provides some discussion and conclusions.

2.2 Related Work

Due to the availability of a variety of network access technologies, as well as the

reduction in their costs, there is strong interest in taking advantage of multi-homed

hosts to get increased aggregate bandwidth and higher reliability. Proposals in the

transport layer include [21][35]-[75]. In [35], a protocol called Meta-TCP maintaining

multiple TCP connections for a session was designed for data transport. The Stream

Control Transmission Protocol (SCTP) [21] was initially designed for reliable delivery

of signaling messages in IP networks using path redundancy. There are now proposals

to adapt it for data traffic in the Internet and in wireless networks [36][75]. These

papers focus on the higher aggregate usable bandwidth obtained and on how to

perform TCP congestion control over multiple paths. Multi-flow management can

also be carried out at the application layer. In [5] and [37], an extension of the

Realtime Transport Protocol (RTP) [14], called Meta-RTP, was proposed. Meta-

RTP sits on top of RTP in the protocol stack, performing traffic allocation at the

sender and resequencing at the receiver for real-time sessions.

Recently, several interesting proposals on delivering audio and video over

Internet and wireless networks using multiple paths have been introduced. The study

in [5][37] was, to the best of our knowledge, the first to investigate image and video

transport using MPT in a multihop wireless radio network. Although it provided

some very useful insights, the coders considered there treated individual frames of

a video sequence independently, and consequently are not very efficient. There are


several interesting papers on applying MPT for Internet multimedia streaming. In

[38], MDC is combined with path diversity for video streaming in the Internet. A

four-state model is proposed to capture the distortion behavior of a MD source. The

problem of MD video downloading in Content Delivery Networks (CDN) using a

number of servers is studied in [39]. It is reported that 20% to 40% reductions in

distortion can be achieved by using this many-to-one approach. Similarly, it is shown

in [40] that using multiple senders and FEC in data downloading effectively reduces

packet loss rates. An interesting study on realtime multistream voice communication

through disjoint paths is given in [41], where multiple redundant descriptions of a

voice stream are sent over paths provided by different Internet Service Providers.

Both significant reductions in end-to-end latency and packet loss rate are observed.

In recent work [42], the RPS scheme in [6] was extended by using rate-distortion

optimized long memory reference picture selection, and using probes for path status

prediction.

This chapter differs from previous work discussed thus far in many aspects.

First, we focus on video transport, as compared to general elastic data transport using

TCP [21][35]-[75]. We perform traffic partitioning in the application layer and use

UDP in the transport layer. We perform traffic dispersion on the substream level for

the results shown in this chapter1. Compared with Meta-RTP [5][37], our transport

schemes do not require each substream to be received intact and are more flexible. Second, we

study multipath video transport in ad hoc networks, while prior work focuses on

Internet video streaming [38]-[42]. It is much more challenging to transport video in

ad hoc networks than in a wireline network, e.g., the Internet, as discussed in Section 2.1. Moreover, there is the well-known assumption that all packet losses in the Internet

are caused by congestion [43], and MPT is mainly used in the Internet to alleviate

congestion (i.e., load balancing). In an ad hoc network, in addition to congestion in

mobile nodes, packets are also lost because the wireless links are unreliable. Therefore,

the benefit of using MPT, in addition to load balancing, is error resilience through

path diversity. Since the up and down statuses of the paths are relatively independent

1Finer packet-level traffic dispersion schemes can be supported by our schemes when more than two paths are available.


of each other, it is possible to apply efficient error control exploiting this feature to

improve video quality. Third, the multistream video coding schemes we proposed are

all MCP-based with high coding efficiency and are more compliant with modern video

coding standards, as compared with [5][37]. MDMC is a new multiple description

video coding technique and Ref. [16] describes its algorithm and its performance

using abstract channel models. In this chapter, we study the performance of MDMC in ad hoc networks through extensive performance studies under a more realistic network setting. Fourth, we extend the Dynamic Source Routing

(DSR) protocol [29] to support multiple path routing. With our extension, multiple

maximally disjoint routes are selected from all the routes returned by a route query,

with only limited increase in the routing overhead. Fifth, for performance evaluation

we adopt a realistic model which includes all the layers except the physical layer using

OPNET Modeler [11]. We believe this cross-layer model provides a realistic view for

video transport over ad hoc networks.

In terms of implementation work, a number of ad hoc testbeds have been

built recently [44][45]. These mainly focus on the performance of ad hoc routing

protocols, physical layer characteristics, scalability issues, and integration of ad hoc

networks with the Internet for data transport. In [46], a firewall is inserted between

the source and destination, which drops video packets according to a Markov channel

model [10]. So far as we know, the testbed we developed is the first effort in combining

multistream video coding and MPT for video transport in ad hoc networks.

2.3 Proposed Video Transport Schemes

One of the challenges when utilizing path diversity for video transmission is how to

generate multiple coded substreams to feed the multiple paths. We consider three

types of coding schemes that differ in terms of their requirements for the transport-

layer support. These three schemes are all built on top of the block-based hybrid

coding framework using MCP and discrete cosine transform (DCT), which is employed

by all existing video coding standards. This way, the loss of coding efficiency is limited

and the source codec can be implemented by introducing minimal modifications to


codecs following existing standards. We present these three methods separately in

the subsequent subsections, followed by comparison and discussion.

2.3.1 Feedback Based Reference Picture Selection

As discussed above, one of the main challenges in MCP-based video coding for ad

hoc networks is how to limit the extent of error propagation caused by loss on a bad

path, and yet minimize the loss in coding efficiency. As mentioned in Section 1.2.2,

one simple approach to generate two substreams is the VRC option in the H.263+

standard [12], which codes the even and odd frames as two separate substreams and

performs temporal prediction within each substream. However, compared to predicting

a frame from its immediate neighbor, VRC requires significantly higher bit rates.

Also, although this method can prevent the loss in one path from affecting frames

in the other path, error propagation still exists within the same path. We note that

there is no reason to forbid one path from using another path’s frames as reference

if all the paths are good. Motivated by this observation, we propose to choose the

reference frames as follows: based on feedback and predicted path status, always

choose the last frame that is believed to have been correctly received as the reference

frame.

Specifically, we send the coded frames on separate paths. The mapping of

frames to paths depends on the available bandwidth on each path. For example, in

the two-path case, if both paths have the same bandwidth, then even frames are sent

on path 1, and odd frames on path 2. We assume that a feedback message is sent

for each frame by the decoder. If any packet in a frame is lost, the decoder sends a

negative feedback (NACK) for that frame. Otherwise, it sends a positive feedback

(ACK). The feedback information for a frame may be sent on the same path as the

frame, or on a different path. An encoder receives the feedback message for frame

n−RTT when it is coding frame n, where the round-trip time (RTT) is measured in

frame intervals.

Furthermore, once a NACK is received for a frame delivered on one path, we

assume that the path remains “bad” until an ACK is received. Similarly, we assume


the path stays in the “good” status until a NACK is received. When encoding a new

frame, the encoder deduces the last correctly decoded frame, based on the feedback

messages received up to this time, and uses that frame as the reference frame. This

scheme works well when the loss of a path is bursty, which is typical in ad hoc

networks. A more sophisticated scheme may adopt a threshold for the NACKs and

ACKs for a given window of time and switch the reference frame only when the

threshold is exceeded.
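For illustration only, the encoder-side bookkeeping described above can be sketched in Python as follows. This is a minimal sketch, not the actual codec code: the class layout, the field names, and the mapping of even frames to path 0 and odd frames to path 1 are illustrative assumptions for the two-path, equal-bandwidth case.

class RpsReferenceSelector:
    # Sketch of feedback-based reference picture selection (RPS).
    def __init__(self):
        self.path_good = [True, True]   # predicted status of each path
        self.ack = {0: True}            # feedback received so far: frame -> ACK?
        self.reference = {0: 0}         # reference frame used when coding each frame

    def on_feedback(self, frame_no, acked):
        # A path stays "bad" after a NACK (and "good" after an ACK) until the
        # next feedback message for a frame sent on that path arrives.
        self.ack[frame_no] = acked
        self.path_good[frame_no % 2] = acked

    def believed_decodable(self, k):
        # A frame is believed decodable if it is believed received (explicit ACK,
        # or no feedback yet but its path is predicted "good") and its own
        # reference frame is itself believed decodable.
        if k == 0:
            return self.ack.get(0, True)
        received = self.ack.get(k, self.path_good[k % 2])
        return received and self.believed_decodable(self.reference[k])

    def choose_reference(self, n):
        # Use the newest frame believed to be correctly decoded as the reference.
        ref = next(k for k in range(n - 1, -1, -1) if self.believed_decodable(k))
        self.reference[n] = ref
        return ref

Running this logic on the feedback pattern of the example below reproduces the reference choices shown there (frame 4 referencing frame 0, frame 6 referencing frame 4, and frame 10 referencing frame 9).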

Figure 2.1 is an example of the proposed RPS scheme where RTT = 3.

When NACK(1) is received at the time when frame 4 is being encoded, the encoder

knows that frames 2 and 3 cannot be decoded correctly due to error propagation.

Therefore, frame 0 is chosen as the reference for frame 4 and path 2 is set to “bad”

status. When frame 6 is coded, the encoder uses frame 4 instead of frame 5 as reference

frame, because path 2 is still in the “bad” status. When ACK(7) is received, path 2

is changed to “good” status. Frame 9 is then chosen as the reference of frame 10.

Figure 2.1: Illustration of the RPS scheme. The arrow associated with each frame indicates the reference used in coding that frame. [Timeline: even frames 0, 2, 4, 6, 8, 10 are sent on path 1 and odd frames 1, 3, 5, 7, 9 on path 2, with feedback messages ACK(0), NACK(1), ACK(2), NACK(3), ACK(4), NACK(5), ACK(6), ACK(7).]

The RPS scheme offers a good trade-off between coding efficiency and error

resilience. When both paths are good, RPS uses the immediate neighboring frame as

reference, thereby achieving the highest possible prediction gain and coding efficiency.


When one path is bad, the encoder avoids using any frames that are affected by

path errors, thereby minimizing the error propagation period. RPS has higher coding

efficiency than the schemes using fixed prediction distances [12], and error propagation

in each substream is effectively suppressed. These improvements are achieved by using

a frame buffer to store several previous coded frames as possible references, and by

using feedback for path status prediction. Note that in this scheme feedback is used

to control the operation of the encoder. No retransmission is invoked.

2.3.2 Layered Coding with Selective ARQ

This is a scheme using layered video coding. With this scheme, a raw video stream

is coded into two layers, a BL and an EL. We follow the SNR profile in the H.263

standard [48] when generating the layers. A BL frame is encoded using the standard

predictive video coding technique. Note that because the BL coding uses only the

previous BL picture for prediction, this coding method has a lower coding efficiency

than a standard single layer coder. This loss in coding efficiency is, however, justified

by increased error resilience: a lost EL packet will not affect the BL pictures. Good

quality is thus guaranteed if the BL packets are delivered error-free or at a very low

loss rate. There are three prediction options in the H.263 standard for enhancement

layer coding: UPWARD prediction, in which the base layer reconstruction of current

frame is used as the prediction of the enhancement layer, FORWARD prediction,

in which the enhancement layer reconstruction from the previous frame is used as

the prediction, and BI-DIRECTION prediction, in which the average of base layer

reconstruction and enhancement layer reconstruction is used. The LC with ARQ

codec selects from the three prediction options the one that has the best coding gain.

Although this approach is optimal in terms of coding efficiency for the enhancement

layer, error propagation can still occur in the EL pictures.

Given two paths, the traffic allocator sends the BL packets on one path

(the better path in terms of loss probability when the two paths are asymmetric)

and the EL packets on the other path. The receiver sends selective ARQ requests to

the sender to report BL packet losses. To increase the reliability of the feedback, a


copy of the ARQ request is sent on both paths. When the sender receives an ARQ

request, it retransmits the requested BL packet on the EL path, as illustrated in

Fig.2.2. The transmission bit rate for the EL may vary with the bit rate spent on BL

retransmission. For video streaming applications where video is typically pre-encoded

off-line, a simple rate control method is used for the EL path: when a BL packet is

retransmitted on the EL path, one or more EL packets are dropped to satisfy the

target transmission rate on the EL path.
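A minimal sketch of this sender-side behavior is given below. It is illustrative Python only: the packet fields, the path names, and the per-frame enforcement of the EL-path budget are assumptions rather than the actual implementation.

class LcArqSender:
    # Sketch of layered coding with selective ARQ over two paths.
    def __init__(self, send, el_budget_bits):
        self.send = send                 # callable: send(path, packet)
        self.el_budget_bits = el_budget_bits
        self.bl_history = {}             # seq -> BL packet kept for possible retransmission

    def send_frame(self, bl_packets, el_packets, arq_requests=()):
        # BL packets always travel on path 1 (the better path when paths are asymmetric).
        for pkt in bl_packets:
            self.bl_history[pkt['seq']] = pkt
            self.send('path1', pkt)
        # Requested BL packets are retransmitted on the EL path first; EL packets
        # then fill the remaining per-frame budget, and the rest are dropped
        # (the simple rate control described above).
        budget = self.el_budget_bits
        for seq in arq_requests:
            pkt = self.bl_history.get(seq)
            if pkt is not None and pkt['size'] <= budget:
                self.send('path2', pkt)
                budget -= pkt['size']
        for pkt in el_packets:
            if pkt['size'] <= budget:
                self.send('path2', pkt)
                budget -= pkt['size']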

Figure 2.2: A two-path layered video transmission model with end-to-end ARQ for BL packets. [Diagram: source S sends BL packets to destination D on path 1; EL packets and retransmitted BL packets are sent on path 2.]

Our observations show that a multiple hop wireless path behaves more in

an on-off fashion with bursty packet losses. If there is a BL packet loss, the BL route

is most likely to be broken. Moreover, if the loss is caused by congestion at an inter-

mediate node, using the BL path for retransmission may make the congestion more

severe. If disjoint paths are used, path diversity implies that when the BL path is

down or congested, it is less likely that the EL path is also down or congested. There-

fore, retransmission using the EL path is likely to have a higher success probability

and lower delay.

As discussed in Section 1.2.2, we assume either the sender continuously

estimates the states of the paths based on received ARQ requests or QoS reports, or

the multipath routing process orders the paths according to their loss characteristics.


In the first case, a burst of ARQ requests received at the sender implies that the BL

path is in a “bad” state. If the inferred EL path state is better, the sender may switch

the paths. In the latter case, an intermediate node may send an error report back to

the source after it drops a packet (e.g., an Error Report in DSR [29], or an ICMP

Unreachable error message [47]). These error reports will trigger the routing process

to reorder the paths or initiate a new Route Query.

2.3.3 Multiple Description Motion Compensation

Unlike the above two techniques, MDMC is a multiple description coding scheme

which does not depend on the availability of feedback channels. Because paths in ad

hoc networks change between “up” and “down” state very often, each description ex-

periences bursty packet losses. Therefore, we employ the packet-loss mode of MDMC

presented in [9]. It uses a linear superposition of two predictions from two previously

coded frames. Fig. 2.3 illustrates the architecture of the MDMC encoder, where the

central prediction is obtained by

\tilde{\psi}(n) = a_1 \psi_e(n-1) + (1 - a_1)\,\psi_e(n-2),    (2.1)

where \psi_e(n-1) and \psi_e(n-2) are the motion-compensated prediction signals constructed from the two previously encoded frames n-1 and n-2, respectively. The central prediction error e_0(n) = \psi(n) - \tilde{\psi}(n) is quantized by quantizer Q_0(\cdot) to \hat{e}_0(n). The quantized prediction error and motion vectors for even frames are sent on one path, and those for odd frames are sent on another path. In the decoder, if frame n-1 is received, frame n is reconstructed using

\hat{\psi}_d(n) = a_1 \psi_d(n-1) + (1 - a_1)\,\psi_d(n-2) + \hat{e}_0(n),    (2.2)

where \psi_d(k) represents the motion-compensated prediction from decoded frame k and \hat{\psi}_d(n) is the reconstructed frame n.

If frame n − 1 is damaged but frame n − 2 is received, the decoder only

uses the reconstructed frame n − 2 for prediction. To circumvent the mismatch

between the predicted frames used in the encoder and the decoder, the signal

e_1(n) = \psi_e(n-2) - a_1 \psi_e(n-1) - (1 - a_1)\,\psi_e(n-2) - \hat{e}_0(n)

is quantized by another quantizer Q_1(\cdot), which is typically coarser than Q_0(\cdot), and the output \hat{e}_1(n) is sent along with the other information on frame n.

Figure 2.3: Illustration of the MDMC encoder. [Block diagram: the input YUV video is split into odd and even frames, coded with a central predictor and two side predictors, and the two resulting substreams are sent on path 0 and path 1.]

Now when frame n-1 is damaged, the side decoder reconstructs frame n using

\hat{\psi}_d(n) = \psi_d(n-2) + \hat{e}_0(n) + \hat{e}_1(n).    (2.3)

In addition, the lost frame \psi(n-1) is estimated using

\hat{\psi}_d(n-1) = \frac{1}{a_1}\left(\hat{\psi}_d(n) - (1 - a_1)\,\psi_d(n-2) - \hat{e}_0(n)\right).    (2.4)
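The decoding rules of Eqs. (2.2)-(2.4) can be summarized in the following sketch. It is illustrative Python only: motion compensation is abstracted into a caller-supplied function mc_pred, and frames may be plain numbers or NumPy arrays; none of the names come from the actual codec.

def mdmc_decode_frame(n, a1, e0_hat, e1_hat, mc_pred, decoded, prev_lost):
    # decoded maps frame numbers to reconstructed frames already produced.
    if not prev_lost:
        # Eq. (2.2): central reconstruction from the predictions of frames n-1 and n-2.
        decoded[n] = (a1 * mc_pred(decoded[n - 1])
                      + (1 - a1) * mc_pred(decoded[n - 2]) + e0_hat)
    else:
        # Eq. (2.3): side reconstruction from frame n-2 plus the mismatch signal.
        decoded[n] = mc_pred(decoded[n - 2]) + e0_hat + e1_hat
        # Eq. (2.4): estimate the lost frame n-1 from the side reconstruction
        # (a1 > 0 is assumed; a1 = 0.9 in the experiments reported below).
        decoded[n - 1] = (decoded[n] - (1 - a1) * mc_pred(decoded[n - 2]) - e0_hat) / a1
    return decoded[n]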

The MDMC codec offers a trade-off between redundancy and distortion over

a wide range by varying the coder parameters (the predictor coefficient a1 and the

quantization parameter of Q1(.)). The efficiency of a MDMC codec depends on the

selection of the parameters, which in turn depends on the estimation of the channel’s

error characteristics. There is only one additional buffer needed in MDMC compared

with conventional codecs that use only one previous frame for prediction.

2.3.4 Comparison and Discussion

The three schemes have their respective advantages and disadvantages. Depending on

the availability of a feedback channel, the delay constraint, and the error character-

istics of the established paths, one technique may be better suited for an application

than another. A comparison of these three schemes is given in Table 2.1.


RPS is applicable when feedback channels are available. The redundancy

depends on the distance between a current frame and its reference frame, which in

turn depends on the packet loss rate and the RTT . When the paths are error-free,

RPS has the highest encoding efficiency. Compared with ARQ-based schemes, there

is no decoding delay incurred but additional buffers are still needed.

LC with ARQ is suitable when feedback channels are available and the appli-

cation is such that the latency caused by retransmission is tolerable. The redundancy

of this scheme comes from the fact that a frame is predicted from the base-layer

reconstruction of the reference frame. It is difficult to control the amount of redundancy introduced, which, under the chosen set of parameters, is larger than the redundancy introduced in the MDMC coder. This is why

the LC approach has the lowest quality when packet loss rate is low. However, when

the packet loss rate is high, this method can usually deliver the BL successfully, thus

providing better video quality than the other two proposed schemes, at the cost of

extra delay. The additional delay is at least RTT .

MDMC, unlike the other two, does not need feedback, nor does it incur

additional decoding delay. It is easier to control the redundancy in MDMC by chang-

ing the predictors and the side quantizer. The redundancy can be varied over a wider range than in the above two schemes (even though the parameters of the MDMC

coder are fixed in all the simulation studies reported here). Since MDMC needs no

feedback, the video can be pre-encoded, which is desirable for video streaming ap-

plications. Note that this is not possible with the RPS scheme. The challenge with

MDMC is how to adapt the coding parameters based on the error characteristics of

the paths so that the added redundancy is appropriate.

2.4 Performance Study using Markov Models

In this section, we report on performance studies of the proposed schemes. The chal-

lenge is that the problem requires cross-layer treatment with a large set of parameters.

To simplify the problem and focus on the key issues, we first study the performance

of the schemes using Markov link models. The lower layer details are hidden and

their impact on the video transport is embodied in the bursty errors generated by the Markov models. We will examine the impact of lower layer components such as user mobility, multipath routing, and the MAC layer using OPNET simulations in Section 2.5.

Table 2.1: Comparison of the Three Schemes

                              RPS                    LC with ARQ                   MDMC
  Feedback Needed             Yes                    Yes                           No
  Decoding Delay              No                     ≥ RTT                         No
  Redundancy Controlled by    Error rates and RTT    Error rates and BL quality    Encoding parameters
  Additional Buffer           ≥ RTT frames           ≥ RTT frames                  1 frame

2.4.1 The Video Codec Implementations and Parameters

We implemented in software the proposed three video coding schemes on top of the

public domain H.263+ codec [49]. In RPS, we added the reference picture selection

algorithm by extending the RPS option in the standard. In LC with ARQ, we added

a simple rate control algorithm for EL, as explained in section 2.3.2. In MDMC, the

codec was modified to produce both central and side predictions in the INTER mode,

and encodes central and side prediction errors using quantization parameters QP0

and QP1 respectively. More details about MDMC can be found in [9].

Error concealment is performed in the decoders when packets are lost. In

the RPS decoder, the conventional copy-from-previous-frame method is used. In the

LC decoder, if the BL is lost, the copy-from-previous-frame method is used. If the

EL is lost but the BL is received, the frame is reconstructed using the BL only. In

the MDMC decoder, the lost information can be recovered partially from the other

description received (see section 2.3.3 and [9]).

We use the Quarter Common Intermediate Format (QCIF) [176×144 Y

pixels/frame, 88×72 Cb/Cr pixels/frame] sequence “Foreman” (first 200 frames from


the original 30 fps sequence) encoded at 10 fps in the performance study of the

schemes. The encoder generates two substreams with a bit rate of 59Kbps each.

The TMN8.0 [49] rate control method is used in RPS and LC with ARQ, but the

frame layer rate control is disabled. In the simulations using Markov models, RTT is

assumed to be three frame intervals. In MDMC, a1 is set to 0.9, and the quantization

parameter (QP0,QP1) is fixed at (8,15) [9], which achieves approximately the same

bit rate as the other two schemes. The buffer size of RPS is set to 12 frames. If the

selected reference frame is not found in the buffer, the nearest frame is used instead.

In all the methods, 5% macroblock level intra-refreshments are used, which has been

found to be effective in suppressing error propagation for the range of the packet

loss rates considered. Each group of blocks (GOB) is packetized into a single packet,

to make each packet independently decodable. In LC with ARQ transmission, the

BL is transmitted on the better channel if the two channels are asymmetric. In the

following simulations, we allow a lost BL packet to be retransmitted only once.

2.4.2 Modeling of Ad Hoc Routes using Markov Models

In [32], the routes returned for a route query are first broken into a pool of links, and disjoint routes are assembled from the links in the pool. Motivated by this work, we model a multiple hop wireless route as the concatenation of a number of links, drawn randomly from a link pool, each of which is modeled by a Markov chain [10].

This model has many advantages. The link pool can be easily built using

measurement data of ad hoc links. Furthermore, the loss pattern and loss rate are

easily controlled. More complex packet loss processes can also be modeled using semi-

Markov models. Performance analysis of the video codecs under a full spectrum of

loss processes is possible with this technique. The disadvantage of this model is that

it is a high level model. Details such as bandwidth and end-to-end delay variation,

interference, and user mobility cannot be modeled accurately. These issues will be

addressed in the OPNET models in the following sections.
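A minimal sketch of this construction is shown below. It is illustrative Python only: the state names and loss probabilities anticipate the parameterization described in Section 2.4.3, and the transition structure is a placeholder to be fitted to the desired loss rates and burst lengths.

import random

LOSS = {'down': 1.0, 'bad': 0.05, 'good': 0.0}   # per-state packet loss probability (illustrative)

class MarkovLink:
    # A link whose state evolves as a discrete-time Markov chain.
    def __init__(self, trans, state='good'):
        self.trans = trans     # trans[state] -> {next_state: probability}
        self.state = state

    def delivers(self):
        # Advance the chain one step, then draw the packet outcome in the new state.
        r, acc = random.random(), 0.0
        for nxt, p in self.trans[self.state].items():
            acc += p
            if r < acc:
                self.state = nxt
                break
        return random.random() >= LOSS[self.state]

def new_path(link_pool, hops=2):
    # Every two seconds a fresh path is assembled from links drawn at random from the pool.
    return random.sample(link_pool, hops)

def path_delivers(links):
    # A packet traverses the path correctly only if every link delivers it.
    # The list comprehension evaluates every link so each chain advances.
    return all([link.delivers() for link in links])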


2.4.3 Simulation Results using Markov Channel Models

A three-state Markov model was used for each link with the states representing a

“good”, “bad” or “down” status for the link. The “down” state means the link is

totally unavailable. The “good” state has a lower packet loss rate than the “bad”

state. The packet loss rates we used are p0 = 1.0, p1 ∈ [0.1%, 20%], and p2 = 0, for the

“down”, the “bad”, and the “good” states, respectively. The transition parameters

are chosen to generate loss traces with desired loss rates and mean burst lengths.

In our simulation, two paths were set up for each connection, and each path was

continuously updated as follows: After every two seconds, two links were chosen

randomly from a link pool to construct a new path. For the results reported in Fig.2.4-

Fig.2.7, the paths used are disjoint to each other. A video packet goes through a path

correctly only when it goes through every link successfully. For each case studied in

the following, a 400 second video sequence is used by repeatedly concatenating the

same video sequence 20 times. The error-free average PSNRs of the received video frames are 34.14 dB, 33.31 dB, and 33.47 dB for RPS, LC with ARQ, and MDMC,

respectively.
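Throughout this section, video quality is reported as the average PSNR over the decoded frames. For reference, a minimal sketch of the per-frame computation is given below; it assumes 8-bit samples and is not tied to any particular codec implementation.

import numpy as np

def psnr(reference, decoded, peak=255.0):
    # PSNR in dB between an error-free frame and its decoded version.
    mse = np.mean((reference.astype(np.float64) - decoded.astype(np.float64)) ** 2)
    return float('inf') if mse == 0.0 else 10.0 * np.log10(peak ** 2 / mse)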

The average PSNRs of decoded video sequences under various packet loss

rates are given in Fig.2.4-Fig.2.7. From these figures, we can conclude that the best

choice depends on the channel characteristics, including error rate and error pattern

(burst length), the application requirements, including delay constraint, and the dif-

ferences among those channel characteristics (for example, symmetric or asymmetric).

Specifically, the following observations can be made from the figures.

Effect of Packet Loss Rates

Generally, when the burst length is not higher than 9 packets, LC with ARQ has the

best performance when the error rate is medium to high. A large burst of errors may

make the ARQ scheme less effective (see Fig.2.7). RPS has the best performance at

very low error rate due to its coding efficiency.


Figure 2.4: Average PSNRs of the three schemes with asymmetric paths: Path 1's loss rate is fixed at 12%, and path 2's loss rate varies from 0.1% to 10%. [Plot of average PSNR (dB) versus the loss rate of path 2; curves for LC with ARQ, RPS, MDMC, and LC without retransmission, each with a mean burst length of 4 packets.]

Effect of Path Symmetry

The paths may be symmetric or asymmetric in terms of packet loss rates. Comparing

Fig.2.5 and Fig.2.6, when the average loss rate on the two paths is equal, LC without

ARQ works better in asymmetric channels, while the performance of RPS, LC with

ARQ, and MDMC is similar with either asymmetric or symmetric channels. For

example, for the [2%, 4%] point in Fig. 2.5, LC with ARQ has an average PSNR

of 32.84 dB, while for the [3%, 3%] point in Fig. 2.6, LC with ARQ has an average

PSNR of 32.87 dB. This broadly holds for all other points in both figures. With

retransmission of lost BL packets on the EL path, the BL path does not need to be

significantly better than the EL path. The fact that the video quality is insensitive

to path symmetry with all three schemes is a blessing: this means that one does not need special provisioning in the network to guarantee at least one high

quality path.


Figure 2.5: Average PSNRs of the three schemes with asymmetric paths: Path 1's loss rate is twice that of path 2. [Plot of average PSNR (dB) versus the loss rate of path 2; curves for LC with ARQ, RPS, MDMC, and LC without retransmission, each with a mean burst length of 4 packets.]

Effect of Error Burst Lengths

From Fig.2.7, we observe that in the low to intermediate burst length range con-

sidered, the performance of all three schemes improves gradually as the burst length increases2. The reason is that, for the same average packet loss rate, a shorter burst length means more frames have errors, which causes more distortion at the video decoders, while a larger burst length means fewer frames have errors, which

could be remedied with effective error concealment. When the mean error burst

length increases, the RPS scheme gains most, since its path status prediction method

works better with longer bursts. Note that this trend is true only up to a certain

point. When the burst length increases beyond this point, so that all packets in two

or more consecutive frames are corrupted by an error burst, the PSNR of decoded

video will start to drop sharply.

2The burst length in Fig.2.4-Fig.2.7 is measured in packets. With the QCIF video, there are 9 packets per frame.


Figure 2.6: Average PSNRs of the three schemes with symmetric paths: The mean burst length is fixed at 4 packets, while the loss rate varies. [Plot of average PSNR (dB) versus the loss rate of paths 1 and 2; curves for LC with ARQ, RPS, MDMC, and LC without retransmission.]

Effect of Delay Constraint

When retransmission is allowed by the end-to-end delay constraint, LC with ARQ

has the best performance among these three schemes, even when only one retransmission

is allowed. We also show in Fig.2.4-Fig.2.7 the results for the LC scheme without

retransmissions. This test is used to emulate the case when retransmission is not

possible, because the delay constraint of the underlying application does not allow it, because it is not feasible to set up a feedback channel, or because end-to-end

retransmission is not practical (e.g., video multicast). Without the ARQ protection

of BL, the performance of LC is, as expected, the poorest among all the schemes. It

shows that LC is effective only when the transport layer can provide efficient error

control (e.g., FEC or ARQ) for the BL packets. Although LC with ARQ has the

highest PSNR in the cases we studied, it is the most susceptible to a large end-to-end

delay and imperfect feedback channels.

Figure 2.7: Average PSNRs of the three schemes with symmetric paths: The loss rates are fixed at 10%, while the mean burst length varies. [Plot of average PSNR (dB) versus the mean loss burst length of paths 1 and 2; curves for LC with ARQ, RPS, MDMC, and LC without retransmission, each at a mean loss rate of 10%.]

Note that the performance of the three schemes is also influenced by several other factors. For example, the performance of RPS varies with the RTT as well: the

shorter the RTT is, the better RPS works. This is because when the RTT is shorter,

the RPS encoder will be notified of the corrupted frames earlier and then it can stop

error propagation earlier. Because MDMC does not require the set up of a feedback

channel, it can be used for a wider range of applications.

2.5 Performance Study using OPNET Models

The simplicity of the Markov model enables us to examine the performance of the

proposed schemes over a wide range of packet loss patterns. In this section, we use the

OPNET models to examine the impact of lower layer factors, which are not revealed

by the Markov model. Using MDMC as an example, we show how multipath routing,

MAC operation, and user mobility affect video transport.


2.5.1 Multipath Routing using Dynamic Source Routing

DSR is a source routing protocol proposed for mobile ad hoc networks, where inter-

mediate nodes do not need to maintain up-to-date routing information for forwarding

a transit packet since the packet carries the end-to-end path in its header. It is an

on-demand protocol where route discovery is performed for a node only when there

is data to be sent to that node [29].

There are a number of extensions of DSR to multipath routing. In [23],

intermediate nodes are forbidden to reply to route queries. The destination node

receives several copies of the same route query (each traverses a possibly different

path) within a time window. At the end of the time window, the path with the

shortest delay, and another path which is most disjoint with the shortest path are

returned to the source. In [50], in addition to a shortest path, backup paths from

the source node and from each intermediate node of the shortest path are found to

reduce the frequency of route request flooding.

We extended DSR to multipath DSR (MDSR) in the following way. Each

node maintains two routes to a destination. We allow both the destination node and

intermediate nodes to reply to a route query. When the destination node replies, it

also copies the existing routes from its own route cache into the route reply, in addition

to the route that the route query traversed. Another difference with the single path

DSR is that when a node overhears a reply which carries a shorter route than the

reply it plans to send, it still sends its reply to the requesting source. This increases the

number of different routes returned, giving the source a better choice from which to

select two maximally disjoint paths from these replies. The first returned route is used

by the originating packet. Then the route cache is further updated (or optimized) as

new replies for the same query arrive, using the algorithm in Fig.2.8. This algorithm

is a greedy algorithm in the sense that it always finds the best paths returned so far.

Since with DSR, a reply with shorter route usually arrives earlier [29], this heuristic

algorithm gives good performance without using a time window. As compared to

[23], the routing delay with MDSR is smaller.

Although MDSR is capable of maintaining more than two routes, we only


experimented with the two-path version, since the results in [50] indicate that the

largest improvement is achieved by going from one to two or three paths. The MDSR

model is built based on the OPNET DSR model [51].

NOTE: W is the returned route, R0 and R1 are cached routes. RAND(R0,R1) returns R0 or R1 with equal probability.

W = getRouteFromReply(Route Reply);
if ( (R0 is empty) and (R1 is empty) ) {
    copy W to RAND(R0,R1);
} else if ( (R0 is not empty) and (R1 is empty) ) {
    copy W to R1;
} else if ( (R0 is empty) and (R1 is not empty) ) {
    copy W to R0;
} else {
    if ( R0, R1, W have the same number of common nodes ) {
        (R0,R1) = getTheTwoShortestRoutes(R0,R1,W);
    } else {
        (R0,R1) = getTheTwoMostDisjointRoutes(R0,R1,W);
    }
}

Figure 2.8: The MDSR route updating algorithm.
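A Python rendering of the same update rule is sketched below. It is illustrative only: routes are taken to be node lists including the source and destination, and the interpretation of the "same number of common nodes" test as holding for all three pairs is one possible reading of Fig.2.8, not the actual implementation.

import random

def common_nodes(r1, r2):
    # Number of nodes shared by two routes (source and destination included).
    return len(set(r1) & set(r2))

def update_cache(r0, r1, w):
    # Update the two cached routes r0, r1 with a newly returned route w.
    if r0 is None and r1 is None:
        return (w, None) if random.random() < 0.5 else (None, w)
    if r1 is None:
        return r0, w
    if r0 is None:
        return w, r1
    if common_nodes(r0, r1) == common_nodes(r0, w) == common_nodes(r1, w):
        # All pairs are equally (dis)joint: keep the two shortest routes.
        keep = sorted([r0, r1, w], key=len)[:2]
    else:
        # Otherwise keep the pair of routes sharing the fewest nodes.
        keep = min([(r0, r1), (r0, w), (r1, w)], key=lambda p: common_nodes(*p))
    return keep[0], keep[1]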

2.5.2 OPNET Simulation Setting

Using the OPNET model, we simulate an ad hoc network with 16 nodes in a 600m

by 600m region. Given the dimensions of the region, 16 nodes result in a density

that maintains a connected network for most of the time [52]. Each node is randomly

placed in the region initially. We used a version of the popular Random Waypoint

mobility model, where each node first chooses a random destination in the region,

then moves toward it at a constant speed. When it reaches the destination, it pauses

for a constant time interval, chooses another destination randomly, and then moves

towards the new destination [53]. Note that this is a simplified version of the Random


Waypoint model. Since there is no randomness in the nodal speed, the convergence

problem reported in [54] does not present itself here. We used a pause time of 1.0

second for all the experiments reported in this chapter. The speed of the nodes varies

from 0m/s to 10m/s, which models movement of pedestrians or vehicles in city streets.
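For concreteness, this simplified mobility model can be sketched as follows. The region size, constant speed, and 1.0-second pause come from the text; the time step, the trace format, and the assumption of a strictly positive speed (a stationary node simply keeps its initial position) are illustrative choices, not the OPNET model itself.

import math
import random

def random_waypoint_trace(region=600.0, speed=10.0, pause=1.0, duration=360.0, dt=0.1):
    # Positions of one node moving at a constant speed with constant pauses.
    x, y = random.uniform(0, region), random.uniform(0, region)
    trace, t = [], 0.0
    while t < duration:
        dest_x, dest_y = random.uniform(0, region), random.uniform(0, region)
        dist = math.hypot(dest_x - x, dest_y - y)
        if dist < 1e-9:
            continue
        ux, uy = (dest_x - x) / dist, (dest_y - y) / dist
        for _ in range(int(dist / (speed * dt))):       # move toward the destination
            x, y = x + ux * speed * dt, y + uy * speed * dt
            trace.append((t, x, y))
            t += dt
        for _ in range(int(pause / dt)):                # pause, then pick a new destination
            trace.append((t, x, y))
            t += dt
    return trace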

We use the IEEE 802.11 protocol in the MAC layer working in the DCF

mode. Its physical layer features, e.g., frequency hopping (FH), are not modeled.

The channel has a bandwidth of 1Mb/s. The transmission range is 250 meters. If the

sender of a packet is within this range of the receiver, and the sender has successfully

accessed the channel during the transmission period, the packet is regarded as cor-

rectly received. The maximum number of link layer retransmissions is 7, after which

the packet is dropped. UDP is used in the transport layer. We also implemented part

of the RTP functionality [14], such as timestamping and sequence numbering in our

model.

Among the 16 nodes, one is randomly chosen as the video source and another

node is chosen as the video sink, where a 5 second playout buffer is used to absorb the

jitter in received packets. The video source starts a session using two routes, sending

two substreams of encoded video (59kbps each) to the sink. All other nodes generate

background traffic to send to a randomly chosen destination. The inter-arrival time

of the background packets is exponentially distributed with a mean of 0.2 second.

The background packet has a constant length of 512 bits.

2.5.3 Simulation Results using OPNET Models

MDSR Performance

First we examined the performance of MDSR. Fig.2.9 plots the traces of two routes

maintained by the video source node during a simulation in which each node has a

speed of 10m/s. The length of a route is denoted by the total number of nodes the

route traverses, including the source and the destination. Each point in the figure also

means a route update: either a better route is found, or a route in use is broken. It

can be seen that the routes are unstable. However, since the nodes are moving around

rapidly, new neighbors are found and a new route is discovered shortly after the old


route is down. This shows that mobility is both harmful and helpful. The lengths of

the routes vary from 0 to 6 (5 hops, a little higher than the diameter of the network)

during the simulation. For most of the simulation period, the route length is 2, which

means direct communication between the source and the destination, or 3, which

means a relay node is used in between. We also plot the number of common nodes

between two paths. During most of the simulation period, this number is 2, which

means the two routes are disjoint except for the common source and destination.

Fig.2.10 shows the period when the routes are the longest and most cor-

related. At the beginning of the period, the two paths are two hops each and are

disjoint. Then they get longer and have more common nodes, probably indicating

that they are moving away from each other. After the 260th second, they become

two hops again after shorter new paths are found. After the 262nd second, the routes

become disjoint again.

Figure 2.9: Simulation results of 16 nodes moving in a 600m×600m region at a speed of 10m/s. Plotted are the traces of two routes to the video sink maintained by the video source during the simulation. [Plot of the number of nodes versus time (seconds): length of route 0, length of route 1, and the number of common nodes.]


Figure 2.10: A zoom-in plot of Fig.2.9. [The same three traces over the interval from roughly the 235th to the 270th second.]

MPT vs. SPT

Next we compare the performance of MPT with single path transport (SPT) in

Fig.2.11 and Fig.2.12, where the same multistream video coder, MDMC, is used.

For SPT, we used the NIST DSR model [51] which maintains a single path to a

destination, while for MPT we used the MDSR model which is a multipath routing

extension of [51]. We transmit both substreams on the same path in the SPT simu-

lations. To alleviate the impact of bursty errors, we interleave the packets of the two

descriptions with a two-frame interleaving interval: description 1 (2) packets for two even (odd) frames are followed by description 2 (1) packets for two odd (even) frames.
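This interleaving can be sketched as follows (illustrative Python only; substream0[k] and substream1[k] are taken to be the packet lists of the k-th even and k-th odd frame, respectively, which is an assumption about the data layout rather than the actual code):

def interleave(substream0, substream1, interval=2):
    # Order packets of the two descriptions for single-path transmission.
    out, k = [], 0
    while k < max(len(substream0), len(substream1)):
        for frames in (substream0[k:k + interval], substream1[k:k + interval]):
            for frame_packets in frames:
                out.extend(frame_packets)
        k += interval
    return out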

We perform this experiment for the 16-node network where each node moves at a

speed of 10m/s. The PSNR traces (using the left y axis) and loss traces (using the

right y axis) using MDSR and DSR are plotted in Fig.2.11 and Fig.2.12, respectively.

It can be seen that PSNR drops when there is loss in either substream. Also the

deepest drop occurs when a large burst of loss of one substream overlaps with that of

the other substream. SPT has higher loss rates than MPT, and therefore the PSNR

curve in Fig.2.12 has more frequent and severe drops than that in Fig.2.11. It is obvious that SPT has poorer performance than MPT.

Figure 2.11: The PSNRs of the received frames with an MDMC codec using two paths. 16 nodes move in a 600m×600m region at a speed of 10m/s. Plotted on the right y axis are the lost packets per frame. The MDSR algorithm is used for route updates. The measured average loss rates of the two substreams are (3.0%, 3.1%). [Plot of PSNR (dB) and lost packets per frame versus frame number for substreams 0 and 1.]

This is further illustrated in Fig.2.13, which displays the zoomed versions of

the loss traces of both simulations. The SPT traces (Fig. 2.13(b)) have more frequent

packet losses than the MPT traces (Fig. 2.13(a)). Even worse is that the packet losses

of the substreams are strongly correlated in the SPT case. This has the most negative

impact on the MDMC performance. Although we interleaved the substreams before

transmitting, the burst length is too long, rendering the interleaving ineffective. A

larger interleaving interval may help but at the cost of a larger end-to-end delay.

Figure 2.12: The PSNRs of the received frames with an MDMC codec using a single path. 16 nodes move in a 600m×600m region at a speed of 10m/s. Plotted on the right y axis are the lost packets per frame. The path is updated using the NIST DSR model and both substreams are sent on the path using an interleaving interval of 2 frames. The measured average loss rates of the two substreams are (6.3%, 6.4%). [Plot of PSNR (dB) and lost packets per frame versus frame number for substreams 0 and 1.]

Impact of Mobility

We also examined the impact of mobility on video transport using MDMC as an example. Fig.2.14 shows the PSNRs of the received video frames when the nodes are stationary. The PSNR curve in Fig.2.14 is very stable, with only a few narrow

drops. From the error trace below, we can see that the losses are mostly random, i.e.,

the error burst length is 1 packet for most of the time. When nodes begin to move

around at a speed of 10m/s, the PSNR curve in Fig.2.11 is much worse with many

more drops. The largest drops occur at the 500th, 2000th and 2500th frames3. We

conjecture that during these periods the source node and destination node were either

far away from each other, or were in a hot-spot. We can see that in both figures,

the valleys of the PSNR curve match the loss bursts drawn below. In Fig.2.11, the

loss bursts in this plot match the two routes’ longest and most correlated periods in

Fig.2.10. Mobility clearly has a negative effect on video transport.

3Note that the frame rate is 10 frames/second. The frame number corresponding to the 250th second in Fig.2.10 is 2500 in Fig.2.11.

Figure 2.13: Zoomed lost packet per frame traces of two substreams: (a) The MPT case (corresponding to Fig.2.11); (b) The SPT case (corresponding to Fig.2.12). [Two plots of lost packets per frame versus frame number, one curve per substream.]

Figure 2.15 shows the mean packet loss rates and the mean packet loss burst

lengths of the two substreams during the simulations when the mobile speed varies

from 0m/s to 10m/s. Fig.2.16 is the resulting average PSNR for different speeds using

the MDMC codec. There are several interesting observations:

1. The two routes maintained by MDSR are relatively symmetric in their error

characteristics. Recall from Fig.2.8 that the algorithm does not order the two paths

MDSR maintains. This is suitable for MDMC since the two descriptions are

equivalent in importance on video quality. If LC with ARQ is used, an algorithm

that always puts the shortest path in cache 0 (which is used by the BL) is

preferred.


Figure 2.14: The PSNRs of the received frames with an MDMC codec. 16 nodes in a 600m×600m region. Plotted on the right y axis are the lost packets per frame. The nodes are stationary. [Plot of PSNR (dB) and lost packets per frame versus frame number for substreams 0 and 1.]

2. MDSR effectively reduces both the mean packet loss rate and the mean packet

loss burst length. When the speed is 10m/s, the average loss rate for MDSR is

about 3%, while that of DSR is about 6.4%. The mean burst length of MDSR

at 10m/s is about 11, while that of DSR is about 20.

3. When the nodes are stationary, the mean burst length of MDSR is 1. Packet loss

in this case is mainly caused by failure in accessing the channel. When nodes

are mobile, the links are more frequently broken because of nodal mobility, and

the loss characteristics change from random loss to bursty loss. We conjecture

that the burst length depends on the time scale of the routing protocol and the

rate of change in the topology.

4. Somewhat counter-intuitive is that as speed increases, the average loss rate

becomes stable. Furthermore, the highest mean burst length occurs at 4m/s

instead of 10m/s. The mean PSNR in Fig.2.16 shows the same trend. When

speed increases, the mean PSNR first drops, then becomes stable. Note that at 4m/s, the mean burst length is about 40 packets, which corresponds to more than 4 frames. At this high burst length (and the corresponding high loss rate), the general trend that the PSNR increases with the burst length no longer holds. In fact, the likelihood that both paths experience packet losses simultaneously is quite high, which leads to a significant drop in the decoded video quality.

Figure 2.15: Loss characteristics vs. mobile speed for both MPT and SPT OPNET simulations: (a) Average packet loss rate; (b) Average error burst length. [Two plots versus speed (meters per second), one curve per substream for MPT and SPT.]

These observations further verify our previous observation that mobility is both harm-

ful and helpful. During the initial increase in mobility, routes break down more fre-

quently, which leads to an increase in the mean packet loss rate and a drop in the PSNR. As speed further increases, new topologies are more quickly formed and new routes are more quickly established. A hot spot in the region, where nodes cluster and compete for the channel, is more quickly dispersed. As speed increases, the period of time a node remains disconnected is smaller. The turning point (4m/s in Fig.2.15 and Fig.2.16) is determined by the node density in the region and the transmission range. We conjecture that a similar phenomenon exists for other scenarios with a different number of nodes or a different transmission range, given that the node density is high enough to maintain a connected network for most of the time. When the nodal speed increases even further, the routing process would be unable to track the quickly changing topology. Therefore, a drop in the average PSNR is expected.

Figure 2.16: The average PSNR vs. node speed for the MDMC scheme from the OPNET simulations.


2.6 An Ad Hoc Multipath Video Testbed

To validate the feasibility of video transport over ad hoc networks and evaluate the

achievable video quality with today’s technology, we implemented an ad hoc multipath

video streaming testbed. The implementation and experimental results are reported

in the following.

2.6.1 The Setup of the Testbed

The testbed is implemented using four IBM Thinkpad notebooks with IEEE 802.11b

cards. Fig.2.17 shows the network view of the testbed. The notebooks were placed

at (or moved around) various locations in the Library/CATT building at Polytechnic

University. IBM High Rate Wireless LAN cards are used, operating in the DCF mode at a channel rate of 11 Mbps. The corridors span an area of about 30 m × 60 m. In the

building, there is interference from IEEE 802.11 access points (AP) and other elec-

tronic devices (e.g., microwave ovens). Nodes S and D are, respectively, the video

source and sink, while nodes R1 and R2 are the relays. Since there are only four nodes

in this network, we use static routing to force the use of two-hop routes. Dynamic

routing will be implemented in a future version.

Figure 2.17: Experiment scenarios for the testbed: (a) Line-of-sight; (b) Behind the walls.


The system is built on Microsoft Windows 2000. We implemented the times-

tamping, sequence numbering, and QoS feedback functions in the application layer.

For LC with ARQ, limited retransmission for BL is implemented in the application

layer as well. UDP sockets are used at the transport layer. A traffic allocator dy-

namically allocates packets to the two paths. The video sink maintains a playout

buffer, using a hash table data structure. Typically video streaming applications use

a playout buffer of a few seconds to smooth the jitter in incoming packets. We chose

a playout buffer of 2 seconds for this network. To support interactive applications,

we also experimented with a 300ms playout delay.
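For illustration, the core of such a deadline-driven playout buffer can be sketched as follows. This is only an illustrative sketch, not the actual testbed code; the class and method names, the packet representation, and the use of wall-clock timestamps are assumptions made for the example:

    import time

    class PlayoutBuffer:
        """Deadline-driven playout buffer keyed by sequence number (illustrative sketch)."""

        def __init__(self, playout_delay_s=2.0):
            self.playout_delay = playout_delay_s   # e.g., 2 s for streaming, 0.3 s for interactive use
            self.packets = {}                      # hash table: sequence number -> (send_ts, payload)

        def insert(self, seq_no, send_ts, payload):
            # A packet that has already missed its deadline is dropped on arrival.
            if time.time() - send_ts <= self.playout_delay:
                self.packets[seq_no] = (send_ts, payload)

        def extract_frame(self, seq_range):
            """Pull the packets of the next frame; missing ones are left to error concealment."""
            frame = []
            for seq_no in seq_range:
                entry = self.packets.pop(seq_no, None)
                if entry is not None:
                    frame.append(entry[1])
            return frame

In such a design, the decoder polls extract_frame() once per frame interval, which mirrors the 2-second (or 300 ms) playout delay used in the experiments.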

The implementations of LC with ARQ and MDMC codec discussed in Sec-

tion 2.4.1 are used. We did not use the RPS codec since it does not support streaming

of pre-encoded video. For both schemes used, the testbed performs off-line encoding,

but the received video frames are decoded in real time and displayed on the screen of

node D.

2.6.2 Experimental Results

We examined the performance of the system in the scenarios shown in Fig.2.17.

The average PSNRs of the received frames using the two schemes are presented in

Table 2.2 and Table 2.3, respectively. For comparison with the Markov simulation

studies presented in Section 2.4, the last row of each table lists the average PSNRs

obtained from Fig.2.4-Fig.2.7. Each testbed value in the tables is the average over an

experiment lasting for 10 to 15 minutes. In all the tests using LC with ARQ, path 1

is used for the BL.

The results in Table 2.2 are consistent with the Markov simulation results

presented in Section 2.4.3. The minor difference between the last two rows of Table

2.2 is caused by the differences in the actual loss rates of the testbed experiments

and the corresponding Markov simulations, and the differences in the loss patterns

(the experimental loss processes may not be Markovian). An interesting observation

is that for the test scenario in Fig.2.17(a), when the loss rates for both substreams are very low, MDMC has a higher average PSNR (33.11 dB) than LC with ARQ (32.24 dB).


Table 2.2: Average PSNRs of Decoded Frames: MDMC Testbed Experiments and Markov Simulations

    Scenario               Fig.2.17(a)   Fig.2.17(b)   Fig.2.17(b)   Fig.2.17(b)
    Playout delay          2 s           2 s           2 s           300 ms
    Pkt loss rate 1        0.41%         6.14%         8.46%         8.13%
    Pkt loss rate 2        0.75%         11.96%        7.52%         7.97%
    Burst length 1         1.75          3.79          3.67          6.08
    Burst length 2         4.76          3.08          3.33          2.34
    Ave. PSNR (testbed)    33.11 dB      27.53 dB      28.65 dB      28.16 dB
    Ave. PSNR (sim)        33.08 dB      28.05 dB      28.65 dB      28.65 dB

This is also shown in Fig.2.5 and Fig.2.6.

The results in Table 2.3 clearly show that ARQ effectively reduces the BL

packet loss rate in all the experiments. These reductions account for improved video

quality. For example, all lost BL packets in the test of Fig.2.17(a) were successfully

recovered, resulting in a BL loss rate of 0%. The average PSNR for this test is

the highest among all the LC with ARQ experiments. However, this is not true

for the mean burst length. Since each BL packet has its deadline imposed by the

playout delay, ARQ is more effective in recovering short error bursts. During an

experiment, many short error bursts are recovered, reducing the number of error

bursts. But for an error burst comparable to or longer than the playout delay, only

a portion of it is successfully retransmitted. Therefore, sometimes the mean burst

length of BL increases when ARQ is used. During the experiments with LC with

ARQ, we observed short periods of badly corrupted frames, followed by a long period

of high quality frames. Recall in Fig.2.7, we showed that given the same average

loss rate, PSNR improves when the mean burst length increases. The increased

mean burst length, combined with reduced average BL loss rate, contributes to the

increased video quality. The ARQ success ratio in the table is defined as the ratio of the number of successfully retransmitted BL packets to the total number of lost BL packets. This ratio decreases as the loss rates of both paths increase and as the playout delay decreases.


Table 2.3: Average PSNRs of Decoded Frames: LC with ARQ Testbed Experiments and Markov Simulations

    Scenario               Fig.2.17(a)   Fig.2.17(b)   Fig.2.17(b)   Fig.2.17(b)
    Playout delay          2 s           2 s           2 s           300 ms
    Ori. BL loss           0.06%         5.95%         7.98%         7.94%
    BL loss                0.00%         2.49%         2.25%         5.37%
    EL loss                0.38%         12.22%        8.14%         8.16%
    Ori. BL burst len.     3.64          4.80          3.94          4.25
    BL burst len.          0             10.23         6.33          8.58
    EL burst len.          3.29          4.18          3.94          3.16
    ARQ succ. ratio        100%          58.0%         71.8%         32.4%
    Ave. PSNR (testbed)    32.34 dB      30.64 dB      30.14 dB      30.13 dB
    Ave. PSNR (sim)        N/A           31.10 dB      31.68 dB      31.68 dB

For the experiments reported here, the packet delay and delay jitter are

both very low, because there is low background traffic and the bit rates of the video

substreams are also low. For this reason, very few packets are dropped because of lateness, even with a 300 ms playout delay, and the video quality obtained with a 300 ms playout delay is similar to that with a 2 s playout delay, for both schemes.

When the system load is higher, the 300ms playout delay experiment is likely to yield

worse performance.

In the experiments, LC with ARQ performs better than MDMC, except for

the very low loss cases, which is consistent with the Markov model simulation results in

Fig.2.4-Fig.2.7. However, as compared to the results in Fig.2.4-Fig.2.7, LC with ARQ

yields lower average PSNR under similar loss rates. In fact, a successful retransmission

requires (1) successful and timely delivery of the NACK, and (2) successful delivery

of the retransmitted BL packet before its deadline. Recall that in the Markov model

simulations, we assume a perfect feedback channel and the paths were chosen to be

disjoint. These assumptions ensure a high ARQ success ratio. Additionally, in the


testbed, NACKs are sent on the same two paths as the video packets. The end-to-

end delay is not fixed, unlike in the Markov model simulations. These factors further reduce the

degree of path diversity and the effectiveness of the ARQ algorithm. For MDMC,

no feedback is necessary. As a result, its performance is more consistent with the

Markov simulation results in Fig.2.4-Fig.2.7.

Figure 2.18 is a screenshot of the MDMC testbed during an experiment. The

upper left part of the GUI displays the received video and the network view of the

testbed. The transport related statistics (loss rates, jitter, receiver buffer occupancy,

etc.) and the video codec related attributes (frame rate, format, bit rates, etc.) are

displayed at the upper right part. The two windows in the center display the packet

loss traces of the two substreams for each frame. The lower part is the PSNR trace

of the received video, which illustrates how video quality is impaired by packet losses

of both substreams, and how the MDMC decoder recovers from the packet losses.

2.7 Summary

Enabling video transport over ad hoc networks is more challenging than over other wireless networks, both because ad hoc paths are highly unstable and because compressed video is susceptible to transmission errors. However, multiple paths in an ad hoc net-

work can be exploited as an effective means to combat transmission errors. Motivated

by this observation, we chose three representative video coding schemes, all based on

MCP used in modern video coding standards, and showed how to adapt these schemes with MPT to enable video transport over ad hoc networks.

In this chapter, we highlight the close interaction between the video codec

and the transport layer. Results presented suggest that if a feedback channel can be

set up, the standard H.263 coder with its RPS option can work quite well. Addition-

ally, if delay caused by one retransmission is acceptable, layered coding with the BL

protected by ARQ is more suitable. MDC is the best choice when feedback channels

cannot be set up, or when the loss rates on the paths are not too high. A comparison

of the schemes from the perspective of video coding, e.g., frame memory required

at the coder and decoder, source coding redundancy, and on-line/off-line coding, is


Figure 2.18: A screenshot of the testbed GUI during an MDMC experiment.

presented as well. Using OPNET models, we extended DSR to support multipath

routing. The impact of multipath routing and mobility is investigated.

To further verify the feasibility of video transport over ad hoc networks,

we implemented an ad hoc multipath video streaming testbed using four notebook

computers. Our tests show that acceptable quality streaming video is achievable with

both LC with ARQ and MDMC, in the range of video bit rate, background traffic,

and motion speed examined. Together with simulation results using the Markov

path model and the OPNET models, our studies demonstrate the viability of video

transport over ad hoc networks using multipath transport and multistream coding.


Chapter 3

Optimal Traffic Partitioning

3.1 Introduction

In Chapter 2, we demonstrated how to support video transport using multiple paths

under the setting of a wireless mobile ad hoc network. Among the four basic com-

ponents in the architecture shown in Fig. 1.1, we have examined multistream video

coding and multipath routing. In this chapter, we present an analysis of the remain-

ing two important components in the architecture, i.e., the traffic allocator and the

resequencing buffer.

On the sender side, the traffic allocator is responsible for partitioning the

application data, i.e., dispatching application data packets to the multiple paths

maintained by an underlying multipath routing protocol. Using the traffic flow(s)

from the application, a set of paths to use, and the QoS parameters associated with

each path as inputs, the traffic allocator should decide which path to assign the next

packet to. The traffic partitioning strategy is affected by a number of factors, such as

the auto-correlation structure of the application data flow, the QoS parameters of the

paths (e.g., bandwidth, delay, and loss characteristics of each path), and the number

of available paths. Usually the path parameters can be inferred from feedback, so

that the traffic allocator can adjust its strategy to adapt to the changes in the network.

On the receiver side, incoming packets are put in a resequencing buffer (or

a playout buffer in realtime streaming applications) in order to restore their original

order. Packets may arrive at the receiver out-of-order because the paths they traverse


may have different delays when MPT is used. However, even when a single path is

used, packets may still be out of order since when link failure occurs, packets will

be routed to a new path with a different delay. In reliable transport protocols for

data traffic (e.g., TCP), a packet may stay in the resequencing buffer for a long time

waiting for a missing packet with a smaller sequence number. In realtime multime-

dia applications, the resequencing buffer is mainly used to absorb jitter in arriving

packets. Since the receiver displays the video or audio continuously, each packet is

associated with a deadline Dl, which is the difference between the time when it is

extracted from the resequencing buffer to be decoded and played out, and the time

when it is transmitted by the sender. In such applications, a packet will only stay

in the resequencing buffer for at most Dl seconds, and a packet may be lost either

because of transmission errors (e.g., buffer overflow or link failures) or because it ex-

periences a delay larger than the deadline. If there are some packets missing when

a frame is being decoded, the decoder may apply error concealment to reduce the

damage in the video quality caused by the lost packets.

The traffic partitioning and the resequencing buffer design are closely related

to each other. Given a set of paths, these two components determine the total end-

to-end delay of the session. Although for realtime multimedia services, the ultimate

goal is to minimize the distortion of received multimedia data, we focus on minimizing

the end-to-end delay in this chapter. This requires that traffic partitioning be adaptive

to the path parameters. In the RPS and MDMC schemes examined in Chapter 2,

we applied a simple round-robin type scheme for traffic allocation, with a granularity of one video frame.

In this chapter, we investigate the optimal traffic partitioning problem using

the network calculus theory in a deterministic setting. We model the bottleneck link

of each path as a queue with a deterministic service rate. The contribution of all other

links and the propagation delay are lumped into a fixed delay element. Moreover, we

assume the source flow is regulated by a {σ, ρ} leaky bucket, and then use determin-

istic traffic partitioning to split the traffic into multiple flows, each conforming to a

{σi, ρi} regulator. Under these assumptions, we formulate a constrained optimization

problem on minimizing the total end-to-end delay. The simplicity of the model re-


sults in a neat formulation. We derive the closed-form solution which provides simple

guidelines on minimizing end-to-end delay and path selection. The path set chosen

using our analysis is optimal in the sense that it is the minimum set of paths required

to achieve the minimum delay. Adding any rejected path to the set will only increase

the total end-to-end delay. This path selection scheme is useful since although it is

always desirable to use a path with a higher bandwidth and a lower fixed delay, it is

impossible to order the paths consistently according to their bandwidth or fixed delay

in many cases. A brute force optimization testing all the feasible combinations of

the paths would have exponential complexity [62]. With our results, the performance

metrics are translated to the end-to-end delay and the set of paths can be easily

determined with a complexity of O(N), where N is the number of paths available.

Thus this algorithm is suitable for the cases where the paths are highly dynamic.

The exact optimal partitioning, rather than a heuristic, can be quickly computed and

applied to a snapshot of the varying network. We also present a design to enforce the

optimal partitioning using a number of cascaded leaky buckets, one for each path.

The rest of this chapter is organized as follows. The system model and

the problem statement are presented in Section 3.2. We present the analysis of a

double-path system in Section 3.3, and then extend it to the case of multiple paths

in Section 3.4. Next, we discuss implementation related issues in Section 3.5, and

present some numerical results in Section 3.6 to illustrate the benefit of using the

schemes. The related work is discussed in Section 3.7. Section 3.8 concludes the

chapter.

3.2 Problem Formulation

For simplicity in presentation, we will first consider a realtime multimedia session

using only two paths. The corresponding traffic partitioning model is shown in Fig-

ure 3.1. Let the accumulated realtime traffic in [0, t) be denoted as A(t). We assume

A(t) is regulated by a {σ, ρ} leaky bucket, which implies a deterministic envelope


process as defined in [126] and [130]:

    Â(t) = ρt + σ,    (3.1)

where ρ is the long-term average rate of the process, called the rate factor of Â(t), and σ is the maximum burst size, called the burst factor of Â(t). The source traffic stream is then partitioned using a deterministic scheme, as illustrated in Fig. 3.2. With this scheme, the source flow is divided into two substreams, each of which conforms to an envelope process

    Âi(t) = ρit + σi,  i = 1, 2.    (3.2)

We have the further constraint Â1(t) + Â2(t) = Â(t), which gives ρ1 + ρ2 = ρ and

σ1 + σ2 = σ. We will discuss the implementation of such a deterministic partitioning

in Section 3.5.

Figure 3.1: A traffic partitioning model with two paths.

Figure 3.2: A deterministic traffic partitioning scheme.

We model the bottleneck link of each path as a work conserving queueing

system with a constant service rate ci, i = 1, 2. This model is justified by a re-

cent work [56], which shows that when an upstream queue serves a large number of


regulated traffic sources, the queue-length distribution of the downstream queue con-

verges almost surely to the queue length distribution of a simplified queueing system

obtained by removing the upstream queue. To have a stable system, the aggregate

service rate c = c1+c2 should be larger than the mean rate of the data flow, i.e., c > ρ,

if σ > 0. We also assume σ > 0 and ρ ≥ ci, i = 1, 2 to exclude the trivial case where

the flow can be assigned to a path without partitioning. In order to have a stable

queue in each path, the partitioned streams should satisfy ρi < ci if σi > 0, i = 1, 2.¹

The queueing delay at the bottleneck link of path i is denoted as di, i = 1, 2. The

contribution of all the other links along the path, including the propagation delay, is

represented by a fixed delay fi, i = 1, 2. Thus, the delay along path i, Di, is the sum

of the queueing delay and the fixed delay, i.e.,

Di = di + fi, i = 1, 2. (3.3)

The parameters of the paths may not be constant because of congestion in

the network. Moreover, when a path is broken, a replacement path k may have a

different ck or fk. We assume that ci and fi change on a relatively large timescale,

i = 1, 2. Therefore we can compute the optimal partitioning for a snapshot of the

network, and update the optimal partitioning as network conditions change over time.

At the receiver side, the two substreams are combined in a resequencing

buffer with a finite size B. Then the restored stream is extracted from the buffer and

sent to the application for decoding. Note that the server of the resequencing buffer

is not work conserving. It polls the queue at a fixed rate (e.g., frame rate) for the

packets belonging to the next frame. If the packets are found in the buffer, they are served at a rate of cd = frame rate × frame size; otherwise, cd = 0.

The total end-to-end delay Dl is jointly determined by the traffic partition-

ing strategy and the path parameters. We present an optimal traffic partitioning

scheme, using the path parameters as inputs. The end-to-end delay is minimized

under the condition that no loss is allowed. We formulate the problem of optimal

traffic partitioning as a constrained optimization problem and derive its closed form

solution. We examine a two-path session in Section 3.3, and will extend the analysis

¹ If σi = 0, it is possible to set ρi = ci, i = 1, 2, resulting in a zero queueing delay in path i.


to a multiple-path session in Section 3.4. The practical implications of this work are presented in Section 3.5. For convenience, the notation used in the following analysis

is given in Table 3.1.

3.3 Optimal Partitioning with Two Paths

We make no assumption on the service discipline in Subsection 3.3.1, and will derive

a tighter end-to-end delay bound assuming First Come First Serve (FCFS) service

discipline in Subsection 3.3.2. We define the following:

1. Stability conditions: These are the conditions required to get a stable system,

i.e., ρ < c, ρ1 < c1, and ρ2 < c2. A special case of the stability conditions is that

when ρi = ci and σi = 0, the system is still stable, with zero queueing delay.

2. Feasibility conditions: These are the natural constraints on a partition, i.e.,

0 ≤ σi ≤ σ, and 0 < ρi ≤ ρ, i = 1, 2.

We call a partition of the data flow feasible if both the stability conditions and the

feasibility conditions are satisfied.
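In code form, checking a candidate partition against these conditions amounts to a few comparisons. The following sketch (with variable names of our own choosing, and written for an arbitrary number of paths) is an illustration only:

    def is_feasible(sigma_parts, rho_parts, sigma, rho, capacities, tol=1e-9):
        """Check the stability and feasibility conditions for a candidate partition."""
        # Feasibility: the parts must add up to the original envelope and be non-negative.
        if abs(sum(sigma_parts) - sigma) > tol or abs(sum(rho_parts) - rho) > tol:
            return False
        if any(s < -tol or s > sigma + tol for s in sigma_parts):
            return False
        if any(r <= 0 or r > rho + tol for r in rho_parts):
            return False
        # Stability: rho_i < c_i whenever sigma_i > 0; rho_i = c_i is allowed only if sigma_i = 0.
        for s, r, c in zip(sigma_parts, rho_parts, capacities):
            if s > tol and r >= c:
                return False
            if r > c + tol:
                return False
        return True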

3.3.1 Optimal Partitioning Based on the Busy Period Bound

Consider a work conserving queue with capacity c. Its input conforms to an envelope process Â(t). If the queue is stable, then the queueing delay is upper bounded by the maximum busy period of the system [126][130]:

    d ≜ inf{t ≥ 0 : Â(t) − ct ≤ 0}.    (3.4)

Substituting (3.1) into (3.4), the busy period bound is found to be:

    d = inf{t ≥ 0 : ρt + σ − ct ≤ 0} = σ/(c − ρ).    (3.5)


Table 3.1: Definition of the Variables Used in the Analysis

    A(t):       Accumulated traffic of the data flow.
    Â(t):       Envelope process of the data flow.
    N:          Total number of the paths in this session.
    σ:          Burst factor of the envelope process.
    ρ:          Rate factor of the envelope process.
    σi:         Burst assigned to path i.
    ρi:         Rate assigned to path i.
    ci:         Capacity of the path i bottleneck queue.
    c:          Aggregate capacity of all the paths.
    fi:         Fixed delay of path i.
    di:         Queueing delay of path i.
    Di:         Total delay of path i.
    D̃i:         Total delay of path i obtained using (3.15).
    Dl:         Deadline, or the total end-to-end delay.
    B:          Resequencing buffer size.
    cd:         Service rate of the resequencing buffer.
    D1^min:     Minimum of D1(σ) in the double-path analysis.
    σ∗i:        Optimal burst assignment of path i.
    ρ∗i:        Optimal rate assignment of path i.
    D∗l:        Minimum end-to-end delay.
    D̃∗l:        Minimum end-to-end delay obtained using (3.15).
    σ^k_th:     The kth threshold that partitions the parameter space of σ.
    ρ^k_th:     The kth threshold that partitions the parameter space of ρ.


The delay on path i is upper bounded by Di = di + fi, i = 1, 2. Consider

two back-to-back, tagged bits, b1 and b2, belonging to the same multimedia frame.

If b1 is transmitted on path 1 and b2 on path 2 at time t, then b1 will arrive at

the resequencing buffer during the time interval (t, t+ D1], and b2 will arrive at the

resequencing buffer during the time interval (t, t+D2], as illustrated in Fig 3.3. When

both bits arrive (as well as all other bits in the same frame), they can be extracted

from the buffer for decoding and display. Thus, Dl = max{D1, D2} upper bounds

the end-to-end delay. If no loss is allowed, the end-to-end delay, Dl, including the

queueing delay at the bottleneck, the fixed delay, and the resequencing delay, is

Dl = max{D1, D2}. (3.6)

Figure 3.3: Determining the end-to-end delay Dl.

Fact III.1: If a partition achieves Dl = max{f1, f2}, it is optimal.

Proof: From (3.6), the minimum end-to-end delay D∗l = min_{σi,ρi} max{D1, D2} ≥ max{f1, f2}, since D1 ≥ f1 and D2 ≥ f2.

Remark Intuitively, a delay equal to the fixed delay cannot be improved by traffic

partitioning if both paths are used. We use this fact to decide if a partition is optimal

in the following analysis.

From (3.6), we can formulate the following constrained optimization problem

on minimizing the end-to-end delay.

    Min: Dl = max{D1, D2} = max{ σ1/(c1 − ρ1) + f1, σ2/(c2 − ρ2) + f2 }    (3.7)


    subject to:
        σ1 + σ2 = σ,
        ρ1 + ρ2 = ρ,
        σi ≥ 0, i = 1, 2,
        0 < ρi ≤ ci, i = 1, 2,
        σi = 0 if ρi = ci.    (3.8)

This is a non-linear optimization problem with a set of linear constraints. The entire

feasible region is divided into two subspaces by the surface D1 = D2. D1 is dominant

in one subspace, while D2 is dominant in the other. However, the special structure

of the delay bound Di allows us to solve the problem relatively easily.

According to Fact A.2, the delay terms in (3.7) increase monotonically with

σi and ρi, i = 1, 2, respectively. Also observe that the feasible region is a polytope

(i.e., a solid bounded by polygons), since the constraints are linear equations or

inequalities. Within this feasible region,

    ∇d1 = [∂d1/∂ρ1, ∂d1/∂σ1] = [σ1/(c1 − ρ1)², 1/(c1 − ρ1)] ≠ 0,    (3.9)

    ∇d2 = [∂d2/∂ρ2, ∂d2/∂σ2] = [σ2/(c2 − ρ2)², 1/(c2 − ρ2)] ≠ 0.    (3.10)

Thus the minimum must occur on one of the boundaries or vertices of the feasible

region [55]. The solution to (3.7) and (3.8) is summarized in the following theorem.

Theorem 1: The optimal partition and the achieved minimum end-to-end delay are:

(1) If σ > (c − ρ)|f2 − f1|, then D∗l = σ/(c − ρ) + min{f1, f2}, and the optimal partition is

        {σ∗1, σ∗2} = {σ, 0},
        {ρ∗1, ρ∗2} = {ρ − c2, c2}.    (3.11)

(2) If σ ≤ (c − ρ)|f2 − f1|, then D∗l = max{f1, f2}, and the optimal partition is

        {σ∗1, σ∗2} = {σ, 0},
        ρ∗1 = ρ − ρ∗2,  ρ∗2 ∈ [ρ − c1 + σ/|f2 − f1|, c2].    (3.12)


(3) If f1 = f2 = f, the optimal partition is

        {σ∗1, σ∗2} = {((c1 − ρ∗1)/(c − ρ))σ, ((c2 − ρ∗2)/(c − ρ))σ},
        ρ − c2 < ρ∗1 < c1,  ρ∗2 = ρ − ρ∗1.    (3.13)

    The minimum delay is D∗l = σ/(c − ρ) + f, which is identical to that obtained from a single-path session with the same {σ, ρ} flow, a fixed delay f, and the service rate c = c1 + c2.

Proof : See Appendix A.

Remark When the paths have different fixed delays, the optimal partitioning strategy is to assign all the burst to the path with the smaller fixed delay and to assign a rate that saturates the path with the larger fixed delay. When the two paths have the same fixed delay, they behave like a single path with the combined capacity.
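The case analysis of Theorem 1 translates directly into a small procedure. The sketch below is an illustration under the assumptions of this section (in particular, path 1 is taken to be the path with the smaller fixed delay, ρ ≥ ci and ρ < c); the function name and the particular choice made within the non-unique cases are ours:

    def theorem1_partition(sigma, rho, c1, c2, f1, f2):
        """Optimal two-path partition under the busy-period delay bound (Theorem 1).

        Assumes path 1 has the smaller fixed delay (relabel inputs otherwise) and the
        stability/feasibility assumptions of Section 3.2.
        Returns (minimum delay, (sigma1, sigma2), (rho1, rho2)).
        """
        c = c1 + c2
        if f1 == f2:
            # Case (3): equal fixed delays; the two paths act as one path of capacity c.
            rho1 = rho * c1 / c                  # any rho1 in (rho - c2, c1) is optimal
            sig1 = (c1 - rho1) / (c - rho) * sigma
            return sigma / (c - rho) + f1, (sig1, sigma - sig1), (rho1, rho - rho1)
        if sigma > (c - rho) * abs(f2 - f1):
            # Case (1): all burst on path 1, path 2 saturated at its capacity.
            return sigma / (c - rho) + min(f1, f2), (sigma, 0.0), (rho - c2, c2)
        # Case (2): delay limited by the larger fixed delay; rho2 may lie in a range,
        # here we simply take its upper end rho2 = c2.
        return max(f1, f2), (sigma, 0.0), (rho - c2, c2)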

3.3.2 The Optimal Partitioning with FCFS Queues

In the previous subsection, we derived the optimal traffic partition for a two path

session using the system busy period bound (3.4) from [130]. Since all traffic will

be cleared after a busy period, clearly the queueing delay is upper bounded by the

system busy period no matter which service discipline is used. If the service discipline

is FCFS, the delay bound can be further tightened as [151]:

    d̃ = sup_{t≥0} inf{τ ≥ 0 : Â(t) ≤ c(t + τ)}.    (3.14)

Fig. 3.4 illustrates the envelope process and the cumulative service of the queueing

system. If the service discipline is FCFS, the delay of a traffic unit arriving at time t

is bounded by the τ such that Â(t) = c(t + τ) (e.g., segment EF in Fig. 3.4). Substituting (3.1) into (3.14), we have:

    d̃ = σ/c.    (3.15)

This bound is tighter than the system busy period bound (3.5). In addition, it is only

a function of the burst factor σ, i.e., the rate factor ρ has no impact on the queueing


Figure 3.4: Illustration of a tighter delay bound.

delay. We can exploit this fact to improve the minimum delay derived in the previous

section and to simplify the analysis.

Consider the same two-path model in Fig. 3.1. From (3.6) and (3.15), we

can formulate the following constrained optimization problem:

    Min: D̃l = max{D̃1, D̃2} = max{ σ1/c1 + f1, σ2/c2 + f2 }    (3.16)

    subject to:
        σ1 + σ2 = σ,
        ρ1 + ρ2 = ρ,
        σi ≥ 0, i = 1, 2,
        0 < ρi ≤ ci, i = 1, 2,
        σi = 0 if ρi = ci.    (3.17)

The solution to (3.16) and (3.17) is summarized in the following theorem.

Theorem 2: The optimal partition and the minimum end-to-end delay achieved using the tighter delay bound, d̃, are:

(1) If σ > c1|f2 − f1|, then D̃∗l = (1/c)(σ + c1f1 + c2f2), and the optimal partition is:

        {σ∗1, σ∗2} = {(c1/c)[σ + c2(f2 − f1)], (c2/c)[σ + c1(f1 − f2)]}.    (3.18)

(2) If σ ≤ c1|f2 − f1|, then D̃∗l = max{f1, f2}, and the optimal partition is:

        {σ∗1, σ∗2} = {σ, 0}.    (3.19)


(3) If f1 = f2 = f, then D̃∗l = σ/c + f, which is identical to that obtained from a single-path session with the same {σ, ρ} source, a fixed delay f, and a bandwidth c = c1 + c2. The optimal partition is σ∗i = (ci/c)σ, i = 1, 2.

(4) Any partition of ρ satisfying the stability conditions and the feasibility conditions can be used to achieve the above minimum end-to-end delay.

Proof : See Appendix B.
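Theorem 2 can be transcribed in the same way. The following sketch (again with our own naming, assuming path 1 has the smaller fixed delay, and returning only the burst split since any feasible rate split is optimal by part (4)) illustrates the computation:

    def theorem2_partition(sigma, c1, c2, f1, f2):
        """Optimal two-path burst split under the tighter FCFS bound (Theorem 2).

        Assumes path 1 has the smaller fixed delay (relabel inputs otherwise).
        Returns (minimum delay, (sigma1, sigma2)).
        """
        c = c1 + c2
        if f1 == f2:
            # Case (3): proportional split; behaves like one path of capacity c.
            return sigma / c + f1, (c1 / c * sigma, c2 / c * sigma)
        if sigma > c1 * abs(f2 - f1):
            # Case (1): the burst is split so that both path delay bounds are equalized.
            sig1 = c1 / c * (sigma + c2 * (f2 - f1))
            sig2 = c2 / c * (sigma + c1 * (f1 - f2))
            return (sigma + c1 * f1 + c2 * f2) / c, (sig1, sig2)
        # Case (2): the whole burst fits on the path with the smaller fixed delay.
        return max(f1, f2), (sigma, 0.0)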

As illustrated in Fig. 3.5, we can divide the parameter space into three sections I1, I2, and I3 with two threshold values σ^L_th = (c − ρ)|f2 − f1| and σ^H_th = c1|f2 − f1|, so that σ^H_th − σ^L_th = (ρ − c2)|f2 − f1| > 0. From Theorem 1 and Theorem 2, D∗l = D̃∗l = max{f1, f2} in section I1. In section I2, D̃∗l = max{f1, f2} < D∗l = σ/(c − ρ) + min{f1, f2}. In section I3,

    D∗l − D̃∗l = σ/(c − ρ) + f1 − (1/c)(σ + c1f1 + c2f2) = (c2/c)(f1 − f2) + ρσ/(c(c − ρ)) > 0,

where the last inequality holds since σ > c1(f2 − f1) and ρ ≥ c2. Thus both bounds are optimal in I1, and D̃∗l is tighter than D∗l in I2 and I3.


Figure 3.5: Three regions determined by the system parameters.

3.4 The Optimal Partitioning with Multiple Paths

In this section, we extend the optimal partitioning analysis in the previous section to

the case of multiple paths. We use the tighter delay bound (3.15) for FCFS queues in


the following analysis. For any set of paths P′i with parameters {c′i, f′i}, i = 1, · · · , M, we first do the following:

1. Sort and relabel the paths according to their fixed delays f′i in non-decreasing

order.

2. If paths P′i, P′i+1, · · ·, P′i+k−1 have the same fixed delay, i.e., f′i = f′i+1 = · · · = f′i+k−1, we can lump these paths into a new path i with fi = f′i and ci = c′i + c′i+1 + · · · + c′i+k−1, according to Theorem 2.

3. Relabel the paths again. Then we get a new set of paths Pi with parameters {ci, fi}, i = 1, · · · , N, and f1 < f2 < · · · < fN.

In the following, we first determine the optimal partitioning scheme for the paths Pi, i = 1, · · · , N. Then we can further partition the assignments σ∗i and ρ∗i of a lumped path i among the k original paths P′i, P′i+1, · · ·, P′i+k−1 with the same fixed delay f′i, using Theorem 2.
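The preprocessing steps 1-3 can be sketched as follows (an illustration only; the representation of a path as a (capacity, fixed delay) pair and the function name are assumptions of the example):

    def preprocess_paths(paths):
        """Sort paths by fixed delay and lump paths with equal fixed delay (steps 1-3).

        `paths` is a list of (capacity, fixed_delay) pairs. Returns the merged list
        [(c_i, f_i)] with f_1 < f_2 < ... < f_N, plus a map from each merged index to
        the original path indices it absorbed (needed to split assignments back later).
        """
        order = sorted(range(len(paths)), key=lambda i: paths[i][1])   # step 1: sort by f'_i
        merged, members = [], []
        for idx in order:
            c, f = paths[idx]
            if merged and merged[-1][1] == f:                          # step 2: same fixed delay
                merged[-1] = (merged[-1][0] + c, f)                    # lump capacities
                members[-1].append(idx)
            else:
                merged.append((c, f))                                  # step 3: relabel implicitly
                members.append([idx])
        return merged, members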

The N-path traffic partitioning model is shown in Fig. 3.6, with parameters

{ci, fi}, i = 1, · · · , N , and f1 < f2 < · · · < fN . Given a traffic flow conforming

to the {σ, ρ} envelope process (3.1), we need to find the optimal partition {σ∗i, ρ∗i}, i = 1, · · · , N, that minimizes the end-to-end delay Dl.


Figure 3.6: A traffic partitioning model with N paths.

From (3.6) and (3.15), we can formulate the following linearly constrained

optimization problem, denoted as P(N, σ), for an N-path session:

    Min: D̃l = max{D̃1, D̃2, · · · , D̃N} = max{ σ1/c1 + f1, σ2/c2 + f2, · · · , σN/cN + fN }    (3.20)

    subject to:
        σ1 + σ2 + · · · + σN = σ,
        σi ≥ 0, i = 1, 2, · · · , N.    (3.21)

Theorem 3: Let σ^N_th = Σ_{i=1}^{N} ci(fN − fi). The solution to the N-path optimization problem P(N, σ) is given as follows:

(1) Case I: If σ > σ^N_th, the optimal partition that achieves the minimum end-to-end delay is: σ∗i = (ci/c)[σ + Σ_{j=1}^{N} cj(fj − fi)], i = 1, 2, · · · , N.

(2) Case II: If σ ≤ σ^N_th, the optimal assignment for path N is σ∗N = 0. The optimal assignments for the remaining paths are determined by applying this theorem recursively to P(N − 1, σ), i.e., a reduced problem of (3.20) and (3.21) with the remaining N − 1 paths and a burst of size σ.

Proof: We can set up the following system of linear equations:

    σ1/c1 − σ2/c2 = f2 − f1,
    σ2/c2 − σ3/c3 = f3 − f2,
    · · ·
    σ_{N−1}/c_{N−1} − σN/cN = fN − f_{N−1},
    σ1 + σ2 + · · · + σN = σ,    (3.22)

and discuss the feasibility of its solution:

    σ∗i = (ci/c){σ + Σ_{j=1}^{N} [cj(fj − fi)]},  i = 1, · · · , N,    (3.23)

with different system parameters. However, with the aid of an intuitive “water-filling”

model, it is possible to solve the optimization problem directly.

In Fig. 3.7 and Fig. 3.8, we model each path i as a bucket with an area of

cross section ci. In addition, each bucket i is pre-loaded with content cifi to a level

fi. If path i is assigned a burst σi, this is equivalent to filling σi units of fluid


into bucket i, resulting in a higher level of σi/ci + fi. Thus, the fluid level of bucket i

represents the delay on path i. With this model, the optimization problem P(N, σ)

is equivalent to filling σ units of fluid into the N buckets, while keeping the highest

level among all the buckets as low as possible.

Figure 3.7: Problem P(N, σ): The case of σ ≤ σ^N_th.

Figure 3.8: Problem P(N, σ): The case of σ > σ^N_th.

Consider Fig. 3.7. Assume each bucket has a finite depth fN, which is the highest pre-loaded level of the N buckets. Then the N buckets can hold at most σ^N_th = Σ_{i=1}^{N} ci(fN − fi) units of fluid without an overflow. Note that bucket N cannot hold any fluid since its level is already fN. Thus, if the burst of the data flow, or the amount of fluid, σ does not exceed σ^N_th, all the σ units of fluid can be distributed to the


first N − 1 buckets and none of the buckets has a level exceeding fN. Thus the optimal assignment for path N is σ∗N = 0, and D̃∗l ≤ fN. This is Case II of Theorem 3.

On the other hand, if σ > σ^N_th, σ units of fluid cannot be accommodated by

the buckets in Fig. 3.7. In this case, let each bucket have an infinite depth as shown

in Fig. 3.8. Thus an arbitrarily large σ can be held in these buckets. However, in

order to minimize the highest level, the σ units of fluid should be distributed to the

N buckets in such a manner that all the buckets have the same fluid level. If the

common fluid level is D̃∗l, the amount of fluid that bucket i holds is σ∗i = ci(D̃∗l − fi). Since the total amount of the fluid is σ, the minimum delay D̃∗l can be derived from

the following equation:

    c1(D̃∗l − f1) + c2(D̃∗l − f2) + · · · + cN(D̃∗l − fN) = σ.    (3.24)

Thus the minimum end-to-end delay is:

    D̃∗l = (1/c)(σ + Σ_{i=1}^{N} cifi).    (3.25)

The volume filled into bucket i, or the optimal burst assignment for path i, is:

    σ∗i = ci(D̃∗l − fi) = (ci/c)[σ + Σ_{j=1}^{N} cj(fj − fi)],    (3.26)

which is the solution of (3.22). This is Case I of Theorem 3.

So far for Case II, we have derived σ∗N = 0. In order to determine the op-

timal partition for the remaining N − 1 paths, we remove path N from (3.20) and

(3.21). Since σ∗N = 0, removing path N does not affect the constraints in (3.21). Con-

sequently, we obtain an (N − 1)-path problem with a burst σ, i.e., P(N − 1, σ). Define σ^{N−1}_th = Σ_{i=1}^{N−1} ci(fN−1 − fi). We can model the two cases of P(N − 1, σ) using the

same “water-filling” model as illustrated in Fig. 3.9 and Fig. 3.10. Repeat the above

analysis recursively. If the number of paths is reduced to 2, the two-path results in

Section 3.3.2 can be applied. Thus the optimal partition for all the paths can be deter-

mined.


Figure 3.9: Problem P(N − 1, σ): The case of σ ≤ σ^{N−1}_th.

Figure 3.10: Problem P(N − 1, σ): The case of σ > σ^{N−1}_th.

Remark According to (3.9) and (3.10), the minimum must occur either on a boundary

of the search space, or at one of the vertices. Indeed, each delay term in (3.20),

D̃i = σi/ci + fi, is a plane in the N-dimensional search space. In Case I of Theorem 3,

the minimum occurs on a boundary where all the planes intersect at a single point.

In Case II of Theorem 3, we remove a plane which always dominates all other planes.

Using such a plane will only increase the objective function.

The minimum end-to-end delay is jointly determined by the burst assign-

ments and the rate assignments. We first define the following quantities:

    ρ^k_th = Σ_{i=1}^{k} ci,  k = 2, 3, · · · , N,    (3.27)


    σ^k_th = Σ_{i=1}^{k} ci(fk − fi),  k = 2, 3, · · · , N.    (3.28)

Clearly ρ^i_th > ρ^j_th and σ^i_th > σ^j_th if i > j. These quantities partition the rate line and the burst line, respectively, as illustrated in Fig. 3.11. Let m be the index such that ρ^{m−1}_th ≤ ρ < ρ^m_th, and k be the index such that σ^k_th < σ ≤ σ^{k+1}_th. Then m is the highest

index of the minimum set of paths required to accommodate ρ in order to satisfy the

stability condition, and k is the highest index of the minimum set of paths required

to accommodate σ, according to Theorem 3. If m > k, then the minimum delay is

the fixed delay of path m. Otherwise, the minimum delay is the solution to P(k, σ)

(see (3.25)).

Figure 3.11: Computing the optimal partition.

Theorem 4: For the above defined m and k,

(1) If m > k, then D̃∗l = fm.

(2) If m ≤ k, then D̃∗l = (1/ρ^k_th)(σ + Σ_{i=1}^{k} cifi).

(3) The optimal burst assignments are:

        σ∗i = (ci/ρ^k_th)[σ + Σ_{j=1}^{k} cj(fj − fi)]  if i ≤ k,  and  σ∗i = 0 otherwise.    (3.29)

(4) The optimal rate assignments are:

        ρ∗i = (ci/ρ^m_th)ρ  if i ≤ m,  and  ρ∗i = 0 otherwise.    (3.30)
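Theorems 3 and 4 yield a simple procedure for computing the optimal partition. The following sketch is one possible transcription (the identifiers are ours; with prefix sums the thresholds can be computed in O(N), while the sketch recomputes them for clarity):

    def optimal_partition(sigma, rho, paths):
        """Optimal N-path partition of a {sigma, rho} flow (Theorems 3 and 4).

        `paths` is a list of (c_i, f_i) pairs already sorted with f_1 < f_2 < ... < f_N
        (see the preprocessing sketch in this section); an illustrative sketch only.
        Returns (minimum delay, burst assignments, rate assignments).
        """
        N = len(paths)
        caps = [c for c, _ in paths]
        fixed = [f for _, f in paths]

        # m: highest index needed to accommodate rho, i.e., rho^{m-1}_th <= rho < rho^m_th.
        m, agg = N, 0.0
        for i, c in enumerate(caps, start=1):
            agg += c
            if rho < agg:
                m = i
                break

        # k: highest index needed to accommodate sigma, i.e., sigma^k_th < sigma <= sigma^{k+1}_th.
        k = N
        for j in range(2, N + 1):
            sigma_th_j = sum(caps[i] * (fixed[j - 1] - fixed[i]) for i in range(j))
            if sigma <= sigma_th_j:
                k = j - 1
                break

        rho_k = sum(caps[:k])
        rho_m = sum(caps[:m])

        # Minimum end-to-end delay (Theorem 4, parts (1) and (2)).
        if m > k:
            delay = fixed[m - 1]
        else:
            delay = (sigma + sum(caps[i] * fixed[i] for i in range(k))) / rho_k

        # Optimal assignments (parts (3) and (4)): water-filling of the burst over the
        # first k paths and proportional sharing of the rate over the first m paths.
        bursts = [caps[i] / rho_k * (sigma + sum(caps[j] * (fixed[j] - fixed[i]) for j in range(k)))
                  if i < k else 0.0 for i in range(N)]
        rates = [caps[i] / rho_m * rho if i < m else 0.0 for i in range(N)]
        return delay, bursts, rates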


3.5 Practical Implications

In this section, we discuss the important implications of this work in practice and

present a design to enforce the optimal partition for an end-to-end application. The

design uses a set of leaky buckets, which have been implemented by many commercial

routers [63].

3.5.1 Optimal Path Selection

In many routing protocols, a path found may be associated with more than one

performance metric (e.g., each path has a fixed delay and a capacity in the case we

study). When multiple paths are used, it would be nice to sort the paths according

to their “goodness” and use them starting from the best ones. However, we may get

inconsistent orderings if we sort the paths according to different performance metrics.

For example, a path may have a higher bandwidth but a higher delay, while another

path may have a smaller bandwidth but a lower delay. This inconsistency makes it

very difficult to decide which paths to use. A brute force approach can examine every

feasible combination of the paths but with high computational complexity. Some

heuristics give preference to one performance metric over the other, and use the other

performance metric to break the tie if necessary [26]. Although such heuristics work

well in some cases, it is not clear if they work in all the cases since there is no

analysis supporting them. Theorem 4 shows that we can sort the paths consistently

by translating the two performance metrics (the fixed delay and the bandwidth) into

the total end-to-end delay, which then determines the minimum set of the paths to

be used. The computational complexity is O(N). The path selection is optimal, since

adding any unused path to this set will only increase the end-to-end delay.

3.5.2 Enforcing the Optimal Partition

When the optimal partition parameters, i.e., {σ∗i, ρ∗i}, i = 1, 2, · · · , N, are computed, the next question that arises is how to enforce them. The optimal partition can be enforced by using a leaky bucket regulator on each path. For a point-to-point appli-


cation (e.g., the one shown in Fig. 1.1), the sender is responsible for partitioning the

traffic flow. The leaky buckets and the module that computes the optimal partition

should be implemented at the sender side, as illustrated in Fig. 3.12. The multiple

leaky buckets are cascaded in a chain, while the source flow is fed into the first leaky

bucket with parameters {σ∗1, ρ∗1}. When a flow is regulated by a leaky bucket, usually

the conforming traffic is transmitted, while the nonconforming traffic (i.e., the part

exceeding the constraint of the envelope process) is either marked or dropped. In

our implementation, we simply redirect the nonconforming traffic to the next leaky

bucket, instead of dropping it. The conforming traffic flow of leaky bucket i is then

transmitted on path i, and so on. If h = max{m, k} < N, then the hth leaky bucket has no nonconforming traffic. Thus the remaining (h + 1)th, (h + 2)th, · · ·, and Nth leaky buckets are not used.

Figure 3.12: Implementation of the optimal traffic partitioning scheme.
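A minimal sketch of the cascaded regulators is given below. It is an illustration only: it uses a token-bucket realization of the {σi, ρi} regulator, and the class and function names are ours; for packetized traffic a bucket with σi = 0 would in practice still need a small depth (e.g., one packet).

    class LeakyBucket:
        """A {sigma, rho} regulator: conforming bytes pass, the excess spills onward."""

        def __init__(self, sigma, rho):
            self.sigma, self.rho = sigma, rho
            self.tokens, self.last = sigma, 0.0    # bucket depth sigma, refill rate rho

        def offer(self, nbytes, now):
            """Return (conforming, nonconforming) byte counts for an arrival of size nbytes."""
            self.tokens = min(self.sigma, self.tokens + self.rho * (now - self.last))
            self.last = now
            conforming = min(nbytes, self.tokens)
            self.tokens -= conforming
            return conforming, nbytes - conforming


    def partition_arrival(nbytes, now, buckets):
        """Cascade: path i carries what bucket i admits; the excess moves to bucket i+1."""
        per_path, residue = [], nbytes
        for bucket in buckets:
            ok, residue = bucket.offer(residue, now)
            per_path.append(ok)
            if residue == 0:
                break
        return per_path   # with the optimal {sigma_i*, rho_i*}, the last buckets stay unused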

This scheme works best when QoS support is provided in the network. For

example, if the resource reservation protocol (RSVP) [92] is supported, a source can

reserve the required bandwidth along each path, and a router or a switch can use the

Generalized Processor Sharing (GPS) scheduling to guarantee the reserved bandwidth

[93]. Today’s Internet is still best-effort. However, if the path conditions are varying

at a relatively large timescale, the receiver could estimate the path parameters, i.e.,


ci and fi, i = 1, 2, · · · , N , for a snapshot of the network and send the estimates back

to the source if necessary. For example, ci can be approximated by the ratio of the

amount of traffic received on flow i in the last time window to the duration of the

time window. Before or during data transmission, the sender may send probes on

each path to the receiver to estimate the fixed delays.
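Such estimators could be sketched as follows (illustrative assumptions of ours: the capacity estimate is simply the byte count over a measurement window, and the fixed delay is approximated by half of the minimum probe round-trip time, taken when probe queueing delay is negligible):

    def estimate_capacity(bytes_received_in_window, window_s):
        """Approximate c_i as bytes received on flow i in the last window over its duration."""
        return bytes_received_in_window / window_s

    def estimate_fixed_delay(probe_rtts_s):
        """Rough fixed-delay estimate for a path from probe round-trip times (assumption:
        half of the minimum RTT approximates the one-way fixed delay)."""
        return min(probe_rtts_s) / 2.0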

3.6 Numerical Results

In this section, we present some numerical results to illustrate the analysis in the

previous sections. In all the figures shown in this section, we vary σ and ρ and then

derive the optimal partition and the minimum delay for each {σ, ρ} pair.

Consider a two-path session with f1 = 1, f2 = 3, c1 = 2, and c2 = 1. The

minimum end-to-end delay is given in Fig. 3.13, and the optimal assignment for path

1 is plotted in Fig. 3.14, as derived using Theorem 1. It can be observed that the

delay plane has two regions. If σ < (c − ρ)(f2 − f1), the delay is a plane such that

D∗l = f2; otherwise, the delay is a plane such that D∗l = σ/(c − ρ) + f1. These two planes

intersect on the line σ = (c − ρ)(f2 − f1). For any {σ, ρ} pair, we always assign the

entire burst to path 1. Thus σ∗1 in Fig. 3.14 is the plane σ∗1 = σ and is not affected by

the values of ρ.
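Using the Theorem 1 and Theorem 2 sketches given earlier in this chapter, the two regimes of this example can be checked numerically (illustrative only; the function names are the ones assumed in those sketches):

    # Parameters of Fig. 3.13: f1 = 1, f2 = 3, c1 = 2, c2 = 1, so c = 3.
    c1, c2, f1, f2 = 2.0, 1.0, 1.0, 3.0

    # Small burst: sigma = 1 <= (c - rho)(f2 - f1) = 2 for rho = 2, so D* = f2 = 3.
    print(theorem1_partition(sigma=1.0, rho=2.0, c1=c1, c2=c2, f1=f1, f2=f2)[0])   # 3.0

    # Large burst: sigma = 5 > 2, so D* = sigma/(c - rho) + f1 = 5/1 + 1 = 6.
    print(theorem1_partition(sigma=5.0, rho=2.0, c1=c1, c2=c2, f1=f1, f2=f2)[0])   # 6.0

    # The tighter FCFS bound (Fig. 3.15) does not depend on rho:
    print(theorem2_partition(sigma=5.0, c1=c1, c2=c2, f1=f1, f2=f2)[0])            # (5 + 2*1 + 1*3)/3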

In Fig. 3.15, we plot the minimum end-to-end delay derived from Theorem 2.

With the tighter delay bound, the rate factor has no impact on the delay. Therefore,

for a given σ, D̃∗l is constant for any ρ. As in Fig. 3.13, the minimum delay consists of two planes. However, when D̃∗l > f2, it grows linearly as σ increases, since D̃∗l = (1/c)(σ + c1f1 + c2f2). In order to compare the two minimum delays D∗l and D̃∗l, we plot the difference between these two delays derived from the same set of parameters in Fig. 3.16. It can be seen that D̃∗l is always equal to or smaller than D∗l. The difference between the two minimum delays increases as either σ or ρ increases. This can be verified by Fig. 3.4. For a given c, the difference between d and d̃ increases with σ and ρ, since d − d̃ = σρ/(c(c − ρ)).

Next, we consider a 5-path session. The fixed delays of the paths are: f1 = 1,

f2 = 2, f3 = 3, f4 = 4, f5 = 5. We examine two cases: (1) Case 1: we assume the


Figure 3.13: The minimum end-to-end delay, D∗l: two paths, f1 = 1, f2 = 3, c1 = 2, c2 = 1.

capacities of the paths are in an increasing order, with c1 = 1, c2 = 1.5, c3 = 2,

c4 = 2.5, c5 = 3, and (2) Case 2: we assume the capacities of the paths are in a

decreasing order, with c1 = 3, c2 = 2.5, c3 = 2, c4 = 1.5, c5 = 1. The minimum

end-to-end delays for both cases are plotted in Fig. 3.17 and Fig. 3.18, respectively.

Both minimum delays are step functions in the small-σ region, with step heights of f2, f3, f4, and f5, respectively. That is, the minimum end-to-end delay

increases when a new path with a larger fixed delay is added to the selected path set.

When σ is larger than σ^5_th, the minimum delay increases linearly with σ.

The optimal burst assignments for path 1 and path 5 in Case 1 are plotted

in Fig. 3.19 and Fig. 3.20, respectively. It can be seen that the burst assignments are

piece-wise linear and concave. However, path 5 is only assigned a burst when the total burst cannot be accommodated by the first four paths filled up to the level f5. The optimal rate assignments for path 1 and path 5 in Case 1 are plotted in Fig. 3.21

and Fig. 3.22, respectively. It can be observed that ρ∗1 has a saw-tooth form as ρ

increases. This is because ρ∗1 first increases linearly with ρ, but decreases when there

Figure 3.14: The optimal burst assignment for path 1, σ∗1: two paths, f1 = 1, f2 = 3, c1 = 2, c2 = 1.

is a new path with a higher index being used. Path 5 is not used when ρ < ρ^5_th and

has a zero rate assignment in this region.

Figures 3.23 and 3.24 are the optimal burst and rate assignments for path

1 obtained from Case 2, respectively. Similar trends can be observed from this set of

figures.

In the second case, a path with a lower fixed delay always has a larger

capacity. Thus a path with a smaller index should always be preferred over a path

with a higher index. In the first case, however, a path with a lower fixed delay always

has a lower capacity. This makes the path selection more difficult than the second

case, since if we sort the paths according to their fixed delays, we will get a different

order than if we sort the paths according to their capacities. With Theorem 4,

we can translate these two performance metrics into a single one, i.e., the end-to-end

delay. The path selection is now easy because we only use the first max{m, k} paths.

Fig. 3.25 plots the highest index of the paths in use (i.e., max{m, k}) when the fixed

delays and the capacities are in the same order. Fig. 3.26 plots the highest index


Figure 3.15: The minimum end-to-end delay, D̃∗l: two paths, f1 = 1, f2 = 3, c1 = 2, c2 = 1.

of the paths in use when the fixed delays and the capacities are in reversed order.

For a given set of paths and a flow with {σ, ρ}, the set of paths to use can be easily

determined by applying Theorem 4.

3.7 Related Work

Since the early work [19], traffic dispersion has been studied with different models and

perspectives. A survey on traffic dispersion is presented in [28]. It has been shown

that for data traffic, a packet level dispersion granularity gives a better performance

in terms of delay and network resource utilization than a flow level granularity [87].

The problem of elastic data traffic partitioning for an end-to-end session is

investigated in [26], [33], and [88] using different traffic and path models. In [33],

a two-path resequencing model is presented where each path is assumed to be the

combination of an M/M/1 queue and a fixed delay line. It is shown that the optimal

splitting probability may be highly dependent on the difference in the fixed delays of

the two paths. However, the M/M/1 queueing model may not be suitable for realtime


Figure 3.16: The difference between the two minimum end-to-end delays, i.e., D∗l − D̃∗l: two paths, f1 = 1, f2 = 3, c1 = 2, c2 = 1.

multimedia traffic, which usually has a more complex auto-correlation structure than

the Poisson model. Furthermore, it is not clear how to extend the analysis to more

than two paths. In [26], a proportional routing heuristic is introduced to route traffic

over multiple paths. The proposed heuristic path selection procedures give near-

optimal performance in terms of throughput for elastic data. In a recent work [88],

each path i is assigned a weight ωi such that Σi ωi = 1. Opportunistic scheduling

[89] is used to schedule the packets to the multiple paths while keeping the fraction

of bytes transmitted on each path i at ωi. It is shown that the large-time-scale traffic

correlation can be exploited by opportunistic scheduling to reduce the queueing delays

on the paths. However, fixed delays, which may have significant impact on traffic

partitioning [33], are not considered. Moreover, it is not clear how to set or derive the weights {ωi}, which are the key parameters in this analysis, for a given data flow and set of paths.

Multipath transport is extended to the many-to-one type of applications

in [34]. An analytical model of parallel data downloading from multiple servers is

presented in order to minimize the resequencing buffer size and total download time.


Figure 3.17: The minimum end-to-end delay, D̃∗l: five paths, f1 = 1, f2 = 2, f3 = 3, f4 = 4, f5 = 5, c1 = 1, c2 = 1.5, c3 = 2, c4 = 2.5, c5 = 3.

Although this work has objectives similar to ours, the analysis is for elastic data transport and is not applicable to realtime applications where the received packets

are consumed at a certain rate.

In [58], Alasti et al. investigated the effect of probabilistic traffic partitioning

on MDC and single description video, using M/M/1 and M/D/1 queueing models

and rate-distortion bounds. It is shown that different splitting probabilities result in

different distortion in the received video. Although the results provide some useful insights, the assumptions made in [58] limit its applicability. Furthermore, propagation

delay, which may be the dominant part of end-to-end delay in high speed networks,

is not considered.

As compared with previous work, our study is unique in a number of aspects.

First, we study the optimal partitioning for realtime or multimedia traffic, rather

than elastic data [26][33][88][34]. There is a compelling need for such an analysis and

practical implementation, considering the recent trend of using multipath transport

for realtime multimedia applications [37]-[90]. Second, the {σ, ρ} model used in our

analysis is more general and realistic than that in previous work [58]. The leaky

Figure 3.18: The minimum end-to-end delay D∗l : five paths, f1 = 1, f2 = 2, f3 = 3, f4 = 4, f5 = 5, c1 = 3, c2 = 2.5, c3 = 2, c4 = 1.5, c5 = 1.

bucket regulator is supported by most commercial routers [63]. The deterministic

partitioning used is easier to implement. The optimal partition derived from our

analysis can be easily enforced by deploying a number of cascaded leaky buckets

with the optimal parameters. Third, since we incorporate the fixed delay of each

path in the model, our analysis can demonstrate the impact of fixed delays on the

traffic partitioning. This is especially important in high speed networks where fixed

delays may be a dominant part of the total end-to-end delay. Fourth, our analysis

provides closed-form optimal solutions for the multiple-path model, rather than heuristics. Our solution is optimal in the sense that we use the minimal set of paths to achieve the minimum end-to-end delay. Adding any rejected path to

this set will only increase the end-to-end delay and resequencing buffer size. Fifth,

the optimal partitioning is easily computed with low complexity. This makes our

scheme suitable for networks where path conditions are highly dynamic (e.g., frequent congestion or link failures). Sixth, our results provide useful guidelines on path selection, especially when the path performance metrics conflict with each other (e.g., high bandwidth but also high delay). Finally, our analysis provides

Figure 3.19: The optimal burst assignment for path 1, σ∗1: five paths, f1 = 1, f2 = 2, f3 = 3, f4 = 4, f5 = 5, c1 = 1, c2 = 1.5, c3 = 2, c4 = 2.5, c5 = 3.

“hard” QoS guarantees for realtime multimedia flows, i.e., the bounds derived will

not be violated. Such bounds are optimal, since there is always a situation in which the end-to-end delay equals the bound derived in our analysis.
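To make the enforcement mechanism concrete, the following sketch (in Python, with hypothetical names; it is an illustration under our assumptions rather than an implementation used in this work) shows a single {σ, ρ} token-bucket regulator. One such regulator per path, configured with the optimal (σ∗i, ρ∗i) derived above, would shape substream i to its assigned envelope.

    # A minimal sketch of a (sigma, rho) token-bucket regulator; not part of the
    # protocol, only an illustration of how the optimal per-path parameters could
    # be enforced.  One instance per path, fed with (sigma_i*, rho_i*).
    class TokenBucket:
        def __init__(self, sigma, rho):
            self.sigma = sigma      # burst allowance (e.g., bytes)
            self.rho = rho          # sustained rate (e.g., bytes/second)
            self.tokens = sigma     # bucket starts full
            self.last = 0.0         # time of the last update (seconds)

        def conforms(self, size, now):
            """Return True if a packet of `size` may be sent at time `now`
            without violating the (sigma, rho) envelope; consume tokens if so."""
            self.tokens = min(self.sigma, self.tokens + self.rho * (now - self.last))
            self.last = now
            if size <= self.tokens:
                self.tokens -= size
                return True
            return False            # hold the packet until enough tokens accrue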

We should note that the benefits of the proposed approach come at the cost

of lower bandwidth utilization, since no loss is allowed and deterministic analysis

is used. The bounds derived are for the worst case, which occurs with a relatively

low probability. However, such “hard” QoS guarantees are necessary for many realtime computing or multimedia applications [91], e.g., distributed simulations, realtime visualization of complex scientific simulation results at multiple remote locations, remote control and operation of complex scientific instruments and experiments in realtime, and stock exchange transactions. The framework presented in this chapter is

useful in such situations. If the application can tolerate a certain loss, then statistical

multiplexing can be used to achieve a higher network resource utilization.

Figure 3.20: The optimal burst assignment for path 5, σ∗5: five paths, f1 = 1, f2 = 2, f3 = 3, f4 = 4, f5 = 5, c1 = 1, c2 = 1.5, c3 = 2, c4 = 2.5, c5 = 3.

3.8 Summary

In this chapter, we examined two important issues on the use of multipath transport,

namely, how to minimize the resequencing delay and how to select the paths. We

presented a simple model to analyze the optimal traffic partitioning problem for a

given set of paths. We formulated a constrained optimization problem to minimize

the end-to-end delay using deterministic analysis. A two-path session model was first

examined and then extended to the multiple-path session case. Our results show that with the optimal traffic partitioning, we can either achieve a minimum end-to-end

delay equal to the maximum fixed delay among all the paths, or equalize the delay

bounds of all the paths. The selected path set is optimal in the sense that adding any

rejected path to this set will only increase the end-to-end delay. We also discussed

the important implications of this work in practice, and provided a practical design

to enforce the optimal partition on each path. Our analysis provides a simple and

yet powerful solution to the path selection problem in multipath transport design.

Figure 3.21: The optimal rate assignment for path 1, ρ∗1: five paths, f1 = 1, f2 = 2, f3 = 3, f4 = 4, f5 = 5, c1 = 1, c2 = 1.5, c3 = 2, c4 = 2.5, c5 = 3.

Appendix A: Proof of Theorem 1

As an initial step, we present the following facts derived from deterministic

queueing analysis [126][130][151].

Fact A.1 (Existence of a partition): If 0 < ρ < c, there exists a partition {ρi}i=1,2,

such that ρ1 + ρ2 = ρ, 0 < ρ1 ≤ c1 and 0 < ρ2 ≤ c2.

Proof: The proof is based on a construction. Since ρ < c = c1 + c2, we have 0 < ρ − c2 < c1. Then we can choose ρ1 such that ρ − c2 ≤ ρ1 ≤ c1, and the resulting ρ2 = ρ − ρ1 satisfies ρ2 ≤ ρ − (ρ − c2) = c2 and ρ2 ≥ ρ − c1 > 0 (recall that we assume ρ > c1 and ρ > c2 to avoid the trivial cases). That is, 0 < ρ − c1 ≤ ρ2 ≤ c2.

Fact A.2: Under the stability condition ρi < ci, the queueing delay of path i, di,

i = 1, 2, is a monotonically increasing function of both σi and ρi.

This fact can be easily proved since from (3.5), we have

∂di/∂ρi = σi/(ci − ρi)² > 0, and ∂di/∂σi = 1/(ci − ρi) > 0.

Figure 3.22: The optimal rate assignment for path 5, ρ∗5: five paths, f1 = 1, f2 = 2, f3 = 3, f4 = 4, f5 = 5, c1 = 1, c2 = 1.5, c3 = 2, c4 = 2.5, c5 = 3.

Fact A.3: Given any 0 ≤ ρ < c, there exists a feasible σ > 0 that achieves a given

delay bound d. Similarly, given a 0 < σ ≤ cd, there exists a feasible 0 ≤ ρ < c that

achieves the given delay bound d. If σ > cd, there is no feasible envelope process that

can achieve the delay bound d.

Proof : For a given delay bound d and ρ < c, from (3.5), we have σ = d(c− ρ) > 0.

Similarly, for a given 0 < σ ≤ cd, we can solve for ρ from (3.5) as 0 ≤ ρ = c − σ/d < c. If σ > cd, then ρ = c − σ/d < c − cd/d = 0. Recall that ρ is the average rate of a traffic

process, which cannot be negative.

Remark: This fact is illustrated in Fig. 3.27. For the same service rate c and delay

bound d, as long as σ < cd, we can always determine the rate factor ρ of the envelope

process that results in the delay bound. If σ > cd, any feasible envelope process with

that burst factor will have a larger delay bound than d. This useful property allows

us to separate σi and ρi when determining their optimal values.
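As a small illustration of this separation (a sketch only, with hypothetical helper names), the two directions of Fact A.3 can be written as:

    # Fact A.3 as code: for a path with capacity c and target delay bound d, the
    # burst and rate factors can be chosen independently as long as sigma <= c*d.
    def sigma_for_bound(c, rho, d):
        assert 0 <= rho < c
        return d * (c - rho)          # sigma = d(c - rho) achieves the bound d

    def rho_for_bound(c, sigma, d):
        if sigma > c * d:
            return None               # no feasible envelope achieves the bound d
        return c - sigma / d          # 0 <= rho < c achieves the bound d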

Without loss of generality, we assume f2 ≥ f1 throughout this section. In the

Figure 3.23: The optimal burst assignment σ∗1: five paths, f1 = 1, f2 = 2, f3 = 3, f4 = 4, f5 = 5, c1 = 3, c2 = 2.5, c3 = 2, c4 = 1.5, c5 = 1.

following, we first determine the range of {ρi}i=1,2 satisfying the stability conditions

and the feasibility conditions, e.g., from Fact A.1,

0 < ρ − c2 < ρ1 ≤ c1,  ρ2 = ρ − ρ1,  and  0 < ρ − c1 < ρ2 ≤ c2.   (3.31)

Then, for a given pair of {ρ1, ρ2} satisfying (3.31), the two delay terms in (3.7) are

plotted in Fig. 3.28 as functions of σ1. As shown in Fig. 3.28(a), D1(σ1) intersects

with the D1 axis at D1(0) = f1, while D2(σ1) intersects with the D2 axis (to the

right) at D2(σ) = f2. The intersection of these two lines gives the optimal σ∗1 and

the minimum delay D∗l . Moreover, the slopes of these two lines are determined by

ρ1. If we increase ρ1, the slope of D1(σ1), 1/(c1 − ρ1), increases, while the slope of D2(σ1), −1/(c2 − ρ + ρ1), decreases. By varying ρ1 within its feasible region, we can find the lowest

crosspoint, which is the solution of (3.7).

The existence and the location of the crosspoint are determined by the parameters of the paths and the source flow, i.e., {ci, fi}, i = 1, 2, and {σ, ρ}. In other words, the system parameters determine how much the delay curves vary as ρ1 increases from its feasible minimum to the maximum. In Fig. 3.29(a), we plot D1(σ) as

Figure 3.24: The optimal rate assignment ρ∗1: five paths, f1 = 1, f2 = 2, f3 = 3, f4 = 4, f5 = 5, c1 = 3, c2 = 2.5, c3 = 2, c4 = 1.5, c5 = 1.

a function of ρ1. It can be seen that D1(σ) is always increasing with ρ1, and the minimum, Dmin1 = σ/(c − ρ) + f1, is achieved when ρ1 = ρ − c2. As illustrated in Fig. 3.29(b),

by increasing ρ1 from ρ−c2 to c1, we get a set of D1(σ1) curves with increasing slopes.

But all these curves are lower bounded by the one that intersects with the D2 axis at

Dmin1 .

If f2 ≥ Dmin1 , or σ ≤ (c − ρ)(f2 − f1), we can always choose a ρ1 to make

the two delay curves intersect at the point [σ, f2]. Thus, the minimum total end-to-

end delay is f2, which is optimal according to Fact III.1. This case is illustrated in

Fig. 3.28(b) and Fig. 3.28(c). Therefore, the minimum end-to-end delay in this case

is

D∗l = f2 = max{f1, f2}   (3.32)

while the optimal partition of the burst factor σ is:

{σ∗1, σ∗2} = {σ, 0}   (3.33)

From D1 = f2 = σ/(c1 − ρ1) + f1, we can derive ρ∗1 = c1 − σ/(f2 − f1) and ρ∗2 = ρ − ρ∗1 = ρ − c1 + σ/(f2 − f1). This partition achieves D∗1 = D∗2 = D∗l = f2, as illustrated in Fig. 3.28(b).

Figure 3.25: The highest index of the paths used: five paths, f1 = 1, f2 = 2, f3 = 3, f4 = 4, f5 = 5, c1 = 1, c2 = 1.5, c3 = 2, c4 = 2.5, c5 = 3.

Moreover, since σ2 = 0, any ρ2 ≤ c2 gives a zero queueing delay, i.e., d2 = 0. Thus ρ∗2 can have any value between ρ − c1 + σ/(f2 − f1) and c2, which gives a smaller ρ∗1 and D∗1 < f2, as illustrated in Fig. 3.28(c). Note that [ρ − c1 + σ/(f2 − f1), c2] is non-empty, since c2 − (ρ − c1 + σ/(f2 − f1)) = c − ρ − σ/(f2 − f1) ≥ 0 from the assumption that f2 ≥ Dmin1. Thus the optimal partition in this case is:

ρ∗1 = ρ − ρ∗2,  ρ∗2 ∈ [ρ − c1 + σ/(f2 − f1), c2].   (3.34)

On the other hand, if f2 < Dmin1 , or σ > (c−ρ)(f2−f1), the two curves must

intersect at an inner point, as illustrated in Fig. 3.28(a). In this case, the resulting

Dl is a function of ρ1, which may be further reduced by varying ρ1 within its feasible

range. We first derive {σi}, i = 1, 2, by equalizing the delays:

σ1/(c1 − ρ1) + f1 = σ2/(c2 − ρ2) + f2,  σ1 + σ2 = σ   (3.35)

Figure 3.26: The highest index of the paths used: five paths, f1 = 1, f2 = 2, f3 = 3, f4 = 4, f5 = 5, c1 = 3, c2 = 2.5, c3 = 2, c4 = 1.5, c5 = 1.

which gives:

σ1 = [(c1 − ρ1)/(c − ρ)] [σ + (c2 − ρ2)(f2 − f1)],
σ2 = [(c2 − ρ2)/(c − ρ)] [σ + (c1 − ρ1)(f1 − f2)].   (3.36)

From the assumption f2 < Dmin1 , we get σ > (c− ρ)(f2 − f1) = (c1 − ρ1)(f2 − f1) +

(c2 − ρ2)(f2 − f1). Thus the feasible conditions on the burst assignments are met.

The end-to-end delay achieved by (3.36) is:

D1 = D2 = [σ + (c1 − ρ1)f1 + (c2 − ρ + ρ1)f2]/(c − ρ),   (3.37)

which may be further reduced by varying ρ1 within its feasible region. Observe that

(3.37) is an increasing function of ρ1. Thus the minimum delay can be achieved by

using the smallest ρ1, i.e., ρ∗1 = ρ− c2, and

D∗l = σ/(c − ρ) + min{f1, f2}.   (3.38)

The partition achieving this minimum is found by substituting ρ∗1 = ρ−c2 into (3.36),

which is

{σ∗1, σ∗2} = {σ, 0},  {ρ∗1, ρ∗2} = {ρ − c2, c2}.   (3.39)

Figure 3.27: Different {σ, ρ} assignments give the same delay bound d.

If f1 = f2 = t, i.e., the two paths have the same fixed delay, the two delay

curves must intersect at an inner point, as illustrated in Fig. 3.28(d). From (3.36),

{σ∗1, σ∗2} = {[(c1 − ρ1)/(c − ρ)]σ, [(c2 − ρ2)/(c − ρ)]σ},  ρ − c2 < ρ∗1 < c1,  ρ∗2 = ρ − ρ∗1.   (3.40)

Substituting (3.40) into (3.7), the minimum delay is:

D∗l = σ/(c − ρ) + t.   (3.41)

Note that any feasible {ρ1, ρ2} achieves the minimum delay. This is also verified in

Fig. 3.28(d), where the crosspoints of the two delay curves are always on the horizontal

line EF , for any feasible ρ1 chosen.
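The case analysis above can be summarized in a few lines of code. The following sketch (a hypothetical function, assuming f2 ≥ f1, σ > 0, and the non-trivial case c1, c2 < ρ < c) simply evaluates the closed-form expressions (3.32)-(3.34) and (3.38)-(3.39) and returns one optimal partition together with the minimum end-to-end delay:

    # A sketch of the two-path optimal partition of Theorem 1; it returns one
    # optimal point (the feasible set may be larger, e.g., (3.34) and (3.40)).
    def optimal_two_path_partition(sigma, rho, c1, c2, f1, f2):
        c = c1 + c2
        assert f2 >= f1 and 0 < rho < c and sigma > 0
        if sigma <= (c - rho) * (f2 - f1):
            # Case of (3.32)-(3.34): the larger fixed delay dominates.
            D = f2
            sigma1, sigma2 = sigma, 0.0
            rho1 = c1 - sigma / (f2 - f1)      # one point of the set (3.34)
            rho2 = rho - rho1
        else:
            # Case of (3.38)-(3.39): equalized delay bounds, smallest feasible rho1.
            D = sigma / (c - rho) + f1         # f1 = min{f1, f2}
            sigma1, sigma2 = sigma, 0.0
            rho1, rho2 = rho - c2, c2
        return D, (sigma1, sigma2), (rho1, rho2)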

Figure 3.28: The delay curves with different system parameters.

Appendix B: Proof of Theorem 2

Assume f2 ≥ f1 throughout this section. Similarly, we plot the delays D1

and D2 as functions of σ1 in Fig. 3.30. With the tighter bound, the delay curves are functions of σ1 only, i.e., their slopes are fixed.

If D1(σ) ≥ f2, i.e., σ ≥ c1(f2 − f1), the two lines intersect with each other,

which gives the optimal partition and the minimum end-to-end delay, as illustrated

in Fig. 3.30(a). Thus, we can solve

σ1/c1 + f1 = σ2/c2 + f2,  σ1 + σ2 = σ   (3.42)

to derive the optimal partition, as:

{σ∗1, σ∗2} = {(c1/c)[σ + c2(f2 − f1)], (c2/c)[σ + c1(f1 − f2)]}.   (3.43)

Both σ∗1 and σ∗2 should be positive in order to get a feasible partition. From the

assumption σ ≥ c1(f2 −f1), it is easy to verify that the feasibility conditions are met.

Figure 3.29: The Dmin1 curve and its relationship with f2.

Figure 3.30: The delay curves with different system parameters and the tighter delay bound.

With this partition, the minimum end-to-end delay is:

D∗l = (1/c)(σ + c1f1 + c2f2).   (3.44)

Note that the D∗l in (3.44) is larger than f2, since D∗l − f2 = (1/c)[σ − c1(f2 − f1)] > 0.

On the other hand, if D1(σ) < f2, i.e., σ < c1(f2 − f1), the two delay curves

cannot intersect with each other, as shown in Fig. 3.30(b). This is quite different

from Fig. 3.28(b) and (c), where we can always adjust ρ1 to make the two lines meet

at point [σ, f2]. The minimum end-to-end delay of this case is

D∗l = f2 = max{f1, f2}. (3.45)

The feasible partition that achieves this minimum delay is:

{σ∗1, σ∗2} = {σ, 0}.   (3.46)


Since the rate factor has no effect on the queueing delay, any partition of

ρ satisfying the stability conditions and the feasibility conditions can be used in

both cases. For example, we can choose the following rate partition to equalize the

utilizations of the paths:

{ρ∗1, ρ∗2} = {(c1/c)ρ, (c2/c)ρ}.   (3.47)
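For completeness, an analogous sketch for the tighter per-path bound σi/ci + fi of Theorem 2 (a hypothetical function, assuming f2 ≥ f1 and σ > 0) evaluates (3.43)-(3.47):

    # A sketch of the Theorem 2 partition under the tighter bound sigma_i/c_i + f_i;
    # the rate split does not affect the bound, so (3.47) is only one valid choice.
    def two_path_partition_tight_bound(sigma, rho, c1, c2, f1, f2):
        c = c1 + c2
        assert f2 >= f1 and sigma > 0
        if sigma >= c1 * (f2 - f1):
            # Equalize the two path bounds, eqs. (3.43)-(3.44).
            sigma1 = (c1 / c) * (sigma + c2 * (f2 - f1))
            sigma2 = (c2 / c) * (sigma + c1 * (f1 - f2))
            D = (sigma + c1 * f1 + c2 * f2) / c
        else:
            # The delay curves cannot meet; send the whole burst on path 1, eqs. (3.45)-(3.46).
            sigma1, sigma2 = sigma, 0.0
            D = f2
        rho1, rho2 = (c1 / c) * rho, (c2 / c) * rho   # utilization-equalizing split (3.47)
        return D, (sigma1, sigma2), (rho1, rho2)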


Chapter 4

The Multi-flow Realtime Transport Protocol

4.1 Motivation

In the previous chapters, we showed that, in addition to traditional error control

schemes (such as FEC and ARQ), path diversity provides a new dimension for video

coding and transport design [5]-[8]. Multipath transport has strong potential in ad

hoc networks, where link bandwidth may fluctuate and paths are unreliable. Using

multiple paths can provide higher aggregate bandwidth, better error resilience, and

load balancing for a multimedia session. Similar observations were made in wireline

networks for audio streaming [65] and video streaming using multiple servers [39].

In wireline networks, multiple paths can be set up using SCTP sockets, or by source

routing (supported in IPv4 and IPv6). In addition, data partitioning techniques, such

as striping [71] and thinning [72], have been demonstrated to improve the queueing

performance of realtime data. Using multiple paths for realtime transport provides

a novel means of traffic partitioning and shaping, which can reduce the short term

correlation in realtime traffic and thus improve the queueing performance of the underlying network [72][73].

In this chapter, we present a new protocol, named the Multi-flow Realtime Transport Protocol (MRTP), supporting the general architecture for multimedia transport using multiple paths shown in Fig. 1.1. MRTP is a transport protocol sitting in the application layer; it should be implemented in the user space. Given

multiple paths maintained by an underlying multipath routing protocol, MRTP and


its companion control protocol, the Multi-flow Realtime Transport Control Protocol

(MRTCP), provide essential support for multiple-path realtime transport, including session and flow management, data partitioning, traffic dispersion, timestamping,

sequence numbering, and Quality of Service (QoS) feedback.

One natural question that arises is whether any existing protocol provides the same support, i.e., do we really need such a new protocol? There are

two existing protocols that are closely related to our proposal. One is the Realtime

Transport Protocol (RTP) [14]. RTP is a multicast-oriented protocol for Internet

realtime applications. RTP itself does not support the use of multiple flows. An

application could implement multipath realtime transport using RTP, but it would

have to perform all the overhead functions of managing multiple flows, partitioning

traffic, etc. Usually a RTP session uses a multicast tree and an entire audio or video

stream is sent on each edge of the tree. Compared with RTP, MRTP provides more

flexible data partitioning and uses multiple paths for better queueing performance and

better error resilience. The use of multiple flows makes MRTP more suitable for ad

hoc networks, where routes are ephemeral. When multiple disjoint paths are used for a

realtime session, the probability that all the paths fail simultaneously is relatively low,

making better error control possible by exploiting the path diversity [8]. In addition,

using multiple flows makes the realtime traffic more evenly distributed, resulting in

lower queueing delay, smaller jitter, and less buffer overflow at an intermediate node.

Furthermore, RTP focuses on multicast applications, where feedback is suppressed to

avoid feedback explosion [14]. For example, RTP Receiver Reports (RR) or Sender

Reports (SR) are sent at least 5 seconds apart. Considering the typical lifetime of an

ad hoc route, this is too coarse for the sender to react to path failures. With MRTP,

since only a few routes are in use, it is possible to provide much more timely feedback,

enabling the source encoder and the traffic allocator to quickly adapt to the path

changes, e.g., to perform mode selection for each video frame or even macroblock, to

retransmit a lost packet, or to disperse packets to other better paths. In fact, MRTP

is a natural extension of RTP exploiting path diversity in ad hoc networks.

The other closely related protocol is the Stream Control Transmission Protocol

(SCTP) [74]. SCTP is a message-based transport layer protocol initially designed for


reliable signaling in the Internet (e.g., out-of-band control for Voice over IP (VoIP)

call setup or teardown). One attractive feature of SCTP is that it supports multi-

homing and multi-streaming, where multiple network interfaces or multiple streams

can be used for a single SCTP session. With SCTP, generally one primary path

is used and other paths are used as backups or retransmission channels. There are

several recent papers proposing to adapt SCTP to use multiple paths simultaneously

for data transport [36][75]. SCTP cannot be applied directly for multimedia data

because it has no such functions as timestamping and QoS feedback services. With

MRTP, the design is focused on supporting realtime applications, with timestamping

and QoS feedback as its essential modes of operation. Moreover, since SCTP is a

transport layer protocol and is implemented in the system kernel, it is hard, if not

impossible, to make changes to it. A new multimedia application, with a new coding

format, a new transport requirement, etc., could be supported by SCTP only with difficulty. MRTP is largely an application layer protocol and is implemented in the user

space as an integral part of an application. New multimedia services can be easily

supported by defining new profiles and new extension headers. Indeed, MRTP is

complementary to SCTP by supporting realtime multimedia services using multiple

paths. MRTP can establish multiple paths by using SCTP sockets, taking advantage

of the multi-homing and the multi-streaming features of SCTP. In this case, one or

multiple MRTP flows can be mapped to a SCTP stream. MRTP is also flexible in

working with other multipath routing protocols, e.g., the Multipath Dynamic Source

Routing protocol in [8], when their implementations are available.

For the above reasons, we believe a new protocol tailored to multimedia transport using multiple paths is needed. The new protocol, MRTP/MRTCP, is a natural

extension of RTP/RTCP, exploiting path diversity to combat the high transmission error rates

and frequent topology changes found in ad hoc networks. It is also complementary to

SCTP, providing the essential functionality and flexibility in supporting multimedia

services.

The rest of the chapter is organized as follows. In Section 4.2, we present

the background and related work. In Section 4.3, we define MRTP and MRTCP

and present their usage scenarios. Then we present the performance analysis of


MRTP/MRTCP in Section 4.4. Section 4.5 is a summary of the chapter.

4.2 Background and Related Work

4.2.1 Traffic Partitioning

To transport realtime data using multiple paths, the original multimedia stream

should be divided into several substreams, one for each path in use. In addition

to generating multiple substreams, traffic partitioning also changes the correlation

structure of the stream. The majority of realtime media traffic sources can be characterized by rate burstiness, which is present over multiple time and space aggregation

scales. This manifests itself as so-called Long Range Dependence (LRD) [76]. However, it is the short-term correlation structure of the input traffic within a critical

time scale (CTS) of the queuing system that affects the queuing performance most

[77]. The two processes we introduce below lower the auto-correlation of the multimedia stream within short time scales. They are suitable for the streaming of

pre-encoded realtime data, where the data in a time window is known.

Block-Based Thinning

Block-based traffic thinning is a traffic partitioning scheme for pre-encoded multimedia streaming, first described in [72]. With block-based thinning, a video sequence

X[n] is divided into equal-sized blocks of length B. From the application’s perspective, the block size is expressed in a number of video frames, audio frames, or some other application-specific temporal payload units, and is generally variable in bytes [73].

With block-based thinning, a thinned sequence Xi[n] is assembled selectively from the original blocks X[n] in increasing order, while zeros, i.e.,

blocks of zero bits, are inserted into those skipped blocks. More specifically, the ith

substream thinned from the original stream X[n] can be expressed as:

Xi[n] = X[n], if (n mod S) = i;  0, otherwise.   (4.1)


An example of this scheme with S = 2 is given in Fig. 4.1, where two thinned

streams are produced from the original stream in a round-robin manner. The thinning

parameter, S, is used to select blocks from the original stream, and the block size B

determines the granularity of the partitioned traffic.

Figure 4.1: A video thinning example.
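As an illustration (a sketch only, with hypothetical names), block-based thinning per (4.1) can be written as:

    # Block-based thinning, eq. (4.1): substream i keeps block n when n mod S == i
    # and substitutes a zero block of the same length otherwise.
    def thin(blocks, S, i):
        return [b if (n % S) == i else bytes(len(b)) for n, b in enumerate(blocks)]

    # With S = 2, as in Fig. 4.1, thin(blocks, 2, 0) keeps blocks 0, 2, 4, ... and
    # thin(blocks, 2, 1) keeps blocks 1, 3, 5, ...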

Striping

Striping is a technique for data storage and retrieval in distributed systems [67][68].

With striping, data is optionally partitioned and then stored on multiple storage

elements (SEs, or servers), and a client can download data in parallel from these

servers. Thus user requests are more evenly distributed among the servers, resulting

in a better scalability and lower delay. Striping also makes data downloading more

robust to single server failures.

Consider the case of distributed media (e.g., video) servers. A user first finds the servers with the target video clip by using the index service from a catalog server.

Then, it initiates multiple connections to download a piece of the video from each of

the servers. The downloaded multiple streams are combined to restore the original

stream at the client side [71][72]. When web caching is used, a server may be a proxy

server with the video clip cached. Similar applications can be supported in peer-to-peer (P2P) networks (where a peer with the target video in its public directory is


equivalent to a mirror server), and ad hoc networks (where a mobile node with the

video in its cache is equivalent to a mirror server).

Consider the output port of a multimedia server supporting striping, where

N flows belonging to N different clients are multiplexed. The multiplexed stream,

X[n] consists of blocks of data from different clients. In the example given in Fig. 4.2,

there are four servers, from which four clients are downloading different video clips.

The first client, whose traffic is denoted as flow1, chooses the first video block from

server1, the second block from server2, the third block from server3, the fourth block

from server4, and then the fifth block from server1 again. More specifically, letting Xi(n), n = 1, 2, · · · , be the entire video sequence client i is downloading from the S servers, the multiplexed stream from a striping server can be expressed as:

X[n] = Xi[n], if (n mod S) = i;  Xk[n], k ≠ i, otherwise,   (4.2)

where Xk[n] is a block from other users who are also downloading video from this

server.

Figure 4.2: A video striping example.
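A corresponding client-side sketch (a hypothetical helper, under the round-robin block placement shown in Fig. 4.2) maps each block index of a flow to the server it is fetched from:

    # Round-robin striping as in Fig. 4.2: block n of a client flow that starts at
    # server `offset` is fetched from server (n + offset) mod S.
    def server_for_block(n, S, offset=0):
        return (n + offset) % S

    # e.g., with S = 4 servers, the flow with offset 0 fetches its blocks from
    # servers 0, 1, 2, 3, 0, 1, ... while the flow with offset 1 starts one server later.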

4.2.2 Multi-stream Coding and Multipath Transport

Traffic partitioning can also be performed by an encoder. With a given multimedia

stream, a multistream encoder can generate multiple compressed streams, each being


a partial representation of the original stream. Then, the streams can be assigned to

the multiple paths by the traffic allocator. For interactive multimedia applications,

e.g., video teleconferencing, where multimedia data is generated in real time, multi-

stream coding is a natural choice for generating multiple flows. The video coding and

transport schemes discussed in Chapter 2 are good examples of such approach.

MPT is responsible for delivering the multiple streams to the destination(s).

It provides functions such as multipath routing, traffic allocation, flow and session

management, QoS monitoring and feedback, timestamping and sequence numbering,

and resequencing, for a realtime session. The considerations in MPT design are

discussed in Chapters 1 and 2.

4.3 The Multiflow Realtime Transport Protocol

4.3.1 MRTP Overview

MRTP provides a framework for applications to transmit realtime data. The transport service provided by MRTP is end-to-end, using an association of multiple flows.

A companion control protocol, MRTCP, provides the essential session/flow control,

traffic transport engine, and QoS feedback.

Figure 4.3 illustrates a MRTP session, where multiple flows are used. The

sender first partitions the realtime stream into several substreams. Applications can

make the choice of data partitioning method and the associated parameters based on

particular requirements. Then, each substream is assigned to one or multiple flows by

a traffic allocator, and traverses a path that is partially or fully disjoint from those of the other flows to the receiver. The receiver reassembles the multiple flows received using a resequencing

buffer for each flow. Packets from the flows are put into the right order using the

timestamps, the sequence numbers, and other information carried in the headers.

The protocol stack architecture of MRTP is shown in Fig. 4.4. MRTP uses

the UDP datagram service or the multihoming/multistreaming transport service of

SCTP for data and control. Note that the session/flow management function can also

be performed using the Session Initiation Protocol (SIP) [82] over TCP. An underlying


Figure 4.3: A usage scenario of MRTP.

multipath ad hoc routing protocol maintains multiple paths from the source to the

destination. When SCTP is used in the transport layer, SCTP sockets can be used

to set up multiple flows.

Figure 4.4: Positioning of MRTP/MRTCP in the TCP/IP protocol stack: multimedia applications run over MRTP/MRTCP, which in turn runs over UDP, SCTP, or TCP, on top of IPv4/IPv6 with multipath routing.

4.3.2 Definitions

The following terms are used in the description of the proposed protocol:

• Flow: An end-to-end sequence of packets traversing a path that is partially or fully disjoint from the other paths to the receiver.

• Session: The collection of MRTP flows over which an end-to-end service is


realized.

• Association: A collection of IP addresses and port numbers, associated with the

flows of a MRTP/MRTCP session.

Note that when neither SCTP nor an underlying multipath routing protocol is available, there will be only one flow in the MRTP session. In this case, MRTP reduces

to RTP in functionality. The basic components of the proposed protocol are:

Session and Flow Management

Unlike RTP, MRTP is a session-oriented protocol. A MRTP session should first be established by MRTCP, where the two end nodes exchange information such as available

paths, session/flow IDs, and initial sequence numbers. Since the paths, e.g., in an

ad hoc network, may be ephemeral, the set of flows used in a session need not be static. During data transmission, a new flow may be added to the session when a

better path is found, and a stale flow may be removed from the session based on QoS

reports.

A session has a unique and randomly generated ID, and each flow in the session has a unique and randomly generated flow ID as well. All packets belonging to the session carry the session ID, and packets belonging to a flow carry the corresponding flow ID. MRTCP defines a set of messages for the session and flow management.

These messages can be sent through a pair of UDP sockets, as in RTCP. Alternatively,

the session initiation and control capability of SIP can be used for MRTP session and

flow control.

Traffic Partitioning

With MRTP, a traffic allocator partitions and disperses the realtime traffic to multiple

flows. A basic traffic partitioning and dispersion scheme is provided in MRTP, which

assigns the packets to the multiple flows using the round-robin algorithm. This simple

assignment may not be optimal for some applications and can be overridden in such

situations. Traffic can be assigned to multiple flows with a granularity of packet,


frame, group of pictures, or substream. Possible traffic partitioning schemes are

discussed in section 4.2.1. Traffic partitioning not only divides the realtime data into

multiple flows, but also provides a means for traffic shaping to achieve load balancing

and better queueing performance.
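A minimal sketch of the default round-robin dispersion (a hypothetical function; real allocators may instead work at frame, group-of-pictures, or substream granularity, as noted above) is:

    # Default MRTP dispersion sketch: packet k of the session goes to flow k mod N.
    def allocate_round_robin(packets, flow_ids):
        N = len(flow_ids)
        return [(flow_ids[k % N], pkt) for k, pkt in enumerate(packets)]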

Timestamping and Sequence Numbering

These functions are similar to those of RTP [14]. The Timestamp carried in a MRTP

data packet denotes the sampling instance of the first byte in its payload. The

sequence number indicates the relative positioning of this packet in the entire packet

stream assigned to the flow. Each flow is assigned a randomly generated initial

sequence number during the session set up time, or when the flow is added into

an ongoing session. The flow sequence number is increased by one for each packet

transmitted in this flow.

QoS Reports

As in RTP, MRTP generates QoS reports periodically. A MRTP SR or RR carries both

the per-flow statistics and session statistics. In RTP, QoS reports are transmitted

at a rate of one report per T = max{Td, 5} seconds, where Td is dynamically updated based on the current number of participants in the multicast group and the bandwidth used for the session (Appendix A.7 of [14]). This algorithm effectively keeps the bandwidth used by feedback at a relatively constant fraction of the total

bandwidth of the RTP session, although the number of participants may vary during

the session. However, such feedback is not prompt enough for the sender to adapt to the quickly changing topology of an ad hoc network, or to congestion fluctuations in

the Internet. Unlike RTP, the MRTP SR and RR can be sent at an interval set by

the application. For point-to-point and parallel downloading applications (see Fig.4.3

and Fig.4.12(a)), RRs and SRs could be sent for each frame, since the number of participants is relatively small. The timely QoS reports enable the sender to quickly

adapt to transmission errors. For example, the encoder can change the coding parameters or encoding mode for the next frame, introducing more (or less) redundancy


for error resilience, or the traffic allocator can avoid the use of a stale flow.

Reassembly at the Receiver

The receiver uses a reassembly buffer for each MRTP flow to reorder the received

packets, using the flow sequence numbers in the packet headers. The original realtime data stream is reconstructed by combining the flows, using the flow IDs and

timestamps in the packet headers.

For reliable transport protocols (e.g., TCP), a packet arriving early will

be stored in the resequencing buffer temporarily, waiting for all the packets with

a smaller sequence number to arrive. On the other hand, for most of the realtime

applications, received frames are decoded and displayed continuously, which enforces

a deadline for every packet. A packet will be extracted from the playout buffer and

decoded even if one or more packets with a smaller sequence number are missing.

In this case, the decoder may apply error control (e.g., FEC and MDC) and error

concealment (e.g., copy-from-the-previous-frame) to reduce the damage caused by

the lost packets. There is a rich literature on the resequencing delay analysis, e.g.,

see [80] and [81]. Previous work shows that both the resequencing delay and buffer

requirements are moderate if the traffic allocator is adaptive to the path conditions

inferred from the QoS feedback. In Chapter 3, we presented the analysis of optimal traffic partitioning and of how to minimize the resequencing delay using network calculus.
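The following sketch (a hypothetical class, not the protocol implementation) illustrates the deadline-driven behavior described above: packets from all flows are ordered by timestamp and released at their playout deadline even if packets with smaller sequence numbers are still missing.

    import heapq

    class PlayoutBuffer:
        """Reassembly/playout sketch: insert packets from all flows, release by deadline."""
        def __init__(self):
            self.heap = []                        # ordered by media timestamp

        def insert(self, timestamp, flow_id, seq, payload):
            heapq.heappush(self.heap, (timestamp, flow_id, seq, payload))

        def release(self, deadline):
            # Return every packet due by `deadline`; packets that never arrived are
            # simply skipped, and the decoder applies error control/concealment.
            due = []
            while self.heap and self.heap[0][0] <= deadline:
                due.append(heapq.heappop(self.heap))
            return due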

4.3.3 Packet Formats

MRTP/MRTCP uses three types of packets for data and control, namely, MRTP

data packets, MRTP QoS report packets, and MRTP session/flow control packets.

It also provides the flexibility of defining new extension headers or profiles for future

multimedia applications.

MRTP Data Packet

The format of a MRTP data packet is illustrated in Fig.4.5. The header fields are:


Figure 4.5: The MRTP data packet format.

• Version: 4 bits, version of MRTP/MRTCP.

• Padding: 2 bits, padding bytes are put at the end to align the packet boundary

to a 32-bit word. There could be 0 to 3 padding bytes.

• Extension: 1 bit, if set, the fixed header is followed by one or more extension

headers, with a format defined later.

• Marker: 1 bit, it is used to allow significant events such as frame boundaries to

be marked in the packet stream.

• Payload Type: 8 bits, it carries a value indicating the format of the payload.

The values are defined in a companion profile, which is compatible with RTP.

Note that this field is one bit longer than that in RTP. This bit can be used to

define new extension headers which are not defined in RTP.

• Source ID: 8 bits, it is the ID of the sender of this packet. It is randomly

generated at the session set up time.

• Destination ID: 8 bits, it is the ID of the receiver of this packet. It is randomly

generated at the session set up time.

• Session ID: 16 bits, it is the randomly generated ID of the MRTP session, which

is carried by all the packets belonging to this session.


• Flow ID: 16 bits, it is the ID of the flow, randomly generated at the session set

up time, or when a new flow joins the session.

• Flow Sequence Number: 32 bits, it is the sequence number of this packet in the

flow it belongs to. The initial flow sequence number is randomly generated in

the session set up time or when the flow joins the session.

• Timestamp: 32 bits, it reflects the sampling instance of the first byte in the

payload.

• Payload: the multimedia data carried in the packet.

• Padding: 0 to 3 bytes, it is used to align the packet boundary with a 32-bit

word.
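To make the field layout concrete, the following sketch (a hypothetical helper; it assumes network byte order and the bit order Version, Padding, Extension, Marker within the first byte) packs the 16-byte fixed header described above:

    import struct

    def pack_mrtp_header(ver, pad, ext, mk, ptype, src, dst, session, flow, seq, ts):
        # First byte: Version (4 bits) | Padding (2) | Extension (1) | Marker (1).
        first = ((ver & 0xF) << 4) | ((pad & 0x3) << 2) | ((ext & 0x1) << 1) | (mk & 0x1)
        # 8-bit payload type, source ID, destination ID; 16-bit session and flow IDs;
        # 32-bit flow sequence number and timestamp: 16 bytes in total.
        return struct.pack("!BBBBHHII", first, ptype, src, dst, session, flow, seq, ts)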

MRTCP QoS Reports

Sender Report (SR) is the QoS report sent by a sender. In the scenarios shown in

Fig.4.3 and Fig.4.12(a), the sender(s) and receiver(s) are fixed as long as they stay

in the MRTP session. In the scenario shown in Fig.4.12(b), a node that transmitted realtime data during the last reporting period is a sender; otherwise, it is a receiver.

The MRTP SR format is shown in Fig.4.6. The MRTP Receiver Report (RR) has a similar format, with the Total Packet Count and the Total Octet Count fields removed. Some of the header fields are the same as those of the MRTP data packet. The new fields

are explained as follows:

• This Flow ID: 16 bits. The ID of the flow on which this report is sent.

• Length: 16 bits. The total length in bytes of this report.

• NTP Timestamp: 64 bits. It indicates the wallclock time when this report was

sent so that it may be used in combination with timestamps returned in RRs

from the receivers to measure the round-trip time (RTT).


Figure 4.6: The MRTP Sender Report format.

• MRTP Timestamp: 32 bits. This is the MRTP timestamp corresponding to

the same time as the NTP timestamp. It may be used in synchronization, in

estimating the MRTP clock frequency, and in RTT estimation.

• Total Packet Count: 32 bits. This is the total number of MRTP data packets

sent on this flow. This can be used by a receiver to compute the loss ratio of

the flow.

• Total Octet Count: 32 bits. This is the total number of data bytes sent on this

flow.

• Fractional Loss in Flow: 8 bits. It is the data packet loss fraction of this flow

in the last report period.

• Cumulative Number of Packets Lost in Flow: 24 bits. It is the total number of

data packets lost in this flow.


• Interarrival Jitter: 32 bits. It is the interarrival jitter of this flow, calculated

using the same algorithm as RTP (Appendix A.8 of [14]).

• Highest Sequence Number Received: 32 bits. This is the highest sequence

number received in this flow.

• Last Report Received: 32 bits. We use the middle 32 bits out of 64 in the NTP

timestamp of the most recent MRTCP report received from this flow.

• Delay Since Last Report Received: 32 bits. This is the delay, expressed in units

of 1/65536 seconds, between receiving the last SR or RR from this flow and

sending this report.

• Profile-Specific Extensions: extension of this format can be defined in a future

profile.

Note that to increase the reliability of feedback, RRs or SRs may be sent

on the best path or sent on multiple paths. In the latter case, the MRTP Timestamp

field is used to screen old or duplicated reports. With MRTP, both the sender and

the receiver estimate the RTT. The estimated RTT can be used to adapt to the

transmission conditions, or to set the retransmission timers for session/flow control

messages.

MRTCP Session/Flow Control Messages

MRTCP control messages include the messages used to set up a MRTP session, to

manage the set of flows in use, and to describe a participant of the MRTP session.

1) Session Control Messages: The Hello Session message is sent by either a

sender or a receiver (called the initiator) to initiate a MRTP session. The message

format is shown in Fig.4.7. Following the MRTP common header, there is a randomly

generated session ID and the total number of flows proposed to be used in this session.

Next is a number of flow maps, each associating a flow ID to the corresponding

source/destination sockets. A randomly generated initial sequence number for this

flow follows the flow map.


Figure 4.7: The MRTP Hello Session message format.

An ACK Hello Session message is sent to acknowledge the reception of a

Hello Session message or another ACK Hello Session message. Its format is similar

to the Hello Session message format, but with two differences: (1) the Payload Type

field has a different value; and (2) The Initial Flow Sequence Number field is replaced

by a Flow Status field. A value of SUCCESS for this field indicates the proposed flow

has been confirmed by the remote node, while a value of FAIL indicates the flow is

denied. The values of these macros are defined in the MRTP profiles.

MRTP Bye Session and ACK Bye Session messages are used to terminate

a MRTP session. The Bye Session message format is given in Fig.4.8. The ACK

Bye Session message format is similar to Fig.4.8, but with an additional 32-bit Status

field. A value of SUCCESS means the MRTP session is successfully terminated and

the resources allocated to this session are all released, while a FAIL (or a timeout)

means something went wrong in terminating the MRTP session. In this case, the node may retransmit the Bye Session message until a maximum retransmission limit is reached. The node then aborts the MRTP session on its own.

2) Flow Control Messages: During the transmission, some flows may be

broken or congested, and some new flows may be found by the underlying multi-

path routing protocol. Flow control messages, namely, Add/Delete Flow and ACK

Add/Delete Flow, are used to add or remove flows from a MRTP session. The format

of the Add Flow (ACK Add Flow) messages is similar to the Hello Session (ACK


Figure 4.8: The MRTP Bye Session message format.

Hello Session) format, with different Payload Type values. The Delete Flow (ACK

Delete Flow) message format is similar to that of the Hello Session (ACK Hello Session) message, but with different Payload Type values and without the Initial Flow

Sequence Number field.

3) Participant Descriptions: As in RTP, MRTP uses Source Description to

describe the source and CNAME to identify an end point. In MRTP, each participant

is also identified by a unique ID, e.g., a source ID or a destination ID. The IDs can be

randomly assigned, or be calculated from the CNAME, e.g., using a hash function.

Extension Headers

MRTP uses extension headers to support additional functions not supported by the

current headers. The common extension header format is given in Fig.4.9. The Type

field is defined by MRTP profiles. The MRTP extension header has a variable length,

indicated by the length field. Several MRTP extension headers are given below:

Figure 4.9: The MRTP extension header format.

• Source routing extension header: Since multiple paths are used in MRTP, source

routing can be used to explicitly specify the route for each packet. However, if

the lower layers do not support source routing, application-level source routing


can be implemented by defining a source routing extension header carrying the

route for the packet.

• Authentication extension header: This header provides a simple authentication

mechanism using an ID field and a Password field encrypted with application

specific encryption schemes. This extension header can be used in session/flow control packets to validate the operations requested, or in an RR or SR to validate

the QoS report.

• Striping extension header: This extension header should have fields carrying

striping related parameters, such as S or B (see Section 4.2.1). A user can use

this extension header to inform each server which blocks of the video clip it

wishes to download from the server.

We borrow the daisy-chain headers idea from IPv6 [83]. More specifically,

there is a one-bit Extension field in the MRTP/MRTCP common header and all the

extension headers. If this bit is set to 1, there is another header following the current

header. This provides flexibility to combine different extension headers for a specific

application. As illustrated in Fig.4.10, a MRTP data packet has a common header

with Ext=1, a source routing extension header with Ext=1, and an authentication

extension header with Ext=0, followed by multimedia data payload. Thus this MRTP

data packet is authenticated and uses source routing to get to its destination.

Figure 4.10: A daisy-chain of MRTP extension headers.
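A receiver-side sketch of walking such a daisy chain (a hypothetical helper; it assumes the Length field gives the total length of the current extension header in bytes, including its 4-byte fixed part, and that the Extension bit is the least significant bit of the first byte) is:

    import struct

    def walk_extension_headers(buf, offset=0):
        """Return the list of (type, data) extension headers and the payload offset."""
        headers = []
        more = True
        while more:
            first, htype, length = struct.unpack_from("!BBH", buf, offset)
            if length < 4:
                break                          # malformed header; stop parsing
            more = bool(first & 0x01)          # Extension bit: another header follows
            headers.append((htype, buf[offset + 4 : offset + length]))
            offset += length
        return headers, offset                 # offset now points at the payload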

MRTP Profiles

As with RTP, MRTP needs additional profile specifications to define payload type codes

and their mapping to payload formats, and payload format specifications to define


how to carry a specific encoding. These profiles and specifications are compatible

with those of RTP. New multimedia services can be easily supported by defining new

MRTP profiles and specifications.

4.3.4 The Operations of MRTP/MRTCP

Figure 4.11 illustrates the operation of a MRTP session, which we will discuss in the

following.

Figure 4.11: The operation of MRTP/MRTCP.

Connection Establishment and Termination

MRTP is a connection-oriented protocol in the sense that a MRTP session needs to be

set up before data transfer begins. Either the sender or the receiver (the initiator) can initiate a MRTP session with a three-way handshake, using the Hello Session and the

ACK Hello Session messages. The three-way handshake gives both ends a chance to

choose which flows to use and to resolve possible collisions in randomly generated


session/flow IDs.

For example, if SCTP is used and both nodes are multihomed, the initiator

first chooses a port number for each of its interfaces to be used. Then it sends a

Hello Session message to the remote host. The IP addresses of the remote node are

resolved by Domain Name System (DNS) queries. The initiator may not know the

destination port numbers, so it just puts all zeros in these fields. When the remote

node receives this Hello Session message, it can choose which of the proposed flows (listed in the flow mapping fields of the Hello Session message) to use. In the ACK Hello Session message it sends to the initiator, it fills in a port number for each selected flow. The

status of a selected socket is set to SUCCESS, while the status of a denied flow is

set to FAIL. The initiator further acknowledges this ACK Hello Session message with

another ACK Hello Session message. Then data transmission begins on the confirmed

flows. With MRTP, multihomed hosts are not necessary. When source routing is used,

the initiator first queries its routing table to obtain multiple maximally node-disjoint

routes to the destination. Then it sends a Hello Session message with the source

routing extension header. Thus in addition to the flow mappings in the Hello Session

header, the exact routes to the destination are also carried in the source routing

extension header to negotiate a MRTP session.

When the transmission is over (or during the transmission), a participating

node may decide to leave the session. This is done in different ways for different

usage scenarios. For the point-to-point multimedia streaming case shown in Fig.4.3,

either of the nodes can send a MRTP Bye Session message to terminate the MRTP

session. After the remote node responds with an ACK Bye Session message, the

session is terminated. For a many-to-one type application shown in Fig.4.12(a), the

receiver may send a Bye Session message to all the servers and, after being acknowledged,

terminate the session. However, a participating server can only send a Delete Flow

message if it wishes to leave the MRTP session. For the multicast application shown

in Fig. 4.12(b), a node simply leaves the session silently.

MRTP uses retransmission timers for the control messages to cope with the

unreliable service provided by UDP and IP. If no response is received before the

timer expires, the control message is retransmitted. The timeout value is determined


by RTT estimation. A measured RTT sample is calculated using the timestamp, LR,

and DLR fields in the received RR or SR, as is done in RTP [14]. Then the measured

value is smoothed and the timer value is updated using the algorithm used in TCP

[47]. The maximum number of retransmissions allowed is set by the application.
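As a sketch of this timer computation, the Python fragment below smooths RTT samples and derives a retransmission timeout in the spirit of the TCP algorithm cited in [47]; the gains 1/8 and 1/4 and the factor of four are the usual TCP values and are assumptions here rather than values mandated by MRTP.

def update_rto(srtt, rttvar, sample, alpha=0.125, beta=0.25):
    # One step of TCP-style smoothing; 'sample' is an RTT measured from the
    # timestamp, LR, and DLR fields of a received RR or SR packet.
    if srtt is None:                       # first measurement
        srtt, rttvar = sample, sample / 2.0
    else:
        rttvar = (1 - beta) * rttvar + beta * abs(srtt - sample)
        srtt = (1 - alpha) * srtt + alpha * sample
    rto = srtt + 4 * rttvar                # timeout for retransmitting control messages
    return srtt, rttvar, rto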

Flow Management

During a MRTP session, some flows may be unavailable. For example, an intermediate

node may have crashed or become congested. In ad hoc networks, an intermediate node may

leave the network or run out of battery. In these cases (which can be identified by

examining MRTCP reports), either the sender or the receiver can delete the broken

flow from the MRTP session by issuing a Delete Flow message carrying the ID of

the flow. When a new path is found, a new flow can be added to the association by

sending an Add Flow message. These mechanisms enable MRTP to quickly react to

topology changes and congestion in the network.

Data Transfer

When the MRTP session is established, MRTP packets carrying multimedia data are

transmitted on the multiple flows associated with the session. Each packet carries a

sequence number that is local to its flow and a timestamp that is used by the receiver

to synchronize the flows.

The core implementation of MRTP does not guarantee the reliable delivery

of application realtime data. As with RTP, MRTP relies on lower layers for timely delivery

and other QoS guarantees. However, MRTP is flexible in supporting various error

control schemes. For example, redundancy can be introduced at the traffic allocator

when assigning the packets to the flows, or in a multistream video encoder when

compressing the video stream. Both open-loop error control schemes (e.g., Forward

Error Correction (FEC) and MDC [79]) and closed-loop error control schemes (e.g.,

Automatic Repeat Request (ARQ) [78]) can be implemented above MRTP for better

error resilience.
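As an illustration of the kind of open-loop redundancy referred to above, the sketch below shows a traffic allocator that assigns packets to flows in round-robin order and duplicates packets flagged as important onto a second flow. The "important" flag and the duplication rule are assumptions made for the example, not a prescribed MRTP mechanism.

def allocate(packets, n_flows):
    # Round-robin allocation with duplication of important packets
    # (e.g., base-layer data or packets of a more important substream).
    flows = [[] for _ in range(n_flows)]
    for i, pkt in enumerate(packets):
        primary = i % n_flows
        flows[primary].append(pkt)
        if pkt.get("important"):
            flows[(primary + 1) % n_flows].append(pkt)   # send a copy on another flow
    return flows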


QoS Reports

During the MRTP session, the receiver keeps monitoring the QoS statistics of all the

flows, such as the accumulated packet loss, the highest sequence number received,

and jitter. These statistics are put in a compound report packet and returned to

the sender. The reports can be sent on a single flow, e.g., the best flow in terms of

bandwidth, delay, or loss probability, or some (or all) of the flows for better reliability.

The frequency at which the reports are sent is set by the application. A participant

in the session keeps a record of the MRTP timestamp of the last received report from

each report sender. When it receives multiple reports from the same sender, it checks

the MRTP timestamp in the reports and discards the duplicated and obsolete ones.
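A compact way to realize this filtering is to remember the newest MRTP timestamp seen from each report sender, as in the hypothetical helper below; the data structure used is an illustrative choice.

def fresh_report(last_seen, sender_id, report_ts):
    # Return True if the report is newer than anything seen from this sender;
    # otherwise it is a duplicate or obsolete copy and should be discarded.
    if report_ts <= last_seen.get(sender_id, -1):
        return False
    last_seen[sender_id] = report_ts
    return True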

4.3.5 Usage Scenarios

Unicast Video Streaming

This is a point-to-point scenario as illustrated in Fig.4.3. Consider a wireless sensor

network deployed to monitor, e.g., wildlife, in a remote region. Some sensors carry a

video camera, and others are simple relays that relay the captured video to the base.

Source routing, or other simple routing protocols can be used for finding multiple

disjoint paths.

A camera sensor initiates a MRTP session to the base. The captured video is

transmitted using multiple flows going through different relays. In this way, a relatively high-rate video can be spread over multiple paths, each of which is bandwidth-limited.

Redundancy can also be introduced by transmitting a more important substream

using multiple flows.

Some sensors may be damaged or may run out of power. In this case, the

underlying multipath routing protocol informs MRTP about the path changes. Either

the sender or the base can delete a failed flow, or add a fresh flow to the session. The

server at the base maintains a resequencing buffer for each flow, as well as enforcing

a deadline for each packet expected to arrive.


Parallel Video Downloading

This is a many-to-one scenario, as illustrated in Fig.4.12(a). Consider an ad hoc

network, where each node maintains a cache for recently downloaded files. When a

node A wants to download a movie, it would be more efficient to search the caches of

other nearby nodes first than to go directly to a remote server in the Internet. If the

movie is found in the caches of nodes B, C, and D, A can initiate a MRTP session

to these nodes, downloading a piece of the movie simultaneously from each of them.

The striping extension header is used in each flow to indicate the striping parameters.

There are three flows, each with a unique flow ID, in this MRTP session. However,

the flows have the same session ID since they belong to the same MRTP session. The

resequencing buffers are used in A to reorder the packets, using the sequence numbers

and the timestamps in the headers.

Figure 4.12: Another two usage scenarios of MRTP, in addition to the one in Fig. 4.3.

Suppose that during the transmission, node D moves out of the network. Node A would delete the flow from D and adjust the striping parameters used in the other two flows.

Now the part of the video initially chosen from D will be downloaded from B and C

instead, by sending Add Flow packets to B and C with updated striping parameters.

On the other hand, node A may broadcast probes periodically to find new neighbors


with the movie and replace a stale flow. MRTP provides the flexibility for applications

to implement these schemes.
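The re-striping step in this example can be sketched as follows: the segments originally assigned to the departed node are spread over the remaining flows. The segment-to-flow map used here is a simplification of the striping parameters carried in the striping extension header.

def restripe(assignment, failed_flow, remaining_flows):
    # assignment maps a segment index to the flow (peer) serving it.
    orphans = sorted(seg for seg, f in assignment.items() if f == failed_flow)
    for k, seg in enumerate(orphans):
        assignment[seg] = remaining_flows[k % len(remaining_flows)]
    return assignment

# Example: node D leaves, so its segments are re-assigned to B and C.
restripe({0: "B", 1: "C", 2: "D", 3: "B", 4: "C", 5: "D"}, "D", ["B", "C"])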

Combined with multistream video coding schemes, e.g., layered coding with

unequal protection of the base layer [8] or multiple description coding [79], error

resilience can be greatly improved. In these schemes, the video encoders and traffic

allocators can adapt to transmission errors and topology changes inferred from the

QoS reports. A similar application of video streaming in Content Delivery Networks

(CDN) using multiple servers is presented in [39]. This application can also be used in P2P networks, allowing a client to download a video clip from multiple peers for lower

delay and better error resilience.

Multimedia Multicasting

This is a multicast application similar to video teleconferencing using RTP. Unlike

RTP, MRTP uses multiple multicast trees, as illustrated in Fig.4.12(b). In [84],

algorithms are given for computing two maximally disjoint multicast trees, where one

is used for multicasting and the other is maintained as a backup. This algorithm can

be adapted to support multiple multicast trees in a MRTP session.

For the example shown in Fig.4.12(b), there are two multicast trees used,

each being a flow and both belonging to the same MRTP session. Since there may

be a large number of participants in the session, QoS feedback should be suppressed,

as in RTP, to avoid feedback explosion. The RTCP transmission interval comput-

ing algorithm [14] can be used to dynamically compute the interval between two

back-to-back QoS reports using the current number of participants and the available

bandwidth as inputs. Since a flow may have more than one receiver, the sender ID

field in a RR packet is used to identify its sender. The QoS metrics for a flow are

computed over all the receivers of this flow at the sender side.


4.4 MRTP Performance Studies

In this section, we present two performance studies of the proposed protocol. We

first investigate the impact of traffic partitioning on the queueing performance of the

flows. Then, we present the simulation studies of MRTP using OPNET models, and

compare MRTP with RTP in a wireless mobile ad hoc network setting to demonstrate

its advantages. Optimal traffic partitioning and load balancing are examined in Chapter 3.

4.4.1 The Impact of Traffic Partitioning

Consider a bottleneck node in the network. There are N flows, belonging to different

MRTP sessions, traversing this node, as illustrated in Fig.4.13(a). Assume the flows

are independent and homogeneous. The output buffer of the bottleneck node can

be viewed as a multiplexer of N i.i.d. flows, as shown in Fig.4.13(b). Also assume

the video streams are thinned with thinning parameter S and B = 1 frame (see

section 4.2.1). In the following, we present the impact of thinning on the queueing

performance of the flows in this bottleneck node.

Figure 4.13: The performance analysis model for Section 4.4.1.

Previous work on large deviation techniques using the Bahadur-Rao asymp-

totics shows that the buffer overflow probability of a queue fed by N homogeneous


sources and with total buffer size B and service capacity C is:

\[ \Psi(c, b, N) \approx \exp\left[-N I(c, b) + g(c, b, N)\right], \qquad (4.3) \]
\[ I(c, b) = \inf_{m \ge 1} \frac{[b + m(c - \mu)]^2}{2V(m)}, \qquad (4.4) \]
\[ g(c, b, N) \approx -\frac{1}{2}\log\left[4\pi N I(c, b)\right], \qquad (4.5) \]

where $c = C/N$ and $b = B/N$ [85]. $V(m)$ is the variance of a single source at a temporal aggregation level $m$. That is, given the original sequence $\{X_i, i = 1, 2, \cdots\}$, we can derive a new sequence $\{Y_k = \sum_{i=km}^{(k+1)m-1} X_i, k = 1, 2, \cdots\}$ with aggregation level $m$, and $V(m) = \mathrm{var}\{Y_k\}$.

It is well known that video traffic is LRD. For a second-order self-similar source, either exactly or asymptotically, with a Hurst parameter $H$, the variance $V(m)$ can be approximated as $V(m) \approx \sigma^2 m^{2H}$ for large $m$. Plugging $V(m)$ into (4.4),

we have:

\[ I(c, b) = \frac{(c - \mu)^{2H}\, b^{2-2H}}{2\sigma^2\kappa^2(H)}, \qquad (4.6) \]
where $\kappa(H) = H^H(1 - H)^{1-H}$. For the thinned video stream, we have from [73]:
\[ V(S, m) \approx S^{-2}\sigma^2 m^{2H}, \quad \text{for large } m. \qquad (4.7) \]

Since the mean rate of each thinned flow is about $\tilde{\mu} = \mu/S$, for a fair comparison, we also scale down the service rate of the original system to $\tilde{C} = C/S$. Thus we can compare two systems: one is a queue with a service rate $C = Nc$ fed by $N$ original video streams, and the other is a queue with a service rate $\tilde{C} = N\tilde{c}$ fed by $N$ thinned video flows, under the same load $\rho = (N\mu)/C = (N\tilde{\mu})/\tilde{C} = \mu/c$. Plugging $V(S,m)$, $\tilde{\mu}$, and $\tilde{c}$ into (4.4), we have:
\[ I(c, b, S) = \frac{(c - \mu)^{2H}(Sb)^{2-2H}}{2\sigma^2\kappa^2(H)}. \qquad (4.8) \]

Let us also define the measure $\Gamma$ of the improvement in the buffer overflow probability of the queueing system fed with thinned video flows, as compared to that of the queueing system fed by the original video streams under the same load $\rho = \mu/c$, as:
\[ \Gamma(c, b, N, S) \overset{\mathrm{def}}{=} \frac{\Psi(c, b, N)}{\Psi(c, b, N, S)} \qquad (4.9) \]
\[ = S^{1-H}\exp\{N I(c, b)(S^{2-2H} - 1)\}. \qquad (4.10) \]

Note that Γ = 1 when S = 1 and Γ is an increasing function of S.
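For a numerical check of (4.3)-(4.10), the short Python routine below evaluates I(c, b), the overflow probability Ψ(c, b, N), and the improvement ratio Γ directly from the formulas above; the per-flow parameters in the example follow the setting used later for Fig. 4.16, and the buffer value b is an arbitrary sample point.

import math

def I(c, b, mu, sigma2, H):
    kappa = H**H * (1 - H)**(1 - H)                    # kappa(H) in (4.6)
    return (c - mu)**(2*H) * b**(2 - 2*H) / (2 * sigma2 * kappa**2)

def overflow_prob(c, b, N, mu, sigma2, H):
    i = I(c, b, mu, sigma2, H)
    g = -0.5 * math.log(4 * math.pi * N * i)           # Bahadur-Rao term (4.5)
    return math.exp(-N * i + g)                        # Psi(c, b, N) in (4.3)

def improvement(c, b, N, S, mu, sigma2, H):
    return S**(1 - H) * math.exp(N * I(c, b, mu, sigma2, H) * (S**(2 - 2*H) - 1))  # (4.10)

# H = 0.88, sigma^2 = 1, c = 1, mu = 0.9, N = 100, at an example buffer level b = 4.
print(overflow_prob(1.0, 4.0, 100, 0.9, 1.0, 0.88))
print(improvement(1.0, 4.0, 100, 2, 0.9, 1.0, 0.88))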

It has been shown in [73] that although thinning does not reduce the long

term correlation of realtime traffic (hence the self-similarity of a thinned stream is still

the same as the original stream), it effectively reduces the short term correlation in

the traffic. To illustrate this effect, we generate a self-similar traffic trace that is the

aggregate of 128 identical on-off sources with Pareto distributed on and off periods1.

An on-off source generates one unit of data per time slot when it is on, and is idle

in the off state. The generated self-similar trace has a mean of 64, a variance of 32,

and a Hurst parameter of 0.8. We used a trace of 10,000,000 samples for the results

reported in the following.
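The experiment described above can be reproduced in outline with the Python sketch below, which aggregates Pareto on-off sources, models thinning by keeping one sample out of every S (i.e., a single substream), and estimates V(S, m). The Pareto shape parameter, the random seed, and the much shorter trace length are illustrative choices, not the exact settings used in the dissertation.

import numpy as np

def onoff_pareto_trace(n_sources=128, length=100_000, shape=1.4, seed=1):
    # Aggregate of identical on-off sources with Pareto-distributed on/off
    # periods; each source emits one unit of data per slot while it is on.
    rng = np.random.default_rng(seed)
    trace = np.zeros(length)
    for _ in range(n_sources):
        t, on = 0, bool(rng.integers(2))
        while t < length:
            dur = int(np.ceil(rng.pareto(shape) + 1))
            if on:
                trace[t:t + dur] += 1
            t, on = t + dur, not on
    return trace

def thin(trace, S):
    return trace[::S]                      # keep every S-th sample

def V(trace, m):
    # Variance of the aggregated sequence Y_k (sums of m consecutive samples).
    k = len(trace) // m
    return np.var(trace[:k * m].reshape(k, m).sum(axis=1))

x = onoff_pareto_trace()
print(V(x, 100), V(thin(x, 2), 100), V(thin(x, 4), 100))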

In Fig. 4.14, the V (S,m) computed from the original synthesized data and

the thinned data when S = 2 and 4 are plotted, respectively. Note that all three curves increase linearly with m in the log-log plot, indicating that both the original trace and the thinned traces are self-similar. It can be easily verified that the slopes of the three curves are all equal to 2H. Therefore, the long-term correlation in the

trace is not affected by the thinning process. However, the amplitude of V (S,m) is

effectively reduced by thinning. Larger reductions in V (S,m) can be obtained by

increasing S. The analytical results using (4.7) are also plotted in Fig.4.14, which

tightly match the simulation results. Fig.4.15 illustrates the impact of S on V (S,m)

for a fixed temporal aggregation level m = 500. It can be seen that larger reductions

in V(S,m) can be achieved as S increases. Also note that the analytical results

match the simulation results closely.

The reduced V(S,m) results in improved queueing performance. Fig-

ure 4.16 plots the buffer overflow probability of a queue fed by 100 thinned sources

1Note that we used the synthesized traffic trace instead of recorded video traces, since the length of a video trace may not be long enough to get stable results, especially when the trace is further thinned. With synthesized data, we can get a realization that is sufficiently long for our experiment.


Figure 4.14: Variance V(S,m) with different aggregation level m.

Figure 4.15: Variance V(S,m) as a function of the thinning parameter S, for a fixed aggregation level m = 500.


with Hurst parameter H = 0.88, σ² = 1, c = 1, and µ = 0.9, when 1, 2, and 4 flows

are used, respectively. The system loads of the three curves are the same. It can be

seen that with thinning, the buffer overflow probability is greatly reduced and it is

lower when more flows are used. The improved buffer overflow probability translates

to a smaller delay, a lower packet loss rate, and a smaller jitter.

To further illustrate the impact of the number of flows used in a MRTP

session on the improvement achieved in the buffer overflow probability, we plot Γ

(defined in (4.9)) in Fig. 4.17 when S increases from 1 to 10. It can be seen that Γ is

1 when a single flow is used and increases with S. From the queueing performance’s

point of view, additional flows are always helpful. However, maintaining a large

number of flows requires a higher session/flow control overhead, as well as additional

cost in multipath routing. Even worse, as the number of flows increases, it is more

difficult to maintain a larger number of disjoint paths. If the paths are not disjoint,

the losses of the flows will be more correlated. Correlation among the flows makes the

video streams dependent, undermining the independence assumption of our analysis. As a result, we

would expect only a few flows to be used for a MRTP session. The number of flows

reflects a tradeoff between the overhead incurred, the correlation among the flows,

and the improvement in queueing performance. Fig. 4.17 also suggests that a larger

improvement can be achieved for larger per-flow buffer assignments.

4.4.2 Video Transport over Ad Hoc Networks

In this section, we present the MRTP performance study using OPNET models [11].

We simulated an ad hoc network in a square region, where each node is randomly

placed in the region initially. The popular Random Waypoint mobility model is used

[53], with a constant nodal speed and a constant pause time of 1 second. We used

the IEEE 802.11 protocol at the MAC layer, operating in DCF mode. The channel bandwidth is 1 Mbps and the transmission range is 250 meters.

We used the Quarter Common Intermediate Format (QCIF, 176×144 Y

pixels/frame, 88×72 Cb/Cr pixels/frame) sequence “Foreman” (the first 200 frames

from the original 30 fps sequence) encoded at 10 fps. The MDMC video codec was


Figure 4.16: Buffer overflow probability of a queue fed by 100 video flows, with S = 1, 2, and 4, respectively.

Figure 4.17: Performance improvement ratio Γ as a function of the thinning parameter S for different buffer sizes.


used [79], generating two video flows (each is called a description) each with a bit

rate of 59 Kbps. In MDMC, 5% macroblock-level intra-refreshment is used, which

has been found to be effective in suppressing error propagation for the range of the

packet loss rates considered. Each group of blocks (GOB) is packetized into a single

packet, to make it independently decodable.

Among the nodes, one is randomly chosen as the video source and another

as the video receiver, where a 5 second playout buffer is used to absorb the jitter in

received packets. The MRTP session uses two routes. The Multipath Dynamic Source

Routing (MDSR) model [8] (which is a multipath extension of the Dynamic Source

Routing (DSR) protocol [29]) is used to maintain two maximally node-disjoint routes

for the MRTP session. All other nodes generate background traffic for a randomly

chosen destination. The inter-arrival time of the background packets is exponentially

distributed with a mean of 0.2 second. The background packets have a constant

length of 512 bits.

Figure 4.18 plots the resequencing buffer occupancies at the receiver for a

simulation of 16 nodes in a 600 × 600 region. It can be seen that the buffer occupancies

are relatively independent of each other (except around the 30s, 60s, and 250s time

instances): when the workload in one resequencing buffer decreases due to congestion

or a broken link on that path, usually the workload of the other buffer does not.

This demonstrates the benefit of path diversity, which provides an effective means for

error control. Fig. 4.18 implies that for the two descriptions sent on different node-

disjoint paths, it is very rare that both descriptions are lost. With the MDMC

codec, a frame is predicted from two previous frames [79]. Combined with MRTP,

the MDMC decoder can not only produce a tolerable display quality from a single

description received, but also interpolate the lost description from the received one,

resulting in improved video quality. Note that at about the 30th, the 60th, and the

250th seconds, both buffers drop simultaneously, which implies that the two flows are

closely correlated or both paths are congested or down at these time instances. When

more than two paths are used, such events will occur less often.

Next we compare the MRTP performance with that of RTP, using the same

MDMC codec. For the RTP simulations, we used the NIST DSR routing model [51]


Figure 4.18: The occupancies of the resequencing buffers of two flows at the receiver node.

that maintains a single path to a destination. The two descriptions are interleaved

and transmitted on this path. For the MRTP simulations, we used the MDSR routing

model [8] to obtain two maximally node-disjoint routes to the video receiver. Each

description was assigned to a different path.

The PSNR traces from two of the simulations with 32 nodes in a 800 × 800

region are plotted in Figure 4.19 and Figure 4.20, respectively. Figure 4.19 is from a

simulation in which all the nodes are stationary. It can be seen that the PSNRs for the

first 20 frames are relatively low for both MRTP and RTP simulations. This is because

there was a relatively large amount of routing traffic during the initial period of the

simulations2. The routing traffic caused congestion in the network and degraded the

video quality. After the first 20 frames (about 2 seconds), most of the routing tables

are populated and the routes converge. The PSNRs from the MRTP simulation increase and become stable after this initialization period, but the PSNRs from the RTP simulation still suffer large and frequent fluctuations. Without mobility, the

2Originally, the route caches in all the nodes are empty, and the nodes are performing route discovery.


routes from the source to the destination are relatively constant. However, there is

still packet loss due to congestion. When a node fails to access the wireless channel

after several retries (7 in our simulations), it drops the packet being forwarded and

concludes that the link is broken, even though the wireless link may still exist. The

source will be notified about this “false” link failure and initiate a new route discovery process. With MRTP, the load of the network is more balanced than in the RTP case,

resulting in fewer congestion events. In the RTP simulation, the video session suffers

from congestion. In addition, more routing traffic is generated for route discovery

when congestion occurs, which further intensifies the congestion along the route.

In Fig.4.20, all nodes moved at a constant speed of 6m/s. With mobility,

a link breaks when two nodes move away from each other, resulting in packet losses

and an additional rerouting delay. When the nodes are mobile, both MRTP and

RTP have poorer performance as compared with Fig.4.19. Since MRTP uses two

maximally node-disjoint routes, it is less likely that both descriptions are lost at the

same time. Thus even with frequent link failures, the MDMC/MRTP system can still

maintain good video quality for most of the simulation period. Clearly MRTP

outperforms RTP in this case.

4.5 Summary

In this chapter, we proposed a new protocol, named MRTP/MRTCP, for realtime

multimedia transport over ad hoc networks using multiple flows. Our proposal was

motivated by the observations that (1) path diversity is effective in combating trans-

mission errors in the networks, and (2) data partitioning techniques are effective in

improving the queueing performance of realtime traffic. The new protocol is an ex-

tension of RTP/RTCP, exploiting multiple paths existing in mesh networks. The new

protocol is complementary to SCTP for realtime multimedia transport.

We also presented two performance studies of the MRTP/MRTCP proto-

col. The first study focused on the impact of traffic partitioning on the queueing

performance of realtime flows in a bottleneck router. Applying the Bahadur-Rao

asymptotics, we showed that with traffic partitioning, the buffer overflow probabil-


Figure 4.19: Comparison of MRTP and RTP: all nodes are stationary during the simulation.

Figure 4.20: Comparison of MRTP and RTP: all nodes move at a speed of 6 m/s.


ity of the partitioned flows is greatly reduced as compared with a system where

the original flows are multiplexed under the same system load. The second study

focused on video transport over ad hoc networks. Through OPNET simulations of

a 16-node and a 32-node ad hoc network, we showed that MRTP/MRTCP achieves

a great improvement in received video quality over RTP/RTCP.


Chapter 5

Analyzing a Generalized Processor Sharing

System

5.1 Introduction

In the previous chapters, we investigated video transport using the end-to-end ap-

proach, where the source node and the destination node perform transport controls,

e.g., error control, QoS feedback, traffic partitioning, and flow or session control, over

a best-effort and stateless IP network. In this chapter, we discuss network support

for multimedia transport.

Multimedia traffic, such as image, voice, and video, has become a significant

portion of today’s Internet traffic. Multimedia applications can have very diverse QoS

requirements. For example, some applications demand reliable and timely delivery

of information. Such services require no packet loss and no delay beyond a fixed

deadline. QoS guarantees of this type are commonly referred to as deterministic, or hard,

guarantees. On the other hand, by applying error concealment, most multimedia

services can tolerate a certain degree of packet loss and delay violations. For these

services, hard guarantees are not necessary. By allowing a small violation probability,

significant statistical multiplexing gain can be achieved [143]. This type of QoS

guarantee is commonly referred to as a statistical, or soft, QoS guarantee. One

major concern in the design, implementation, and management of the Internet is

how to provide QoS guarantees for applications with diverse QoS requirements, while

achieving high utilization of network resources.


The task of QoS provisioning is jointly accomplished by the user and the

network, as illustrated in Figure 5.1. When the user starts a new session, it first

provides a description of the traffic that will be generated and its desired QoS. The

traffic could be described using either statistical or bounding traffic models. The QoS

requirements could be soft or hard, while the QoS metrics could be loss, delay, jitter,

throughput, etc. The network will accept or reject this user session according to its

current state, i.e., if there are enough resources (buffer, bandwidth, etc.) to satisfy

the requested QoS. Once the session is admitted, the network should guarantee its

QoS by providing a certain amount of resources, while not degrading other existing

users’ QoS. The user session should be monitored and policed during the session’s

lifetime, to make it conform to the traffic description. The network should also have

mechanisms for isolating the users in case a nonconforming user violates the agreed-upon traffic specification and degrades other users' QoS.

Figure 5.1: A network access point, where admission control is performed and the user traffic A(t) is policed to conform to the traffic specification.

QoS guarantees could be per-flow based or class based. With per-flow based

QoS guarantees, traffic generated by an application keeps its identity while traversing

the network [94]. The network needs to keep the states of all the active flows. This

may be very difficult to achieve in a core router with a huge number of active flows.

On the other hand, traffic can be marked at the boundary of the network to be


classified into a small number of QoS classes. The network provides QoS guarantees

for each class, each being the aggregate of a number of flows with the same QoS

requirement [95].

To provide QoS guarantees in a resource constrained network, users should

be adaptive and resilient to traffic loss. It is also desirable to have a number of QoS

levels each with a different resource requirement. The network architecture capable

of QoS provisioning should contain at least the following key components [92][93]:

• Traffic specification: This is part of the interface between applications and the

network where characteristics of the source traffic and desired QoS are specified.

• Routing: Provides route(s) between source and destination(s) that have suffi-

cient resources to support the desired QoS.

• Resource Reservation: Reserves resources such as bandwidth and buffer in the

network switches or routers that are necessary for QoS guarantees.

• Admission Control: Decides whether the request by an application for setting

up a session should be accepted or rejected based on its QoS requirement and

the current state of the network.

• Packet Scheduling: Schedules packets to be forwarded from a switch or router’s

incoming ports to the outgoing ports according to the QoS requirements of the

sessions sharing the switch or router.

Admission control and packet scheduling are arguably the most important

components in the new network architecture capable of QoS provisioning. As dis-

cussed earlier in this section, admission control is used to prevent the network from

being overly congested. A desirable admission control scheme should achieve high

utilization of the network resources, but with a moderate complexity in implementa-

tion and enforcement. Packet scheduling is used in switches and routers to enforce

service differentiation. Once admitted, a user session may be assigned with a cer-

tain priority. In each intermediate router or switch, packet scheduling guarantees a

consistent treatment of the packets in that priority level.


Among the many scheduling disciplines, the Generalized Processor Sharing

(GPS) policy has been most widely studied. GPS is a work conserving discipline

in which N input classes share a deterministic server with a total capacity c. With

GPS, each class is associated with a weight and is guaranteed a minimum service

rate in proportion to its weight whenever it is backlogged. Furthermore, the residual

service of the non-backlogged classes is distributed to the backlogged classes in

proportion to their weights. Therefore, GPS is efficient in utilizing and sharing the

server capacity (since it is work conserving and the bandwidth is shared by all classes),

while being capable of isolating the classes (since each class is guaranteed a minimum

rate, it won’t be affected by a misbehaving class). By assigning different weights to

the classes, service differentiation can be easily achieved. GPS assumes the traffic is

infinitely divisible, i.e., a fluid traffic model is used in defining the discipline. There

are many packet scheduling algorithms used in high-speed routers and switches that

aim to approximate the GPS scheme.

GPS has been widely studied [96]-[101]. Most previous work took the bound-

ing approach and focused on general arrival processes, with deterministic or stochas-

tic settings. GPS systems are studied with various source characterizations, such

as Poisson with symmetric service sharing [96], leaky bucket regulated sources [97],

exponential bounded burstiness sources [98], and heavy-tailed sources [99]. One of the most widely used techniques in GPS analysis is the feasible ordering or feasible partition-

ing. With this technique, a GPS system can be transformed to a priority system

from which performance bounds can be derived [97][98]. These results are gener-

ally expected to be loose since the finer dynamics of the sources are not exploited

[100]. It has been well known since the seminal work in [76] that data traffic is self-similar.

The self-similar, or heavy-tailed, network traffic poses a great challenge to network

control and QoS provisioning. On the other hand, Markovian processes have been

widely used in modeling multimedia traffic, e.g., the Voice over IP (VoIP) traffic.

Previous work shows that Long Range Dependent traffic, such as VBR video, can be

adequately approximated by Short Range Dependent traffic models for traffic engi-

neering purposes [102][103][104]. This makes it possible to investigate certain aspects

of the impact of the long-range correlation structure on performance within the


confines of traditional Markovian analysis [105].

In this chapter we present the analysis of a multi-class GPS system, where

each class is modeled as a Markov Modulated Fluid Process (MMFP). A MMFP

can be defined by a state space S, a rate vector Λ, and a generator matrix M.

The generator governs the state transitions, while in state i, 1 ≤ i ≤ |S|, the source

generates fluid traffic at a rate of Λ(i). In order to analyze a GPS system with MMFP

sources, two fundamental questions should be answered:

1. How to de-couple the GPS service (since the amount of service a class receives

depends not only on its own queue occupancy, but also on the backlogs of all

other classes)?

2. How to design a scalable algorithm to handle a large number of sources (since

with MMFP sources, the state space scales up exponentially as the number of

the classes increases)?

In the following, we first apply the queue decomposition technique intro-

duced in previous work [100][101] to decouple the GPS system. Then based on a

GPS service bound derived from the decomposed system (denoted as the LZT bound)

[100], we further transform the decomposed system into a queueing system with a

deterministic service. From this transformed model, we derive the effective band-

width of a MMFP class, which is the network resource required to satisfy the QoS

requirement of that class [108][112]. This is a very powerful technique in network

analysis. As a result, we design an admission control test based on this analysis. Because

of its simplicity, this algorithm can be easily implemented and enforced in routers

and switches. Since the correlation among the classes is taken into account, this

algorithm also achieves high bandwidth utilization, compared with schemes based on

segregated service.

Furthermore, we propose a tighter service bound based on the same decou-

pled GPS system [101] (denoted as the LMP bound). This new service bound captures

the dynamics of GPS service sharing, and achieves a more accurate analysis as com-

pared with the LZT bound. We compare the LMP bound with the LZT bound, and


present its application in admission control, to illustrate the improvement it achieves

in bandwidth utilization over the LZT bound.

Finally, we examine the computational aspect of the algorithms. We ex-

tend the previous work on the Matrix Analytic Method for stochastic fluid flows

[120] to GPS analysis using the LMP bound [124]. This Matrix Analytic Method

based scheme avoids the eigenvalue and eigenvector computation used in traditional

fluid queue analysis [121], by using matrix additions, multiplications, and inversions.

We also analyze the caudal characteristics of the classes, which further illustrate the

benefits of using GPS for QoS differentiation.

The rest of the chapter is organized as follows: the system model and the

problem statement are presented in Section 5.2. In Section 5.3, we review a few

analysis techniques presented in previous works that will be used in our analysis. In

Section 5.4, we presents the derivation of the effective bandwidth of the classes with

a GPS server, and an admission control scheme. In Section 5.5, we present the LMP

bound, which results in a more accurate analysis of the GPS queue distributions. In

Section 5.6, we present our extension of the previous work on fluid queue analysis

using the Matrix Analytical Method [120].

5.2 The System Model and Problem Statement

GPS is a work conserving scheduling discipline in which N traffic classes share a

deterministic server with capacity c [97]-[101]. Associated with the sessions is a set

of parameters $\{\omega_i, i = 1, \ldots, N\}$, called the GPS weights. For $1 \le i \le N$, class $i$ is guaranteed a minimum service rate $g_i = \frac{\omega_i}{\sum_{k=1}^{N}\omega_k}\,c$ whenever it is backlogged. In addition, if the set of backlogged classes during the time interval $[\tau, t)$ is $B(t) \subseteq \{1, \cdots, N\}$, and the amount of service a class $i$ receives in the interval is $S_i(\tau, t)$, then
\[ \frac{S_i(\tau, t)}{S_j(\tau, t)} \ge \frac{\omega_i}{\omega_j}, \quad j = 1, 2, \cdots, N, \qquad (5.1) \]
holds for any class $i \in B(t)$. Without loss of generality, we assume $\sum_{k=1}^{N}\omega_k = 1$ in the rest of this chapter for ease of exposition.
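As a concrete instance, with the server capacity and weights later used for Fig. 5.5 ($c = 15.1$ and $\omega = (0.33, 0.40, 0.27)$; see Table 5.3), the guaranteed rates are
\[ g_1 = 0.33 \times 15.1 \approx 4.98, \qquad g_2 = 0.40 \times 15.1 = 6.04, \qquad g_3 = 0.27 \times 15.1 \approx 4.08, \]
which sum to the full capacity $c$ because the weights are normalized to one.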


The GPS system model with a GPS server and N classes is illustrated in

Fig.5.2. Each class i, with instant rate ri(t), is modeled as a MMFP with the state

space Si, the rate vector Λi, and the infinitesimal generator Mi. The generator matrix

Mi governs the state transitions. When class i is in state si at time t, it generates

fluid traffic at a rate of $r_i(t) = \Lambda_i[s_i]$. Let $\lambda_i$ be the average rate of class $i$. We assume $\sum_{i=1}^{N}\lambda_i \le c$, which guarantees the ergodicity of the system. We also assume $\lambda_i < g_i$, $i \in \{1, \ldots, N\}$, and will relax this assumption later. There is an infinite buffer and

each class has its own logical queue with occupancy Xi(t).

Figure 5.2: The system model.

We are interested in the tail distribution of class i’s queue occupancy, i.e.,

$F_i(x) \overset{\mathrm{def}}{=} \Pr(X_i \ge x)$, which upper bounds the loss class $i$ experiences in a finite

buffer system and can also be used to bound the delay distribution. When the tail

distributions are derived, an admission control scheme can be designed for the GPS

system.

5.3 Preliminaries

5.3.1 A Fluid Queue with MMFP Sources

Consider a simple fluid queue with a constant service rate c. The input process is a

MMFP source with the state space S, the rate vector Λ, and the generator matrix M.

Let X and S denote the random variables for the backlog and the state of the input

process, respectively. Then define Fi(x) as the stationary cumulative distribution


function of X in state i, i.e.,

\[ F_i(x) = \Pr\{S = i,\ X \le x\}, \quad i \in S,\ 0 \le x < \infty, \qquad (5.2) \]

and the stationary distribution vector F(x) = [F1(x), F2(x), · · · , F|S|(x)]t. By setting

up the system of forward transition equations from time t to t + δt, and taking the

limits δt → 0 and t → ∞, respectively, a system of ordinary differential equations

(ODE) can be derived as:

\[ \mathbf{D}\,\frac{d\mathbf{F}(x)}{dx} = \mathbf{M}\,\mathbf{F}(x), \quad 0 \le x < \infty, \qquad (5.3) \]

where D = diag{Λ} − cI and I is the identity matrix of proper dimension.

This ODE system can be easily solved by first deriving the solution of the

generalized eigenvalue problem [121]:

\[ z\,\mathbf{D}\phi = \mathbf{M}\phi. \qquad (5.4) \]

If the eigenpairs derived are {zi, φi}, i ∈ {1, · · · , |S|}, the stationary queue occupancy

distribution has the following spectral representation [121]:

\[ \mathbf{F}(x) = \sum_i \alpha_i\, e^{z_i x}\,\phi_i. \qquad (5.5) \]

The coefficients {αi}, i ∈ {1, · · · , |S|}, can be determined by the following boundary

conditions:

1. If the system is stable, then F(x) is finite and bounded by 1. Therefore, the

coefficients corresponding to the positive zi’s must be 0.

2. As x → ∞, all terms corresponding to the negative zi’s disappear and the

remaining constant associated with the zero eigenvalue is set equal to the sta-

tionary probability of the global source state (denoted as π).

3. For an overload state i, i.e., Λ(i) > c, the buffer cannot be empty. In other

words, Fi(0) = 0, if Λ(i) > c.


The structure of the ODE system, i.e., D is a diagonal matrix with both positive

and negative diagonal elements and M is a stochastic matrix, guarantees that there is

always a zero eigenvalue, and the number of overload states is equal to the number of

negative eigenvalues [121]. Therefore there are enough boundary conditions to determine

the coefficients.
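A minimal numerical sketch of this spectral solution for a single two-state on-off source is given below; it only sets up and solves the generalized eigenvalue problem (5.4) and reports the eigenvalues, leaving the boundary-condition step (determining the coefficients α_i) to the reader. The source parameters are illustrative.

import numpy as np
from scipy.linalg import eig

# Two-state on-off MMFP: state 0 = off (rate 0), state 1 = on (rate lam).
alpha, beta, lam, c = 0.4, 1.0, 1.0, 0.5
Lam = np.diag([0.0, lam])
M = np.array([[-alpha, alpha],
              [ beta, -beta]])            # infinitesimal generator (rows sum to zero)
D = Lam - c * np.eye(2)                   # D = diag(Lambda) - c I

# Generalized eigenvalue problem of (5.4): M phi = z D phi.
z, phi = eig(M, D)
print(sorted(z.real))                     # one zero and one negative eigenvalue expected
# The coefficients alpha_i then follow from the boundary conditions listed above.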

5.3.2 Output Characterization

In [106], the output process of a constant service queue with a MMFP source is

derived. We briefly review the procedure in this section. This technique is used in

the LZT bound discussed in the next section.

Consider the same model as that in the previous section. The output process

of the fluid queue can be approximated by a MMFP source, defined by the three-tuple

{So,Λo,Mo}, by lumping the busy period states into a single state sb with a rate c.

Let Su denote the set of the underload states, i.e., Λ(i) < c for all i ∈ Su. Then the

new state space is So = Su ∪ sb. The generator matrix and the rate vector of the

output process are:

\[ \Lambda^o = [\Lambda_u,\ c], \qquad (5.6) \]
\[ \mathbf{M}^o = \begin{bmatrix} \mathbf{M}_{uu} & \mathbf{a} \\ \mathbf{b} & -\mathbf{1}\mathbf{b} \end{bmatrix}, \qquad (5.7) \]
where the submatrix $\mathbf{M}_{uu}$ and the rate vector $\Lambda_u$ correspond to the underload states of the input process. The elements of $\mathbf{a}$ and $\mathbf{b}$, and the scalar $b$, are computed as follows:
\[ a_i = \sum_{j \in S_o} M_{ij}, \quad i \in S_u, \qquad (5.8) \]
\[ b_i = \frac{q_i}{b}, \quad i \in S_u, \qquad (5.9) \]
\[ \mathbf{q} = \frac{1}{h}\,\pi_u(0)\,\mathbf{M}_{uu}, \qquad (5.10) \]
\[ b = -\frac{1}{h}\left(1 - \langle \pi_u(0), \mathbf{1}\rangle\right), \qquad (5.11) \]

where h is a normalization constant, and πu(0) is the stationary probability vector

of empty queue in state s ∈ Su (see (5.5) in the previous section).


It was shown in [106] that with this characterization, all the stationary

moments of the actual output process are identical to those of the approximating

process {So,Λo,Mo}.

5.3.3 The LZT Service Bound

In [100], an approximate lower bound for the service that a class receives in a multi-

class GPS system is presented. We denote this service bound as the LZT bound,

which de-couples the correlated GPS service. It takes two steps to obtain the LZT

bound for a tagged class $i$. First, the departure process of each class $j$, $r^o_j(t)$, $j \ne i$, is

obtained while assuming the service rate is gj. The technique in Section 5.3.2 is used

to characterize the departure processes. Next, for a tagged class i, its service rate is

lower bounded by:

\[ \hat{s}_i(t) \overset{\mathrm{def}}{=} g_i + \sum_{j \ne i} \frac{\omega_i}{\sum_{k \ne j}\omega_k}\,\left(g_j - r^o_j(t)\right) \qquad (5.12) \]

The modulated service $\hat{s}_i(t)$, which is also a MMFP, consists of the guaranteed service rate $g_i$ and the residual service rate seen by class $i$, while assuming all other classes are busy. The state space of $\hat{s}_i(t)$ is $S = S_i \bigcup_{j \ne i} S^o_j$, and the generator is $\mathbf{M} = \mathbf{M}_i \oplus \mathbf{M}^o_1 \oplus \cdots \oplus \mathbf{M}^o_N$. Applying this service bound, an upper bound on the queue

length distribution of class i can be derived from a queueing system with the class i

input and a MMFP service, as illustrated in Fig. 5.3.

Figure 5.3: The LZT bound model.

The tightness of the LZT bound is determined by: (1) how accurately the

departure processes are modeled, and (2) the number of classes in the system. In

[106], it is argued that the output characterization is accurate when the non-empty

buffer probability is lower than $10^{-3}$, which is not atypical for multimedia traffic

with a stringent delay requirement. Moreover, in the context of Diffserv [110], the


flows are aggregated and the number of classes is usually small. In such systems, we

believe the LZT bound can be reasonably tight for the cases of interest.

To apply this bound, the output characterization of the classes should be

computed first using the guaranteed rates, gi, i = 1, · · · , N . As presented in Sec-

tion 5.3.2, output characterization requires a stable system, i.e., λi < gi, i ∈ {1, . . .,

N}. It is possible that this condition is violated for a given set of weights. However,

if the GPS system is stable, we can compute the output process of the class that does

not satisfy the above condition using $g^*_i \overset{\mathrm{def}}{=} \frac{\lambda_i}{\sum_{j=1}^{N}\lambda_j}\,c$.

5.3.4 The Chernoff-Dominant Eigenvalue Approximation

As shown in Section 5.3.1, the stationary queue occupancy distribution of a fluid queue

has the spectral representation, which requires determining all the eigenpairs of the

ODE system (5.3). However, when the buffer size is large, most of the terms in (5.5)

go to zero, while only the term associated with the largest eigenvalue dominates. In order to speed up the computation when applying the analysis to admission control, the dominant term can be used to approximate (5.5). The question is then how

to find the dominant eigenvalue without solving the original generalized eigenvalue

problem.

A Chernoff-Dominant Eigenvalue approximation for the fluid queue system

was presented in [108], providing an efficient means of computing the dominant eigen-

value. Let the system have N MMFP input sources, and each source i have rate matrix

Λi and generator Mi, i = 1, . . ., N ; and the aggregated traffic be characterized by

rate matrix Λ and generator M. To obtain the system’s dominant eigenvalue, the

following inverse eigenvalue problem is solved:

\[ c\,\phi = \left(\Lambda - \frac{1}{z}\mathbf{M}\right)\phi \qquad (5.13) \]
\[ \;\; = \left(\Lambda_1 - \frac{1}{z}\mathbf{M}_1\right) \oplus \cdots \oplus \left(\Lambda_N - \frac{1}{z}\mathbf{M}_N\right)\phi. \qquad (5.14) \]

Since both Λ and M can be decomposed into the Kronecker sums of the rate matrices

and generators of the individual classes, (5.13) has the Kronecker property1. The

1That is, the global matrix can be decomposed into a Kronecker sum of the matrices of the individual classes.


dominant eigenvalue $z_d$ of the original system is derived by solving:
\[ \sum_{i=1}^{N} m_i(z_d) = c, \qquad (5.15) \]
where $m_i(z)$ is the maximum real eigenvalue (MRE) of the $i$-th inverse eigenvalue sub-problem, i.e., $m_i(z) = \mathrm{MRE}(\Lambda_i - \frac{1}{z}\mathbf{M}_i)$. When the buffer is large, the tail distribution can be approximated by:
\[ \Pr(X \ge x) \cong G(x) \overset{\mathrm{def}}{=} L\,e^{z_d x}, \qquad (5.16) \]
where $L$ is the loss in a bufferless multiplexing system, which accounts for the statistical gain of multiplexing a large number of sources. $L$ is computed using the Chernoff bound. Interested readers can refer to [106][107] for the algorithm for computing $L$.
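A sketch of how (5.15) can be solved numerically is given below: it bisects on z, evaluating the maximum real eigenvalue of each small per-class matrix as suggested by the Kronecker decomposition in (5.14). The on-off parameters are those of Table 5.2 with a single source per class, and the capacity in the example is an arbitrary value between the total mean and peak rates; the computation of the Chernoff prefactor L is omitted.

import numpy as np

def mre(A):
    # Maximum real part over the eigenvalues of A (the MRE used in (5.15)).
    return np.linalg.eigvals(A).real.max()

def m_i(rates, M, z):
    # m_i(z) = MRE(Lambda_i - (1/z) M_i); z < 0 for a stable queue.
    return mre(np.diag(rates) - M / z)

def dominant_eigenvalue(classes, c, z_lo=-50.0, z_hi=-1e-6, iters=60):
    # Solve sum_i m_i(z_d) = c by bisection; the sum decreases in z from
    # (roughly) the total peak rate toward the total mean rate.
    f = lambda z: sum(m_i(r, M, z) for r, M in classes) - c
    lo, hi = z_lo, z_hi
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if f(mid) > 0 else (lo, mid)
    return 0.5 * (lo + hi)

onoff = lambda a, b, lam: (np.array([0.0, lam]), np.array([[-a, a], [b, -b]]))
classes = [onoff(0.4, 1.0, 1.00), onoff(0.4, 1.0, 1.20), onoff(1.0, 1.0, 0.61)]
print(dominant_eigenvalue(classes, c=2.0))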

5.4 The Effective Bandwidth of a MMFP Class

5.4.1 Transforming the Decoupled System

The exact analysis of the system in Fig.5.3 is difficult since the dimension of the state

space is large. For example, the generator of the aggregated traffic ofN heterogeneous

on-off sources is of the order 2N × 2N . In the case of a large number of states, the

huge matrices of the system are too unwieldy to handle and the numerical problems

encountered often make it unsolvable. Hence it is desirable to further decompose the

de-coupled system. In this section, we will derive an equivalent system based on the

LZT bound that is much easier to solve.

Without loss of generality, assume $\sum_{i=1}^{N}\omega_i = 1$. Define $\sigma^i_j \overset{\mathrm{def}}{=} \omega_i/\sum_{k\ne j}\omega_k$; then (5.12) becomes:
\[ \hat{s}_i(t) = g_i + \sum_{j\ne i}\sigma^i_j g_j - \sum_{j\ne i}\sigma^i_j r^o_j(t). \]

Note that the first two terms on the right hand side are state independent. Applying

this service bound, the dynamics of logical queue i is:

\[ \frac{d}{dt}X_i(t) = r_i(t) - \hat{s}_i(t) \]


\[ = r_i(t) + \sum_{j\ne i}\sigma^i_j r^o_j(t) - \Big(g_i + \sum_{j\ne i}\sigma^i_j g_j\Big). \]
Define the sources $r^*_i(t) = r_i(t)$ and $r^*_j(t) = \sigma^i_j r^o_j(t)$ for all $j \ne i$. For a new class $r^*_j(t)$, $j \ne i$, the rate vector is $\Lambda^*_j = \sigma^i_j\Lambda^o_j$, and the generator matrix is $\mathbf{M}^*_j = \mathbf{M}^o_j$. Also define the deterministic service rate $c^*_i = g_i + \sum_{j\ne i}\sigma^i_j g_j$. Then:
\[ \frac{d}{dt}X_i(t) = \sum_{k=1}^{N} r^*_k(t) - c^*_i, \qquad (5.17) \]
which is identical to the system equation of a deterministic fluid queue with service rate $c^*_i$, and with the scaled new classes $r^*_k(t)$, $k = 1, \ldots, N$, as input, as illustrated in Fig. 5.4.

Figure 5.4: An equivalent model to that in Fig. 5.3.

This derived system has the Kronecker property since the system matrices

can be decomposed into the Kronecker sums of those of the new classes. Therefore, the

Chernoff-Dominant Eigenvalue approximation in [108] can be applied to this system

and the tail distribution of the tagged class i is of the form (5.16).

5.4.2 The Effective Bandwidth of a MMFP Class

With the transform in the previous section, and as an extension of [108], we define

the effective bandwidth of MMFP class i as follows:

Theorem 5.1: Let the tagged class i have a loss requirement p associated with a

buffer assignment B. The Chernoff bound prefactor is L and let η = log(p/L)/B. If

the equivalent system in Fig.5.4 has a rate matrix Λ∗ and a generator M∗, then, the

effective bandwidth of class $i$, $c^e_i$, is:
\[ c^e_i \overset{\mathrm{def}}{=} \mathrm{MRE}\Big(\Lambda^* - \frac{1}{\eta}\mathbf{M}^*\Big). \qquad (5.18) \]


Proof: According to (5.13), (5.14), and (5.15), the MRE of $\Lambda^* - \frac{1}{\eta}\mathbf{M}^*$ is the capacity

required to get a dominant eigenvalue η. Then, applying the Chernoff-Dominant

Eigenvalue approximation in (5.16) results in a loss rate of p when the buffer size is

B. Thus cei is the bandwidth required to achieve a loss rate of p associated with a

buffer size of B for class i, in other words, the effective bandwidth of class i.

Intuitively, cei is the minimum bandwidth required by class i to achieve a

decay rate of η in its tail distribution curve, or a loss rate of p given the buffer assign-

ment B. Therefore, given a class’s QoS requirement on loss or delay distribution, we

can determine the system resource required to satisfy it. This can be applied in a call admission control test. Equation (5.18) is tight in the sense that it takes into account

the correlation among the classes and the service sharing dynamics of the GPS server.

However, the effective bandwidth of class i defined in (5.18) is more complex than

that in [108]. In the latter case, cei is a function of the source parameters and its QoS

requirements only, i.e., it is independent of other classes within the system. GPS

provides the ability to differentiate the classes, but also introduces correlation among

the classes. In (5.18), cei not only depends on class i itself, but also depends on all

other classes in the system.

Equation (5.17) requires much less computational effort than solving the

system in Fig.5.3 directly. In both cases the output processes for the classes should

be obtained first. The saving is from the dominant eigenvalue and Chernoff bound

computation, because the matrices are much smaller when applying (5.17). For ex-

ample, if all classes are on-off, the modulated service in (5.12) has $2^{N-1}$ states. So the complexity of getting the dominant eigenvalue is $O(2^N)$ for Fig. 5.3. For the

equivalent system, the Chernoff-Dominant Eigenvalue approximation in Section 5.3.4

can be applied. The dominant eigenvalue zd is found by numerically solving (5.15)

iteratively, resulting in a complexity of O(N).

An admission control test based on the above analysis is given in Table 5.1.

With this algorithm, the $c^e_i$'s of the classes are recomputed when a new source belonging to a class arrives2. Then the new $c^e_i$'s are compared with the corresponding

2Note that the output characterization, which has high computational complexity, is only performed for the class to which the new source belongs.


service rates in the transformed systems (see Figure 5.4). If all the $c^e_i$'s are less than the corresponding $c^*_i$'s, the new source is accepted. Otherwise, the system cannot

guarantee the required QoS, and the new source is rejected.

Table 5.1: An Admission Control Test Based on (5.18).

for i=1 to N do

Compute $c^e_i$ and $c^*_i$;
if $c^e_i > c^*_i$ then output NO and stop;

endfor;

output YES and stop.
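In code, the test of Table 5.1 reduces to a loop over the classes. The sketch below assumes that the transformed system of Fig. 5.4 (its rate vector Λ*, generator M*, and deterministic rate c*_i) and the Chernoff prefactor L have already been computed for each class by the procedures of Sections 5.3.2-5.3.4 and 5.4.1; those inputs are simply passed in as data here.

import numpy as np

def effective_bandwidth(Lam_star, M_star, p, B, L):
    # c_i^e = MRE(Lambda* - (1/eta) M*), with eta = log(p/L)/B as in (5.18).
    eta = np.log(p / L) / B                 # target decay rate (negative)
    A = np.diag(Lam_star) - M_star / eta
    return np.linalg.eigvals(A).real.max()

def admit(classes):
    # classes: list of dicts with keys Lam_star, M_star, c_star, p, B, L.
    for cls in classes:
        ce = effective_bandwidth(cls["Lam_star"], cls["M_star"],
                                 cls["p"], cls["B"], cls["L"])
        if ce > cls["c_star"]:
            return False                    # requested QoS cannot be guaranteed
    return True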

5.4.3 Numerical Investigations

Consider a three-class GPS system with service rate c. The source parameters are

given in Table 5.2, where α and β are the transition rates from off to on, and on to

off, respectively; λ is the input rate when the source is on. Table 5.3 gives the service

rate c and the GPS weights used in Figures 5.5, 5.6, and 5.7, where ωi is the GPS

weight of class i and load = (λ1+λ2+λ3)/c. K is the number of sources in each class.

These bursty sources are also used in [106], in modeling VoIP for traffic engineering

purposes [101][109]. To give a more practical interpretation to the results, we may

equate the server capacity to the standard OC-3 rate of 155 Mb/s. By choosing a time unit of 0.5 msec, one unit of fluid then corresponds to 10 Kb/s, and B = 10

is equivalent to a 100 Kb buffer.

Figures 5.5-5.7 show the analytical results for the tail distributions of the classes,

compared with fluid simulations. We increase the number of sources in each class from

10 to 100, and increase the load from 61.8% to 77.7%. Observe that the analytical

results are conservative except for the very small buffer region. The analysis becomes

more conservative as load increases. Table 5.4 gives the numerical values of the decay

rates of the tails obtained by simulation and analysis in Fig.5.5. They are very close

to each other, implying that this technique is asymptotically accurate. We also give


Table 5.2: Source Parameters used in Figures 5.5, 5.6, and 5.7

- α β λ

class 1 0.4 1.0 1.00

class 2 0.4 1.0 1.20

class 3 1.0 1.0 0.61

Table 5.3: GPS Weights of the Classes in Figures 5.5, 5.6, and 5.7

- c ω1 ω2 ω3 K load

Fig. 5.5 15.1 0.33 0.40 0.27 10 61.8%

Fig. 5.6 41.4 0.32 0.38 0.30 30 67.7%

Fig. 5.7 120.1 0.33 0.40 0.27 100 77.7%

the slopes for class 3 in Table 5.5. The analytical decay rates are very close to the simulated ones in all cases.

Table 5.4: Slopes of G(x) in Fig.5.5

- Simulation Analysis

Class 1 -0.8936 -0.7900

Class 2 -0.6839 -0.6479

Class 3 -1.9886 -1.9183

Although the analytical decay rates match the simulations very well, there

is still some gap between the analytical and simulated tail curves. This is caused

by the approximation error in output characterization and the LZT service bound.

An accurate output characterization requires a low buffer non-empty probability, i.e.,

low traffic load. However, even for a load as high as 77.7%, the tail distributions

are accurate enough for engineering purposes. Better techniques, which can improve

the pre-factor L or use multiple-term approximation of the tails [111], can make the


(Plot: tail distribution versus buffer size on a logarithmic scale; analysis curves for the three classes with 95% simulation confidence intervals.)

Figure 5.5: Tail distributions of the three classes, each with 10 on-off sources and

c = 15.1.

results more accurate.

For the tail distribution of class 1 in Fig. 5.5, both the rate and generator matrices used for solving the system in Fig. 5.3 directly are of size 616×616. It takes several minutes to compute the dominant eigenvalue using a computer with a Pentium III 450 MHz CPU. With our technique, we instead process three much smaller matrices, of sizes 11×11, 7×7, and 8×8, respectively (note that 11 × 7 × 8 = 616). Consequently, it takes far less time to get the tails. Table 5.5 gives the time in seconds used to compute the tail distri-

Table 5.5: Slope of G(x) for Class 3 and the Time Used to Compute the Tails in Figures 5.5, 5.6, and 5.7.

- Fig. 5.5 Fig. 5.6 Fig. 5.7

Analysis -1.918 -1.622 -0.625

Simulation -1.988 -1.916 -0.681

Time (seconds) 0.691 2.093 8.157


(Plot: tail distribution versus buffer size on a logarithmic scale; analysis curves for the three classes with 95% simulation confidence intervals.)

Figure 5.6: Tail distributions of the three classes, each with 30 on-off sources and

c = 41.4.

butions of class 3 for larger systems. Although the system consists of as many as 300 sources, the algorithm finishes the computation in a reasonable amount of time.

5.5 A Tighter Service Bound

5.5.1 The LMP Bound

As the numerical results in the previous section show, the LZT bound and our analysis

can accurately capture the asymptotic decay rates of the tail distributions when the

number of classes is small. However, as the number of classes increases, the LZT

bound is expected to be loose, since it only captures the GPS service dynamics where

a single queue is empty. In this section, we present a lower bound for the service that

a class receives in the GPS system, denoted as the LMP bound, which results in a

more accurate analysis than the LZT bound.

As illustrated in Fig. 5.8, it takes two steps to derive the LMP bound for the

tagged class i. First, the departure process of each class, r^o_j(t), j ≠ i, is approximated


(Plot: tail distribution versus buffer size on a logarithmic scale; analysis curves for the three classes with 95% simulation confidence intervals.)

Figure 5.7: Tail distributions of the three classes, each with 100 on-off sources and

c = 120.1.

assuming the service rate to be g_j. The generator matrix of r^o_j(t) is M^o_j. The state space of the departure process, S^o_j, consists of several underload states and one overload state with rate g_j, if g_j is less than the peak rate; otherwise the departure process is identical to r_j(t). Second, let B(t) denote the set of backlogged classes and B̄(t) the set of non-backlogged classes in the decomposed system at time t. Then class i's service rate, s̃_i(t), is the guaranteed service rate g_i increased by its share, determined by the GPS weights of the backlogged classes, of the residual service rate from all the non-backlogged classes, as shown below:

\tilde{s}_i(t) \triangleq g_i + \frac{\omega_i}{\omega_i + \sum_{k \in B(t)} \omega_k} \sum_{j \in \bar{B}(t)} \bigl(g_j - r^o_j(t)\bigr).   (5.19)

Logical queue i's occupancy distribution can be derived by applying this service bound in a system with input r_i(t) and modulated service s̃_i(t) (both are MMFPs).
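To make the bookkeeping in (5.19) explicit, a minimal sketch follows; the function and argument names are hypothetical, and the set passed as backlogged is B(t), the decomposed queues j ≠ i that currently have a backlog.

    def lmp_service_rate(i, g, omega, backlogged, r_out):
        """Evaluate the LMP service bound (5.19) for the tagged class i (sketch).

        g, omega, r_out: per-class guaranteed rates, GPS weights, and current
        departure rates of the decomposed queues (indexed 0..N-1).
        backlogged: set of decomposed-queue indices j != i that are backlogged.
        """
        residual = sum(g[j] - r_out[j] for j in range(len(g))
                       if j != i and j not in backlogged)   # non-backlogged classes
        share = omega[i] / (omega[i] + sum(omega[k] for k in backlogged))
        return g[i] + share * residual

    # Example: three classes, class 0 tagged, class 1 backlogged, class 2 idle.
    print(lmp_service_rate(0, g=[1.0, 1.2, 0.8], omega=[0.4, 0.3, 0.3],
                           backlogged={1}, r_out=[0.0, 1.2, 0.3]))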

Theorem 5.2: Let the service class i receives in the original GPS system be s_i(t), let s̃_i(t) denote the LMP service bound defined in (5.19), and let ŝ_i(t) denote the LZT service bound.


(Diagram: the original GPS system with inputs r_1(t), ..., r_N(t) served at rate c under GPS control is decomposed; each class j ≠ i is isolated into a queue served at its guaranteed rate g_j, and the tagged class i is served at the modulated rate s̃_i(t).)

Figure 5.8: The queue decomposition technique.

Then the following holds for all i ∈ {1, ..., N} and all t:

\tilde{s}_i(t) \leq s_i(t),   (5.20)

\tilde{s}_i(t) \geq \hat{s}_i(t).   (5.21)

Proof: Let B_r(t) and B̄_r(t) be the backlogged set and the non-backlogged set in the original GPS system, respectively.

(a) If the tagged class i is non-backlogged, then in all three systems, i.e., the original GPS system, the equivalent system using the LZT bound, and the equivalent system using the LMP bound in Fig. 5.8, the service rate class i receives equals its input rate. In other words, s̃_i(t) = s_i(t) and ŝ_i(t) = s_i(t).

(b) If the tagged class i is backlogged, consider a decomposed queue j in Fig. 5.8. Its service rate is the guaranteed minimum service rate g_j and its input process is r_j(t), which is the same as in the original GPS system. Therefore, the backlog in the decomposed queue is always greater than or equal to that of the corresponding logical queue in the original GPS system. That is, B(t) ⊇ B_r(t) and B̄(t) ⊆ B̄_r(t). Therefore, the summation in (5.19) is a lower bound on that in the original GPS system, and the denominator in (5.19) is an upper bound on that in the original GPS system. Therefore, s̃_i(t) ≤ s_i(t).

(c) When a decomposed queue j in Fig. 5.8 is backlogged, its departure rate is g_j. Therefore, \sum_{j \neq i} (g_j - r^o_j(t)) = \sum_{j \in \bar{B}(t)} (g_j - r^o_j(t)). In addition, \sum_{k} \omega_k \geq \omega_i + \sum_{k \in B(t)} \omega_k. These give s̃_i(t) ≥ ŝ_i(t).


5.5.2 Numerical Investigations

Consider a 6-queue GPS system, where each class has an on-off source. The source

parameters are given in Table 5.6. The service rate is 3.01. In Fig. 5.9 we compare

the LMP bound with the LZT bound, as well as with the simulation results. It

can be observed that the tail distributions derived by both bounds upper bound the

simulation results. With this 6-queue GPS system, the LZT bound becomes looser, while the LMP bound remains quite accurate.

Table 5.6: Source Parameters Used in Fig. 5.9

- α β λ ω

class 1 to class 5 0.25 1.0 0.75 1/6

class 6 2.33 1.0 1.07 1/6

(Plot: overflow probability of the queues versus buffer size (× 10 kbytes), on a logarithmic scale; LMP and LZT analysis curves for classes 1 and 6, together with simulation results and 95% confidence intervals.)

Figure 5.9: Comparative results of the LMP bound, the LZT bound, and simulations

in the case of 6 classes. The tail distributions of the logical queues are plotted.

Next, we focus on bandwidth sharing using GPS in admission control, to


quantify the improvement in the number of admissible sources when bandwidth is

shared among multiple classes using GPS. The three types of sources used in the

experiment are given in Table 5.7, while the service rate is 40.1. The average active

and silent states of the voice and compressed voice sources are drawn from ITU-T

standard models [113] modified to include the UDP, IP, and RTP headers for VoIP. For

the FTP source, we matched the moments of the FTP sources from [114]. Although

we admittedly do not try to scale our MMFP sources to capture the exact long term

correlations of data traffic, we are able to derive some general trends with a simple

parameter matching.

Table 5.7: Source Parameters Used in Fig. 5.10 and Fig. 5.11

- α β λ loss rate

Voice 0.6313 1.0 5.54 1e-2

Compressed Voice 0.6313 1.0 1.0 1e-3

FTP Data 0.0267 1.0 127.2 1e-5

In Fig. 5.10, we plot the admissible region for the three classes using a seg-

regated bandwidth allocation, i.e., each class has a reserved bandwidth and there is

no server sharing. The improvement in the number of class 3 sources using the LMP

bound over the segregated system is plotted in Fig. 5.11. The gain corresponds to

as many as 10 voice, 10 compressed voice, and 25 FTP sources. However, the gains achieved by the LMP bound come at the cost of higher computational complexity than the LZT bound. The state reduction techniques introduced in [115] may,

however, be applied to reduce the computational complexity.
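As an illustration of how such admissible regions can be traced, the sketch below sweeps the number of class 3 (FTP) sources for a given voice and compressed-voice population, calling a hypothetical predicate admissible(n1, n2, n3) that wraps the test of Table 5.1 for the given source mix.

    def max_class3_sources(n1, n2, n3_cap, admissible):
        """Largest admissible number of class 3 (FTP) sources for a given
        number of voice (n1) and compressed-voice (n2) sources (sketch).

        admissible(n1, n2, n3) is an assumed predicate implementing Table 5.1.
        Returns -1 if even n3 = 0 is inadmissible.
        """
        best = -1
        for n3 in range(n3_cap + 1):
            if admissible(n1, n2, n3):
                best = n3
        return best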

5.6 Matrix Analytic Methods for GPS Analysis

In [120], Ramaswami applied matrix analytic methods to fluid flow analysis. By

appealing to the skip-free nature of fluid level fluctuations and time reversibility

theory, the computation of the steady-state distribution of a Markov fluid FIFO queue is reduced to the analysis of a discrete-time, discrete-state-space quasi-birth-death (QBD) model.

Figure 5.10: The admissible region using a segregated bandwidth allocation.

The QBD models are well studied and many computational

algorithms are available in the literature. An excellent survey can be found in [118].

In this section, we first review the main results in [120], and then we present three

useful corollaries.

5.6.1 Matrix Analytic Methods for Fluid Flow Analysis

Suppose the fluid flow source is governed by a continuous time, irreducible Markov

process with state space {1, · · ·, m, m+1, · · ·, m+n} and infinitesimal generator M.

The net rate of input to the infinite buffer is assumed to be di > 0 when the Markov

chain is in state i, i ≤ m, and dj < 0 when the system is in state j, j > m.

Let π = [π_1, π_2] be the steady-state probability vector of the process, where π_1 is of order m × 1 and π_2 is of order n × 1. Define Δ = diag(π), M̃ = Δ^{-1} Mᵀ Δ, and D = diag(d_1, d_2, ..., d_{m+n}). Also let S = D^{-1} M̃, and partition S into four submatrices,

S = \begin{bmatrix} S_{11} & S_{12} \\ S_{21} & S_{22} \end{bmatrix},

where S_{11} is of order m × m and S_{22} is of order n × n. Then, we present the following two theorems from [120].

Figure 5.11: Gains in the number of admissible class 3 sources using the LMP bound over the segregated system.

Theorem 5.3 [120]: The stationary distribution of the fluid flow is phase type with representation PH(Λ, U) of order m, where Λ = π_1 + π_2 W. The tail probability of the queue distribution is given by

G(x) = \Lambda e^{Ux} \mathbf{1}, \quad \text{for } x \geq 0,   (5.22)

where

U = S_{11} + S_{12} \int_0^{\infty} e^{S_{22} y} S_{21} e^{U y} \, dy   (5.23)

and

W = \int_0^{\infty} e^{S_{22} y} S_{21} e^{U y} \, dy.   (5.24)
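Once U and W have been obtained (for example, with the iteration of Section 5.6.2 below), evaluating (5.22) is direct. A minimal sketch, assuming pi1, pi2, U, and W are given as numpy arrays and using scipy's matrix exponential:

    import numpy as np
    from scipy.linalg import expm

    def fluid_tail(x, pi1, pi2, U, W):
        """Evaluate G(x) = Lambda * exp(U x) * 1 of (5.22) (sketch).

        pi1, pi2 : stationary probabilities of the overload and underload states.
        U, W     : the matrices of (5.23) and (5.24).
        """
        lam = pi1 + pi2 @ W                     # Lambda = pi_1 + pi_2 W
        return float(lam @ expm(U * x) @ np.ones(U.shape[0]))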

Choose a number θ ≥ max_i(−S_{ii}), and let T = θ^{-1}U + I, P_{ii} = θ^{-1}S_{ii} + I, and P_{ij} = θ^{-1}S_{ij} for i ≠ j. We can define the following matrices:

A_2 = \begin{bmatrix} P_{11} & 0 \\ \tfrac{1}{2}P_{21} & 0 \end{bmatrix}, \quad A_1 = \begin{bmatrix} 0 & P_{12} \\ 0 & \tfrac{1}{2}P_{22} \end{bmatrix}, \quad A_0 = \begin{bmatrix} 0 & 0 \\ 0 & \tfrac{1}{2}I \end{bmatrix}.   (5.25)

Theorem 5.4 [120]: The rate matrix of the QBD defined by (5.25) is given by

G = \begin{bmatrix} T & 0 \\ W & 0 \end{bmatrix}.   (5.26)

The proofs of the above two theorems can be found in [120].

5.6.2 Computing the Rate Matrix

The G-matrix of the QBD defined by (5.25) is the solution of a non-linear matrix

equation, which can be found by successive substitutions using [117][118],

G_n = \left(I - A_1 - A_0 G_{n-1}\right)^{-1} A_2.   (5.27)

Note that the matrices in (5.25) are quite sparse. By further exploiting this fact and the special structure of the G matrix, an iteration scheme for computing T and W can be designed as follows:

W = (1/2) (I − (1/2) P_{22})^{-1} P_{21};
do
    W_old = W;
    W = (1/2) (I − (1/2) P_{22} − (1/2) W P_{12})^{-1} (W P_{11} + P_{21});
until ||W − W_old||_∞ < ε;
T = P_{11} + P_{12} W;
output T and W.
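A minimal numpy rendering of this iteration (a sketch, assuming the blocks P_{11}, P_{12}, P_{21}, P_{22} have already been formed from S and θ as defined above):

    import numpy as np

    def compute_T_W(P11, P12, P21, P22, eps=1e-10, max_iter=100000):
        """Iterate W to convergence, then form T, per the scheme above (sketch)."""
        n = P22.shape[0]
        I_n = np.eye(n)
        # initial guess: W = 1/2 (I - 1/2 P22)^{-1} P21
        W = 0.5 * np.linalg.solve(I_n - 0.5 * P22, P21)
        for _ in range(max_iter):
            W_old = W
            # W = 1/2 (I - 1/2 P22 - 1/2 W P12)^{-1} (W P11 + P21)
            W = 0.5 * np.linalg.solve(I_n - 0.5 * P22 - 0.5 * (W @ P12),
                                      W @ P11 + P21)
            if np.linalg.norm(W - W_old, np.inf) < eps:
                break
        T = P11 + P12 @ W
        return T, W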

This iteration scheme has the following convergence property:

Corollary 5.5: The iterative scheme proposed above converges linearly.

Proof: Plugging (5.25) and (5.26) into (5.27), we have

\begin{bmatrix} T & 0 \\ W & 0 \end{bmatrix}
= \begin{bmatrix} I & -P_{12} \\ -\tfrac{1}{2}W & I - \tfrac{1}{2}P_{22} \end{bmatrix}^{-1}
\begin{bmatrix} P_{11} & 0 \\ \tfrac{1}{2}P_{21} & 0 \end{bmatrix}
= \begin{bmatrix} P_{11} + P_{12}W & 0 \\ \tfrac{1}{2}\bigl(I - \tfrac{1}{2}P_{22} - \tfrac{1}{2}WP_{12}\bigr)^{-1}(WP_{11} + P_{21}) & 0 \end{bmatrix},

which gives T = P_{11} + P_{12}W and W = \tfrac{1}{2}\bigl(I - \tfrac{1}{2}P_{22} - \tfrac{1}{2}WP_{12}\bigr)^{-1}(WP_{11} + P_{21}).

Thus this algorithm is essentially the same as the algorithm based on (5.27) in [118]

and they have the same linear convergence characteristics.

The matrices used in this scheme are of order m or n. Recall that m is the number of overload states and n is the number of underload states. This algorithm thus works with smaller matrices rather than directly calculating G from the A_i's.

Our experiments show that, for typical cases, this scheme requires the same number

of iterations as the linear convergence algorithm in [118] but is about 10 times faster.

5.6.3 Caudal Characteristics of Fluid Queues

It is well known that if a QBD is positive recurrent then its steady-state probability

vector { π0, π1, π2, · · ·} has a matrix geometric form and decays geometrically with

rate η. Thus η essentially describes the tail behavior of the model, and is called the caudal characteristic factor of the QBD [123]. It is closely related to the asymptotic decay rate, or the effective bandwidth, in large deviation theory [108][112]. Here we define the caudal characteristic factor of fluid queues.

Corollary 5.6: The caudal characteristic factor of the fluid queue defined in Section 5.6.2, η, is

η = MRE(U) (5.28)

= θ(MRE(T) − 1) (5.29)

= θ(MRE(G) − 1). (5.30)

Proof: Let ψ_i and ξ_i be the normalized left and right eigenvectors of U associated with the eigenvalues σ_i, i = 1, ..., m. Then from Theorem 5.3,

G(x) = \Lambda \, [\,\xi_1 \cdots \xi_m\,] \, \mathrm{diag}\!\left(e^{\sigma_1 x}, \ldots, e^{\sigma_m x}\right) \begin{bmatrix} \psi_1 \\ \vdots \\ \psi_m \end{bmatrix} \mathbf{1}
= \sum_{i=1}^{m} \langle \Lambda, \xi_i \rangle \, e^{\sigma_i x} \, \langle \psi_i, \mathbf{1} \rangle.

Suppose η = max_i{σ_i} and that η is the kth eigenvalue of U. When x is large, the η term dominates:

G(x) \approx \langle \Lambda, \xi_k \rangle \, e^{\eta x} \, \langle \psi_k, \mathbf{1} \rangle.

Thus η determines the tail behavior of the fluid queue for large buffers. From (5.26), it is obvious that T has the same eigenvalues as G. From the definition of T, the relationship between its eigenvalues and those of U can be derived.

For the discrete-time, discrete-state-space QBD with rate matrix G, algorithms are proposed in [123] to compute the caudal characteristic factor; these are most useful when the order of G is so large that an exact computation of G, and therefore of the exact tail distribution, is not feasible. If the phase space is decomposable, fast algorithms are provided in [123].

Corollary 5.7: Let η be the kth eigenvalue of U, and let ψ_k and ξ_k denote the kth left and right eigenvectors of U. When x is large, the tail distribution of the fluid queue can be approximated by

G(x) \approx \langle \Lambda, \xi_k \rangle \, \langle \psi_k, \mathbf{1} \rangle \, e^{\eta x}.   (5.31)

Proof : See the proof of Corollary 5.6.

Efficient methods for computing the dominant eigenpair of a matrix are available in the matrix-theory literature. Eq. (5.31) can be used in cases where a fast approximation is needed.
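A sketch of how (5.31) can be evaluated: obtain the dominant eigenpair of U with a standard eigensolver (a power-type iteration could equally be used for very large U), normalize the left and right eigenvectors so that ⟨ψ_k, ξ_k⟩ = 1 (the normalization assumed here), and form the single-term estimate.

    import numpy as np
    from scipy.linalg import eig

    def caudal_tail_approx(x, lam, U):
        """Single-term tail estimate of (5.31) (sketch).

        lam : the vector Lambda of Theorem 5.3;  U : the matrix of (5.23).
        Returns (eta, approximate G(x)), where eta = MRE(U), the dominant
        (largest real part) eigenvalue of U.
        """
        w, vl, vr = eig(U, left=True, right=True)
        k = np.argmax(w.real)               # dominant eigenvalue index
        eta = w[k].real
        xi = vr[:, k].real                  # right eigenvector xi_k
        psi = vl[:, k].real                 # left eigenvector psi_k
        psi = psi / (psi @ xi)              # assumed normalization <psi_k, xi_k> = 1
        return eta, float((lam @ xi) * psi.sum() * np.exp(eta * x))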

5.6.4 Numerical Investigations

In this section we present some of the numerical results on GPS analysis using the

Matrix Analytic Method, which illustrate the quantitative and qualitative aspects of


the technique. In the experiments we performed, we use classes of sources where each

class is the aggregation of a number of on-off sources. The source parameters are

given in Table 5.8. The analysis results are compared with fluid simulation with 95%

confidence intervals to verify their correctness.

Table 5.8: On-off Source Parameters Used in Figures 5.12, 5.13, and 5.14.

- Type 1 Type 2 Type 3 Type 4

α 0.40 0.40 1.00 0.56

β 1.00 1.00 1.00 0.83

λ 1.00 1.20 0.61 0.80

Consider a 3-class GPS system where Class 1 has 10 Type 1 sources; class

2 has 10 Type 2 sources; and class 3 has 10 Type 3 sources. The service rate is

c = 15.1 and the GPS weights are ω1 = 0.33, ω2 = 0.40, and ω3 = 0.27, respectively.

The system load is about 61%, which is not atypical in traffic engineering. Fig.5.12

shows the tail distributions of the classes. It can be seen that under this load our

service bound (5.19) gives a good approximation. All three analysis curves match

simulations for the whole buffer region. We also plot the analytic results for the tail

distributions of the classes using the LZT bound for comparison purposes. The LMP bound results match the simulation more closely than those of the LZT bound. As an example of the reduced matrix sizes obtained with this approach, for class 1 the derived equivalent system in Fig. 5.4 has 616 states, while the U matrix of this system is of size 100×100.

To further illustrate the effectiveness of the technique, we study a 3-queue

GPS system with a video class and two voice classes. The video model is from [104],

in which a four-state Discrete-time Markov Modulated Poisson Process (DMMPP) is

used to model video traffic. A video source has transition matrix M and rate vector R as given below, where ᾱ_i = 1 − α_i and β̄_i = 1 − β_i in M. The parameters are

matched from the MPEG-1 Star Wars movie in [104], and are given in Table 5.9. We

use its fluid equivalent in the analysis and simulations. A voice source is modeled


(Plot: tail distribution versus queue occupancy on a logarithmic scale; LMP and LZT analysis curves for classes 1-3 with 95% simulation confidence intervals.)

Figure 5.12: Tail distributions of a 3-queue GPS system.

using parameters of Type 4 in Table 5.8. Class 1 consists of 2 video sources, while

class 2 and class 3 are identical, consisting of 20 voice sources each. The GPS weights

for the three classes are 0.54, 0.23, and 0.23, respectively, and the service rate is 306.6.

Fig.5.13 plots the tail distributions of the classes. Again, the analysis results match

the simulation results closely for the whole buffer region. We also plot the tails of

the classes using (5.31). The approximations almost overlap with the exact analysis

except in the small-buffer region. The equivalent system derived for class 1 has 1600 states, while the U matrix used in the Matrix Analytic Method analysis is of size 14×14.

M = \begin{bmatrix}
\bar{\alpha}_1\bar{\alpha}_2 & \bar{\alpha}_1\alpha_2 & \alpha_1\bar{\alpha}_2 & \alpha_1\alpha_2 \\
\bar{\alpha}_1\beta_2 & \bar{\alpha}_1\bar{\beta}_2 & \alpha_1\beta_2 & \alpha_1\bar{\beta}_2 \\
\beta_1\bar{\alpha}_2 & \beta_1\alpha_2 & \bar{\beta}_1\bar{\alpha}_2 & \bar{\beta}_1\alpha_2 \\
\beta_1\beta_2 & \beta_1\bar{\beta}_2 & \bar{\beta}_1\beta_2 & \bar{\beta}_1\bar{\beta}_2
\end{bmatrix}, \qquad
R = \begin{bmatrix} \lambda_1 & \lambda_2 & \lambda_3 & \lambda_4 \end{bmatrix}
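For reference, assuming the four-state chain is the product of two independent two-state chains (which is what the bar pattern above expresses), M can be assembled numerically from the Table 5.9 parameters as in the sketch below; the state ordering (off,off), (off,on), (on,off), (on,on) and the assignment of λ_1, ..., λ_4 to states simply follow the order in which they are listed, and are only one possible convention.

    import numpy as np

    # Table 5.9 parameters of the four-state DMMPP video model from [104]
    alpha1, alpha2 = 0.0018, 0.00064
    beta1, beta2 = 0.1568, 0.0234
    R = np.array([95.24, 58.90, 73.92, 37.58])   # per-state rates lambda_1..lambda_4

    # Two independent two-state chains (off -> on with prob alpha_i, on -> off with beta_i)
    P1 = np.array([[1 - alpha1, alpha1], [beta1, 1 - beta1]])
    P2 = np.array([[1 - alpha2, alpha2], [beta2, 1 - beta2]])

    # Joint four-state transition matrix as a Kronecker product (assumed structure)
    M = np.kron(P1, P2)
    print(M.round(6))
    print(R)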

Next we examine the caudal characteristic factor (η defined in (5.28)) of the classes with a GPS server. In Fig. 5.14, we plot the caudal characteristic curve

of class 1 in a 3-class GPS system for four different GPS weights. In the 3-class


(Plot: tail distribution versus queue occupancy on a logarithmic scale; exact analysis and dominant-eigenvalue approximations for the video and voice classes, with 95% simulation confidence intervals.)

Figure 5.13: Tail distributions of a 3-queue GPS system, where class 1 has two video

sources and class 2 and 3 have 20 voice sources each.

Table 5.9: Video Source Parameters Used in Fig. 5.13

α1 α2 β1 β2 λ1 λ2 λ3 λ4

0.0018 0.00064 0.1568 0.0234 95.24 58.9 73.92 37.58

GPS system, class 1 has two Type 1 sources, class 2 has two Type 2 sources, and

class 3 has two Type 3 sources. In Fig. 5.14, we increase ω_1 from 1/5 to 1/2, while ω_2 = ω_3 = (1 − ω_1)/2. The system load is varied from 0.35 to 1 by adjusting the

service rate. Different GPS weights yield different Caudal Characteristic curves. The

higher the GPS weight, the lower the η. This illustrates the separation and protection

features of GPS servers and allows for the selection of the right GPS weights to meet

QoS requirements.


(Plot: caudal characteristic factor η versus load ρ, for ω_1 = 1/5, 1/4, 1/3, and 1/2.)

Figure 5.14: Caudal Characteristics of class 1 queue in a three-queue GPS system

versus system load and with different GPS weights.

5.7 Summary

In this chapter, we focused on the network support for multimedia transport. Among

the many key elements in the QoS provisioning in a future network architecture, we

investigated scheduling and admission control, which are widely studied and applied

in the current Internet. More specifically, we chose the GPS scheduling discipline and

analyzed the queue occupancy distributions of the classes, each modeled as an MMFP.

We presented a simple and scalable analytical technique for determining the

tail distributions of MMFP sources in the GPS system, based on the LZT bound

introduced in [100]. The effective bandwidth of an MMFP class was derived and an admission control scheme was designed. Then, by observing that the LZT bound is

conservative as the number of classes increases, we proposed a tighter bound on the

service a class receives. This service bound, denoted as the LMP bound, captures

both the correlations among the classes and the dynamics of the GPS service sharing.


The LMP bound results in a more accurate analysis than the LZT bound, which

translates into better bandwidth efficiency. We also examined the computational aspects of GPS analysis by extending previous work on the Matrix Analytic Method to the GPS setting, and presented numerical studies illustrating the accuracy and efficiency of these techniques.


Chapter 6

Summary and Future Work

6.1 Summary

In the previous chapters, we proposed a general architecture for multimedia transport using multiple paths. This architecture consists of four key components: (1) a multistream encoder, which encodes an incoming multimedia data flow and generates multiple compressed substreams; (2) a multipath routing protocol, which maintains multiple paths between a source and a destination; (3) a traffic allocator, which performs traffic partitioning and allocates packets to the multiple paths; and (4) a resequencing buffer, which reorders the received packets. A key to the success of the proposed

architecture is the close interaction between these essential components, which entails

careful cross-layer design.

We further investigated the components of this architecture in various settings. First, we studied the problem of enabling video transport over wireless mobile

ad hoc networks in Chapter 2. This is a very difficult problem because ad hoc paths

are ephemeral and video quality is susceptible to transmission losses. We chose three

representative video coding schemes (all based on the MCP technique used in all the

modern video coding standards), and showed how to adapt these schemes with MPT.

We studied the performance of the three proposed schemes using Markov models and OPNET simulations. Our results show that the use of multiple paths provides a power-

ful means of combating transmission errors. In addition, multistream coding provides


a novel means of traffic partitioning, where redundancy can be introduced in an ef-

ficient and well controlled manner. To further validate the feasibility, as well as to

demonstrate the benefits, of using these proposed schemes, we implemented an ad

hoc multipath video streaming testbed. We performed extensive experiments. The

testbed results show that video transport over ad hoc networks is viable for both the

LC with ARQ and MDMC schemes in the settings we examined.

In Chapter 3, we presented an analytical framework for the remaining two key components in the proposed architecture, i.e., the traffic allocator and the rese-

quencing buffer. We modeled the paths as a combination of a queue with a constant

service rate and a fixed delay line. We also assumed that the multimedia flow is regulated by a leaky bucket and is partitioned with a deterministic splitting scheme. With this framework, we formulated a constrained optimization problem and derived its closed-form solution. Our results apply to the general multiple-path case, and provide an easy

means for path selection.

In Chapter 4, we presented a new application layer protocol to support

multimedia transport using multiple paths. The proposed MRTP/MRTCP protocol

is a natural extension of the RTP/RTCP protocol to multiple paths, and is com-

plementary to SCTP in its support of multimedia services. We also presented two

performance studies of the proposed protocol. First, we studied the effect of traf-

fic partitioning on the queueing performance of the multimedia flows. The results

show that traffic partitioning can effectively reduce the short term autocorrelations

of the flows, thus improving their queueing performance. We also compared MRTP

with RTP by simulating a wireless mobile ad hoc network with a video session using

OPNET. MRTP outperforms RTP in all the cases we examined.

The preceding chapters focus on the end-to-end support for multimedia

transport, assuming a best-effort IP network. In Chapter 5, we analyzed a network

node (e.g., a router or a switch) using GPS scheduling to provide QoS guarantees

for multimedia flows. We analyzed a multiple class GPS system, where each class

is modeled as an MMFP. We first derived the effective bandwidth of an MMFP class

in the GPS system, and then designed an admission control scheme based on the

analysis. We also presented a tight service bound that decouples the GPS system.


Numerical results show that this new bound is tighter than previous work, resulting

in a higher bandwidth utilization. Finally, we extended the previous work on Matrix

Analytic Methods for stochastic flows to GPS analysis.

6.2 Future Work

For the results in Chapter 2, we should note that further improvements could be

made for each component in the proposed framework. For example,

• The video codec parameters could be further tuned and optimized in the rate

distortion sense, given the path conditions.

• Packets from all the substreams could be dispersed to the multiple paths with

a more sophisticated algorithm to maximize the benefit of MPT.

• The ad hoc networks simulated with the Markov and OPNET models, as well as the testbed, are relatively small. It would be very interesting to see how the

performance scales in a larger ad hoc network.

These are still open research problems that are worth investigating in future work.

We have implemented MDSR on the Microsoft Windows platform. Further experi-

ments with the dynamic routing protocol and a larger ad hoc network would be very

interesting.

Our analytical framework for optimal traffic partitioning can be extended in the following ways:

• Replace deterministic queueing analysis with probabilistic queueing analysis to

achieve higher bandwidth utilization, and

• incorporate a loss component to model the wireless network paths.

We are currently working on these two directions.

A working implementation of the proposed protocol, e.g., an MRTP/MRTCP testbed, would be useful in validating its pros and cons. Furthermore, although


multipath transport has inherent security strength (since it would be difficult for an

attacker to track all the flows in use and to determine how the traffic is partitioned),

security considerations are not yet the focus of our design. Besides the randomly

generated session/flow IDs and initial sequence numbers, security can be strengthened

by introducing some randomness in data partitioning. In addition, we are working

on an Internet Draft on MRTP/MRTCP for the IETF.

The GPS analysis presented in Chapter 5 assumes an infinite buffer. How-

ever, if the buffer size is finite (which is true for all practical routers), we observed

that the GPS system has the same performance as an FCFS server if no buffer man-

agement scheme is used. We are working on the analysis of a GPS system with a

complete sharing virtual partitioning buffer management policy.


Bibliography

[1] IEEE 802.11, Draft Standard for Wireless LAN: Medium Access Control (MAC)

and Physical Layer (PHY) Specifications IEEE, July 1996.

[2] C. R. Lin and M. Gerla, “Asynchronous multimedia multihop wireless networks,”

in Proc. IEEE/ACM INFOCOM, pp.118-125, Kobe, Japan, Apr. 1997.

[3] Y. Wang, S. Wenger, J. Wen, and A. K. Katsaggelos, “Error resilient video coding

techniques,” IEEE Signal Processing Mag., vol.17, pp.61-82, July, 2000.

[4] Y. Wang and Q.-F. Zhu, “Error control and concealment for video communication:

a review,” in Proc. IEEE, vol.86, issue 5, pp.974-997, May 1998.

[5] N. Gogate, D. Chung, S. S. Panwar, Y. Wang, “Supporting image/video applica-

tions in a multihop radio environment using route diversity and multiple descrip-

tion coding,” IEEE Trans. Circuit Syst. Video Technol., vol.12, no.9, pp.777-792,

Sept. 2002.

[6] S. Lin, S. Mao, Y. Wang, and S. S. Panwar, “A reference picture selection scheme

for video transmission over ad hoc networks using multiple paths,” in Proc. IEEE

ICME, Tokyo, Japan, Aug. 2001.

[7] S. Mao, S. Lin, S. S. Panwar, and Y. Wang, “Reliable transmission of video over

ad hoc networks using Automatic Repeat Request and multipath transport,” in

Proc. IEEE Fall VTC, pp.615-619, Atlantic City, NJ, Oct. 2001.

[8] S. Mao, S. Lin, S. S. Panwar, and Y. Wang, “Video transport over ad hoc networks:

Multistream coding with multipath transport,” IEEE Journal on Selected Areas

in Communications, Special Issue on Recent Advances in Wireless Multimedia.

vol.21, no.10, December 2003.


[9] Y. Wang and S. Lin, “Error resilient video coding using multiple description

motion compensation,” IEEE Trans. Circuits Syst. Video Technol., vol.12, no.6,

pp.438-452, 2002.

[10] E. N. Gilbert, “Capacity of a bursty-noise channel,” Bell Syst. Tech. J., vol.39,

no.9, pp.1253-1265, Sept. 1960.

[11] OPNET Tech., Inc. OPNET Modeler. Homepage: http://www.mil3.com.

[12] S. Wenger, “Video redundancy coding in H.263+,” in Proc. Workshop Audio-

Visual Services for Packet Networks, Aberdeen, Scotland, Sept. 1997.

[13] M. Khansari, A. Jalali, E. Dubois, and P. Mermelstein, “Low bit-rate video

transmission over fading channels for wireless microcellular systems,” IEEE Trans.

Circuit Syst. Video Technol., vol.6, no.1, pp.1-11, Feb. 1996.

[14] H. Schulzrinne, S. Casner, R. Frederick, and V. Jacobson, “RTP: A transport

protocol for realtime applications,” IETF Request For Comments 1889. [Online].

Available: http://www.ietf.org.

[15] V. K. Goyal, “Multiple Description Coding: compression meets the network,”

IEEE Signal Processing Mag., vol.18, pp.74-93, Sept. 2001.

[16] A. Reibman, H. Jafarkhani, Y. Wang, M. Orchard, and R. Puri, “Multiple de-

scription coding for video using motion compensated prediction,” in Proc. IEEE

ICIP, pp.837-841, Kobe, Japan, Oct. 1999.

[17] J. G. Apostolopoulos, “Error-resilient video compression through the use of mul-

tiple states,” in Proc. IEEE ICIP, pp.352-355, Vancouver, BC, Canada, Sept.

2000.

[18] de M. Cordeiro, et al, “Establishing a trade-off between unicast and multi-

cast retransmission modes for reliable multicast protocols” in Proc. Inter. Symp.

Modeling, Analysis and Simulation of Computer and Telecommunication Systems,

pp.85-91, 2000.


[19] N. F. Maxemchuk, “Diversity routing,” in Proc. IEEE ICC, vol.1, pp.10-41, San

Francisco, CA, June 1975.

[20] T. Wu and R. Lau, “A class of self-healing ring architectures for SONET network

applications,” IEEE Trans. Commun., vol.40, no.11, pp.1746-1756, Nov. 1992.

[21] R. R. Stewart and Q. Xie, Stream Control Transmission Protocol: A Reference

Guide. Reading, MA: Addison-Wesley, 2001.

[22] D. Sidhu, R. Nair, and S. Abdallah, “Finding disjoint paths in networks,” in

Proc. ACM SIGCOMM, pp.43-51, Zurich, Switzerland, Sept. 1991.

[23] S.-J. Lee and M. Gerla, “Split multipath routing with maximally disjoint paths

in ad hoc networks,” in Proc. IEEE ICC, pp.3201-3205, Helsinki, Finland, June

2001.

[24] P. Papadimitratos, Z. J. Haas, and E. G Sirer, “Path set selection in mobile ad

hoc networks,” in Proc. ACM MOBIHOC, pp.1-11, Lausanne, Switzerland, June

2002.

[25] A. Nasipuri, R. Castaneda, and S. R. Das, “Performance of multipath routing

for on-demand protocols in mobile ad hoc networks,” Mobile Networks and Ap-

plications, vol.6, no.4, pp.339-349, 2001.

[26] S. Nelakuditi and Z. Zhang, “On selection of paths for multipath routing,” Proc.

IEEE IWQoS, June, 2001.

[27] J. Moy, “OSPF version 2,” IETF Request For Comments 2328, April 1998.

[Online]. Available at: http://www.ietf.org.

[28] E. Gustafsson and G. Karlsson, “A literature survey on traffic dispersion,” IEEE

Network, pp.28-36, March/April 1997.

[29] D. B. Johnson, D. A. Maltz, Y.-C. Hu, and J. G. Jetcheva, The Dynamic Source

Routing Protocol for Mobile Ad Hoc Networks , IETF Internet Draft (draft-ietf-

manet-dsr-03.txt), Oct., 1999 (work in progress).


[30] C. E. Perkins, E. M. Royer, and S. R. Das, Ad hoc on-demand distance vector

(AODV) routing, IETF Internet Draft (draft-ietf-manet-aodv-06.txt), July 2000

(work in progress).

[31] Z. J. Haas and M. R. Pearlman, The Zone Routing Protocol (ZRP) for ad hoc

networks, IETF Internet Draft (draft-ietf-manet-zone-zrp02.txt), June 1999 (work

in progress).

[32] M. Pearlman, Z. J. Hass, P. Sholander, and S. S. Tabrizi, “On the impact of

alternate path routing for load balancing in mobile Ad Hoc networks,” in Proc.

MobiHOC, pp.3-10, Aug. 2000.

[33] N. Gogate and S. S. Panwar, “On a resequencing model for high speed networks,”

Proc. IEEE INFOCOM, pp.40-47, June 1994.

[34] Y. Nebat and M. Sidi, “Resequencing considerations in parallel downloads,” in

Proc. IEEE INFOCOM, pp.1326-1336, June 2002.

[35] Gogate, N. and S. S. Panwar, “Supporting applications in a mobile multihop

radio environment using route diversity: I. Non-realtime data,” Proc. IEEE ICC,

pp.802-806, June 1998.

[36] D. S. Phatak and T. Goff, “A novel mechanism for data streaming across multiple

IP links for improving throughput and reliability in mobile environments,” in Proc.

IEEE INFOCOM, pp.773-782, June 2002.

[37] N. Gogate and S. S. Panwar, “Supporting video/image applications in a mobile

multihop radio environment using route diversity,” in Proc. IEEE ICC, pp.1701-

1706, June 1999.

[38] J. G. Apostolopoulos, “Reliable video communication over lossy packet networks

using multiple state encoding and path diversity,” in Proc. SPIE Conf. Visual

Commun. Image Processing, pp.392-409, Jan. 2001.


[39] J. G. Apostolopoulos, T. Wong, W. Tan, and S. Wee, “On Multiple Description

Streaming in Content Delivery Networks,” in Proc. IEEE INFOCOM, pp.1736-

1745, June 2002.

[40] T. Nguyen and A. Zakhor, “Path Diversity with Forward Error Correction (PDF)

system for packet switched networks,”, in Proc. IEEE INFOCOM, San Francisco,

CA, April 2003.

[41] Y. J. Liang, E. G. Steinbach, and B. Girod, “Multi-stream voice over IP using

packet path diversity,” in Proc. IEEE Multimedia Signal Processing Workshop,

pp.555-560, Sept. 2001.

[42] Y. J. Liang, E. Setton, and B. Girod, “Channel-adaptive video streaming using

packet path diversity and rate-distortion optimized reference picture selection,”

presented at the IEEE Multimedia Signal Processing Workshop, St. Thomas, US

Virgin Islands, Dec. 2002.

[43] V. Jacobson, “Congestion avoidance and control,” Computer Communication

Review, vol.19, no.4, pp.314-329, August 1988.

[44] D. A. Maltz, J. Broch, and D. B. Johnson, “Lessons from a full-scale multi-hop

wireless ad hoc network testbed,” in IEEE Pers. Commun., vol.8, issue 1, pp.8-15,

February 2001.

[45] H. Lundgren, et al, “A large-scale testbed for reproducible ad hoc protocol eval-

uations,” in Proc. IEEE WCNC, pp.412-418, 2002.

[46] W. Kellerer, E. Steinbach, P. Eisert, and B. Girod, “A real-time internet stream-

ing media testbed,”, in Proc. IEEE ICME, pp.453-456, Aug. 2002.

[47] W. R. Stevens, TCP/IP Illustrated, Volume 1: The Protocols. Reading, MA:

Addison-Wesley, 1994.

[48] ITU-T Recommendation H.263, Video coding for low bit rate communication,

1998.


[49] The University of British Columbia, The H.263+ Codec Implementation. [On-

line]. Available: ftp://dspftp.ece.ubc.ca.

[50] A. Nasipuri and S. R. Das, “On-demand multipath routing for mobile ad hoc

networks,” in Proc. ICCCN, pp.64-70, 1999.

[51] The National Institute of Standards and Technology, OPNET DSR Model. [On-

line]. Available: http://w3.antd.nist.gov/wctg/prd dsrfiles.html.

[52] T.K. Philips, S.S. Panwar, and A.N. Tantawi, “Connectivity properties of a

packet radio network model,” IEEE Trans. Inform. Theory, vol.35, pp.1044-1047,

September 1989.

[53] J. Broch, D. A. Maltz, D. B. Johnson, Y.-C. Hu, and J. Jetcheva, “A performance

comparison of multi-hop wireless ad hoc network routing protocols,” in Proc.

ACM/IEEE Inter. Conf. Mobile Comp. and Networking, pp.85-97, 1998.

[54] J. Yoon, M. Liu, and B. Noble, “Random waypoint considered harmful,” in Proc.

INFOCOM, San Francisco, CA, April, 2003.

[55] D. A. Pierre, Optimization Theory with Applications, New York: Dover Publica-

tions, Inc., 1986.

[56] D. Y. Eun and N. B. Shroff, “Simplification of network analysis in large-

bandwidth systems,” in Proc. of IEEE/ACM INFOCOM, vol.1, pp 597-607,

March 2003.

[57] K. Sayrafian-Pour, M. Alasti, A. Ephremides, and N. Farvardin, “The effects of

multiple routing on the end-to-end average distortion,” in Proc. ISIT, Washington,

D.C., June 2001.

[58] M. Alasti, K. Sayrafian-Pour, A. Ephremides, and N. Farvardin, “Multiple de-

scription coding in networks with congestion problem,” IEEE Transactions on

Information Theory, vol.47, no.3, March 2001.


[59] P. Decuetos and K.W. Ross, “Unified framework for optimal video streaming,”

to appear, IEEE/ACM INFOCOM 2004, Hong Kong, February 2004.

[60] P. Decuetos and K.W. Ross, “Optimal streaming of layered video: Joint schedul-

ing and error concealment,” ACM Multimedia 2003, San Francisco, November

2003.

[61] P. de Cuetos, M. Reisslein, K. W. Ross, “Evaluating the streaming of FGS-

encoded video with rate-distortion traces,” Institut Eurecom Technical Report RR-

03-078, June 2003.

[62] A. Tsirigos and Z. J. Haas, “Multipath routing in the presence of frequent topo-

logical changes,” IEEE Communications Magazine, vol.39, issue:11, Nov. 2001.

[63] Cisco IOS documentation, “Cisco IOS Configuration Fundamentals Configura-

tion Guide - Release 12.2,” [online]. Available: http://www.cisco.com.


[64] S. Mao, S. Lin, D. Bushmitch, S. Narayanan, S. S. Panwar, Y. Wang, and R.

Izmailov, “Real time transport with path diversity,” the 2nd NY Metro Area Net-

working Workshop, New York, September 2002.

[65] Y. J. Liang, E. G. Steinbach, and B. Girod, “Multi-stream voice over IP using

packet path diversity,” in Proc. IEEE Multimedia Signal Processing Workshop,

pp.555-560, Sept. 2001.

[66] E. M. Royer and C.-K. Toh, “A review of current routing protocols for ad hoc

mobile wireless networks,” IEEE Personal Communications, vol.6 issue.2, pp.46-

55, April 1999.

[67] J. Lee, “Parallel video servers: A tutorial,” IEEE Multimedia, vol.5, issue.2,

pp.20-28, April-June, 1998.


[68] M. Reisslein, K. Ross, and S. Shrestha, “Striping for interactive video: Is it

worth it?” in Proceedings of the IEEE International Conference on Multimedia

Computing and Systems, vol.2, pp.7-11, June, 1999.

[69] S. V. Anastasiadis, K. C. Sevcik, and M. Stumm, “Server-based smoothing of

variable bit-rate streams,” in Proceedings of the ACM International Conference

on Multimedia, pp.147-158, October, 2001.

[70] R. Chow, C. Lee, and J. Liu, “Traffic dispersion strategies for multimedia stream-

ing,” in Proceedings of the IEEE Workshop on Future Trends of Distributed Com-

puting Systems, pp.18-24, October-November, 2001.

[71] P. J. Shenoy and H. M. Vin, “Efficient striping techniques for multimedia file

servers,” Performance Evaluation, vol.38, pp.175-199, 1999.

[72] D. Bushmitch, R. Izmailov, S. Panwar, A. Pal, “Thinning, Striping and Shuffling:

Traffic Shaping and Transport Techniques for Variable Bit Rate Video,” in Proc.

IEEE GLOBECOM’02, Taipei, 2002.

[73] D. Bushmitch, “Thinning, striping and shuffling: Traffic shaping and transport

techniques for VBR video,” PhD Dissertation, Electrical and Computer Engineer-

ing Department, Polytechnic University, 2003.

[74] R. R. Stewart and Q. Xie, Stream Control Transmission Protocol: A Reference

Guide. Reading, MA: Addison-Wesley, 2001.

[75] H.-Y. Hsieh and R. Sivakumar, “A transport layer approach for achieving aggre-

gate bandwidths on multi-homed mobile hosts,” in Proc. ACM Inter. Conf. Mob.

Comp. Networking, pp.83-95, September 2002.

[76] W. E. Leland, M. S. Taqqu, W. Willinger, and D. V. Wilson, “On the self-

similar nature of Ethernet traffic,” IEEE/ACM Trans. Networking, vol.2, pp.1-15,

February 1994.

[77] B. Ryu and A. Elwalid, “The importance of LRD of VBR video traffic in ATM

traffic engineering: Myths and realities,” in Proc. ACM SIGCOMM, 1996.


[78] S. Mao, S. Lin, S. S. Panwar, and Y. Wang, “Reliable transmission of video

over ad hoc networks using automatic repeat request and multipath transport,”

in Proc. IEEE VTC Fall 2001, vol.2, pp.615-619, October 2001.

[79] Y. Wang and S. Lin, “Error resilient video coding using multiple description

motion compensation,” IEEE Trans. Circuits Syst. Video Technol., vol.12, no.6,

pp.438-452,September 2002.

[80] N. Gogate and S. S. Panwar, “On a resequencing model for high speed networks,”

in Proc. IEEE INFOCOM’94, vol.1, pp.40-47, April 1994.

[81] Y. Nebat and M. Sidi, “Resequencing considerations in parallel downloads,” in

Proc. IEEE INFOCOM’03, April 2003.

[82] J. Rosenberg, et al, SIP: Session Initiation Protocol, IETF RFC 3261, June 2002.

[83] C. Huitema, IPv6: The new Internet Protocol. Prentice Hall, 1998.

[84] S. Sajama and Z. J. Haas, “Independent-tree ad hoc multicast routing (ITA-

MAR),” in Proc. IEEE Fall VTC’01, October 2001.

[85] M. Montgomery and G. De Veciana, “On the relevance of time scales in perfor-

mance oriented traffic characterization,” in Proc. IEEE INFOCOM, 1996.

[86] S. Mao, D. Bushmitch, S. Narayanan, and S. S. Panwar, “MRTP: A Multi-Flow

Realtime Transport Protocol for Ad Hoc Networks,” in Proceedings of the IEEE

VTC Fall’03, October 2003.

[87] R. Krishnan and J. A. Silvester, “Choice of allocation granularity in multipath

source routing schemes,” in Proc. IEEE/ACM INFOCOM, pp.322-329, March,

1993.

[88] C. Cetinkaya and E. W. Knightly, “Opportunistic traffic scheduling over multiple

network paths,” to appear, IEEE/ACM INFOCOM, 2004.


[89] X. Liu, E. Chong, and N. Shroff, “Opportunistic transmission scheduling with

resource-sharing constraints in wireless networks,” IEEE JSAC, vol.19, no.10,

pp.2053-2065, 2002.

[90] A. C. Begen, Y. Altunbasak, O. Ergun, and M. H. Ammar, “Multi-path selection

for multiple description video streaming over overlay networks,” IEEE Transaction

on Image Processing, submitted 2003.

[91] The High-Performance Networks (HPN) program of DOE. [online]. Available:

http://www.sc.doe.gov/ascr/mics/hpn.

[92] L. Zhang, S. Deering, D. Estrin, S. Shenker, and D. Zappala, “RSVP: A new

resource reservation protocol,” IEEE Network, pp:8-18, September 1993.

[93] Z.-L. Zhang, End-to-End Support for Statistical Quality-of-Service Guarantees

in Multimedia Networks, Ph.D Dissertation, Department of Computer Science,

University of Massachusetts Amherst, February 1997.

[94] R. Braden, D. Clark, and S. Shenker, “Integrated services in the Internet archi-

tecture: An overview,” IETF RFC 1633, July 1994.

[95] S. Blake, et al, “An architecture for differentiated services,” IETF RFC 2475,

December 1998.

[96] A. G. Konheim, I. Meilijson, and A. Melkman, “Processor-sharing of two parallel

lines,” J. Appl. Prob., 18, pp:952-956, 1981.

[97] D. Nandita, J. Kuri, and H. S. Jamadagni, “Optimal call admission control in

generalized processor sharing (GPS) schedulers,” Proc. of IEEE INFOCOM 2001,

pp:468-477, 2001.

[98] Z. Zhang, D. Towsley, and J. Kurose, “Statistical analysis of generalized proces-

sor sharing scheduling discipline,” IEEE Journal on Selected Area in Communi-

cations, vol. 13, No. 6, pp. 1071-1080, August 1995.


[99] S. Borst, O. Boxma, and P. Jelenkovic, “Generalized Processor Sharing with

Long-Tailed traffic sources,” Proc. of ITC-16, pp.345-354, June, 1999.

[100] F. Lo Presti, Z. Zhang, and D. Towsley, “Bounds, approximations and applica-

tions for a two-queue GPS system,” Proc. of IEEE INFOCOM’96, March 1996.

[101] G. Lapiotis, S. Mao, and S. S. Panwar, “GPS Analysis of Multiple Sessions with

Applications to Admission Control,” Proc. IEEE ICC 2001, June 2001.

[102] D. Heyman and T.V. Lakshman, “Source models for VBR broadcast video

traffic,” IEEE/ACM Trans. Networking, vol. 5, No. 1, pp.40-48, February 1996.

[103] D. Heyman and T. V. Lakshman, “Statistical analysis and simulation study

of video teleconference traffic in ATM networks,” IEEE Trans. on Circuits and

Systems for Video Technology, vol. 2, No. 1, pp:49-59, March 1992.

[104] T. C. Yu and J. A. Silvester, “A four-state DMMPP for characterizing multi-

media traffic with short-term and long-term correlation,” Proc. of IEEE ICC’99,

vol. 2, pp.880-885, September 1999.

[105] K. Park, W. Willinger, Self-similar network traffic and performance evaluation,

John Wiley & Sons, Inc., 2000.

[106] A. Elwalid and D. Mitra, “Analysis, approximation and admission control of a

multi-service multiplexing system with priorities,” Proc. of IEEE INFOCOM’95,

pp.463-472, 1995.

[107] N. B. Shroff and M. Schwartz, “Improved Loss Calculations at an ATM Multi-

plexer,” IEEE/ACM Trans. Networking, vol. 6, No. 4, August 1998.

[108] A. Elwalid and D. Mitra, “Effective bandwidth of general Markovian traffic

sources and admission control of high speed networks,” IEEE/ACM Trans. Net-

working, vol. 1, No. 3, pp.329-343, June 1993.


[109] J. N. Daigle and J. D. Langford, “Models for analysis of packet voice communi-

cations systems,” IEEE Journal on Selected Areas in Communications, vol.SAC

4, No.6, pp:847-855, September 1986.

[110] Y. Bernet, J. Binder, S. Blake, M. Carlson, B. Carpenter, S. Kenshav, E. Davies,

B. Ohlman, D. Verma, Z. Wang, and W. Weiss, “A framework for differentiated

services,” Internet draft, 1999.

[111] G. L. Choudhury, D. M. Lucantoni, and W. Whitt, “Squeezing the most out

of ATM,” IEEE Trans. on Communications, vol.44, No.2, pp.203-217, February

1996.

[112] C. S. Chang, Performance guarantees in communication networks, Springer

Verlag, 2000.

[113] ITU-T Recommendation P.59, Artificial Conversational Speech, International

Telecommunications Union, March, 1993.

[114] K. Park, G. Kim, and M. Crovella, “On the relationship between file sizes, trans-

port protocols, and self-similar network traffic,” Proceedings of ICNP, 1996.

[115] T. E. Stern and A. I. Elwalid, “Analysis of separable Markov modulated rate

models for information handling systems,” Adv. Appl. Prob., pp.105-139, vol.23,

1991.

[116] S. Mao, S. S. Panwar, and G. Lapiotis, “The effective bandwidth of Markov

Modulated Fluid Process sources with a Generalized Processor Sharing server,”

Proc. IEEE Globecom 2001, vol. 4, pp.2341-2346, November 2001.

[117] M. F. Neuts, Matrix-Geometric solutions in stochastic models: an algorithmic

approach, Dover Publications, Inc., New York, 1994.

[118] G. Latouche and V. Ramaswami, Introduction to matrix analytic methods in

stochastic modeling, American Statistical Association and the Society for Indus-

trial and Applied Mathematics, Philadelphia, 1999.


[119] B. Sengupta, “Markov processes whose steady state distribution is matrix-

exponential with an application to the GI/PH/1 queue,” Adv. Appl. Prob., 21,

pp.159-180, 1989.

[120] V. Ramaswami, “Matrix analytic methods for stochastic fluid flows,” ITC-16

Proc., pp.1019-1030, 1999.

[121] D. Anick, D. Mitra, and M. M. Sondhi, ”Stochastic theory of a data-handling

system with multiple sources,” Bell Syst. Tech. J., vol.61, no.8, pp. 1871-1894,

October 1982.

[122] R. Morris and D.Lin, “Variance of aggregated web traffic,” Proc. of IEEE IN-

FOCOM 2000, vol. 1, pp.360-366, March 2000.

[123] M. F. Neuts, “The caudal characteristic curve of queues,” Adv. Appl. Prob., 18,

pp.221-254, 1986.

[124] S. Mao and S. S. Panwar, “Analysis of Generalized Processor Sharing systems

using Matrix Analytic Methods,” in the Proceedings of the 36th Annual Conference

on Information Sciences and Systems (CISS 2002), Princeton, March 2002.

[125] G. D. Lapiotis, “Stochastic analysis of joint buffer management and service

scheduling in high-speed network nodes,” PhD Dissertation, Polytechnic Univer-

sity, Brooklyn, NY.

[126] R. L. Cruz, “A calculus for network delay, Part I: Network elements in isola-

tion,” IEEE Trans. Inform. Theory, vol.37, pp:114-131, Jan. 1991.

[127] R. L. Cruz, “A calculus for network delay, Part II: Network analysis,” IEEE

Trans. Inform. Theory, vol. 37, pp:132-141, Jan. 1991.

[128] M. Reisslein, K. W. Ross, and S. Rajagopal, “A framework for Guaranteeing

statistical QoS,”, IEEE/ACM Transaction on Networking, vol.10, issue 1, pp.27-

42, February 2002.


[129] M. Reisslein, K. W. Ross, and S. Rajagopal, "Guaranteeing statistical QoS to regulated traffic: The multiple node case," in Proceedings of the 37th IEEE Conference on Decision and Control (CDC), Tampa, FL, December 1998.

[130] C.-S. Chang, "Stability, queue length, and delay of deterministic and stochastic queueing networks," IEEE Trans. Automat. Contr., vol. 39, pp. 913-931, May 1994.

[131] H. Sariowan, R. L. Cruz, and G. C. Polyzos, "Scheduling for quality of service guarantees via service curves," in Proceedings of ICCCN'95, pp. 512-520, Las Vegas, September 1995.

[132] E. W. Knightly, "H-BIND: a new approach to providing statistical performance guarantees to VBR traffic," in Proceedings of IEEE Infocom'96, 1996.

[133] D. E. Wrege, E. W. Knightly, H. Zhang, and J. Liebeherr, "Deterministic delay bounds for VBR video in packet-switching networks: Fundamental limits and practical trade-offs," IEEE/ACM Transactions on Networking, vol. 4, no. 3, pp. 352-362, June 1996.

[134] S. Rajagopal, M. Reisslein, and K. W. Ross, "Packet multiplexers with adversarial regulated traffic," in Proc. INFOCOM'98, pp. 347-355, 1998.

[135] D. Starobinski and M. Sidi, "Stochastically bounded burstiness for communication networks," IEEE Trans. Inform. Theory, vol. 46, no. 1, pp. 206-212, Jan. 2000.

[136] O. Yaron and M. Sidi, "Performance and stability of communication networks via robust exponential bounds," IEEE/ACM Trans. Networking, vol. 1, pp. 372-385, 1993.

[137] Z.-L. Zhang, D. Towsley, and J. Kurose, "Statistical analysis of the generalized processor sharing scheduling discipline," IEEE Journal on Selected Areas in Communications, vol. 13, no. 6, pp. 1071-1080, August 1995.

[138] X. Yu, L. Thng, and Y. Jiang, "Generalized processor sharing with long-range dependent traffic input," in Proceedings of the 9th International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems, pp. 224-232, 2001.

[139] J. Kurose, "On computing per-session performance bounds in high-speed multi-hop computer networks," in Proceedings of ACM SIGMETRICS'92, 1992.

[140] E. W. Knightly, "Second moment resource allocation in multi-service networks," in Proceedings of ACM SIGMETRICS'97, pp. 181-191, 1997.

[141] E. W. Knightly, "Enforceable quality of service guarantees for bursty traffic streams," in Proceedings of IEEE Infocom'98, 1998.

[142] T. Wu and E. W. Knightly, "Enforceable and efficient service provisioning," Computer Communications Journal: Special Issue on Multimedia Communications over the Internet, vol. 23, no. 14-15, pp. 1377-1388, August 2000.

[143] R. R. Boorstyn, A. Burchard, J. Liebeherr, and C. Oottamakorn, "Statistical service assurances for traffic scheduling algorithms," IEEE Journal on Selected Areas in Communications, vol. 18, no. 12, pp. 2651-2664, Dec. 2000.

[144] A. Papoulis and S. U. Pillai, Probability, random variables, and stochastic processes, Fourth Edition, McGraw-Hill, 2002.

[145] N. L. S. Fonseca, G. S. Mayor, and C. A. V. Neto, "On the equivalent bandwidth of self-similar sources," ACM Trans. Modeling and Computer Simulation, vol. 10, no. 2, pp. 104-124, April 2000.

[146] C. A. V. Melo and N. L. S. Fonseca, "An envelope process for multifractal traffic modeling," Technical Report IC-03-10, Instituto de Computacao, Universidade Estadual de Campinas, April 2003.

[147] R. L. Cruz, "Quality of service guarantees in virtual circuit switched networks," IEEE Journal on Selected Areas in Communications, vol. 13, no. 6, August 1995.

[148] R. Agrawal, R. L. Cruz, C. Okino, and R. Rajan, "Performance bounds for flow control protocols," IEEE/ACM Transactions on Networking, vol. 7, no. 3, June 1999.

[149] J.-Y. Qiu and E. W. Knightly, "Inter-class resource sharing using statistical service envelopes," Proceedings of INFOCOM'99, April 1999.

[150] J. Liebeherr, S. D. Patek, and A. Burchard, "A calculus for end-to-end statistical service guarantees," Technical Report CS-2001-19, University of Virginia, Department of Computer Science, August 2001.

[151] J.-Y. Le Boudec and P. Thiran, Network calculus: Theory of deterministic queueing systems for the Internet, Springer-Verlag, January 2002.

[152] C.-S. Chang and J. A. Thomas, "Effective bandwidth in high-speed digital networks," IEEE Journal on Selected Areas in Communications, vol. 13, no. 6, August 1995.

[153] R. M. Loynes, "The stability of a queue with non-independent inter-arrival and service times," Proc. Cambridge Philos. Soc., vol. 58, pp. 497-520, 1962.

[154] A. K. Parekh and R. G. Gallager, "A generalized processor sharing approach to flow control in integrated services networks: The single node case," IEEE/ACM Trans. Networking, vol. 1, no. 3, pp. 344-357, June 1993.

[155] A. K. Parekh and R. G. Gallager, "A generalized processor sharing approach to flow control in integrated services networks: The multiple node case," IEEE/ACM Trans. Networking, vol. 2, no. 2, pp. 137-150, April 1994.

[156] A. Elwalid, D. Mitra, and R. H. Wentworth, "A new approach for allocating buffers and bandwidth to heterogeneous regulated traffic in an ATM node," IEEE Journal on Selected Areas in Communications, vol. 13, no. 6, pp. 1115-1127, August 1995.

[157] F. Lo Presti, Z. Zhang, J. Kurose, and D. Towsley, "Source time scale and optimal buffer/bandwidth trade-off for heterogeneous regulated traffic in a network node," IEEE/ACM Transactions on Networking, vol. 7, no. 4, pp. 490-501, August 1999.

[158] L. Georgiadis, R. Guerin, V. Peris, and K. N. Sivarajan, "Efficient network QoS provisioning based on per node traffic shaping," IEEE/ACM Transactions on Networking, vol. 4, no. 4, pp. 482-501, August 1996.

[159] J. Choe and N. B. Shroff, "A central limit theorem based approach to analyze queue behavior in ATM networks," IEEE/ACM Transactions on Networking, vol. 6, no. 5, pp. 659-671, October 1998.

[160] M. Montgomery and G. de Veciana, "On the relevance of time scales in performance oriented traffic characterization," in Proc. IEEE INFOCOM, 1996.

[161] E. Knightly and N. Shroff, "Admission Control for Statistical QoS: Theory and Practice," IEEE Network, vol. 13, no. 2, pp. 20-29, March 1999.

[162] I. Norros, "A storage model with self-similar input," Queueing Systems, vol. 16, pp. 387-396, 1994.

[163] S. M. Ross, Stochastic Processes, 2nd Edition, John Wiley & Sons, 1996.

[164] J. Qiu and E. W. Knightly, "Measurement-based admission control with aggregate traffic envelopes," IEEE/ACM Trans. Networking, vol. 9, no. 2, April 2001.

[165] N. L. S. Fonseca, F. M. Pereira, and D. S. Arantes, "On the computation of end-to-end delay in a network of GPS servers with long range dependent traffic," in Proceedings of IEEE ICC'02, April 2002.

[166] I. Norros, "On the use of fractional Brownian motion in the theory of connectionless networks," IEEE Journal on Selected Areas in Communications, vol. 13, no. 6, August 1995.

[167] G. Mayor and J. Silvester, "Time scale analysis of an ATM queueing system with long-range dependent traffic," in Proceedings of IEEE Infocom'97, 1997.

[168] A. Elwalid and D. Mitra, "Design of Generalized Processor Sharing schedulers which statistically multiplex heterogeneous QoS classes," in Proceedings of INFOCOM'99, pp. 1220-1230, 1999.

[169] K. Kumaran, G. E. Margrave, D. Mitra, and K. R. Stanley, "Novel techniques for the design and control of generalized processor sharing schedulers for multiple QoS classes," in Proceedings of IEEE Infocom 2000, April 2000.

[170] J. Liebeherr, D. Wrege, and D. Ferrari, "Exact admission control for networks with bounded delay services," IEEE/ACM Transactions on Networking, vol. 4, no. 6, pp. 885-901, December 1996.

[171] M. Andrews, "Probabilistic end-to-end delay bounds for earliest deadline first scheduling," in Proceedings of INFOCOM'00, 2000.

List of Publications

[1] S. S. Panwar, S. Mao, J.-D. Ryoo, and Y. Li, TCP/IP Essentials: A Lab-Based Approach, Cambridge, UK: Cambridge University Press, to appear.

[2] S. Mao and S. S. Panwar, "On Generalized Processor Sharing Systems with Complete Sharing Virtual Partitioning Buffer Management Policy," under preparation.

[3] D. Bushmitch, S. Mao, and S. S. Panwar, "Queueing Performance of Real-Time Traffic under Thinning, Striping and Shuffling," under preparation.

[4] S. Mao, D. Bushmitch, S. Narayanan, and S. S. Panwar, "MRTP: A multi-flow realtime transport protocol," under preparation.

[5] Dennis Bushmitch, Shiwen Mao, and Shivendra S. Panwar, "The Multi-flow Realtime Transport Protocol," under preparation, IETF Internet Draft.

[6] S. Mao and S. S. Panwar, "Supporting realtime multimedia transport using multiple paths," under review.

[7] S. Mao and S. S. Panwar, "Quality of service provisioning and envelope processes: A survey," under review.

[8] S. Mao, S. Lin, S. S. Panwar, Y. Wang, and E. Celebi, "Video transport over ad hoc networks: Multistream coding with multipath transport," IEEE Journal on Selected Areas in Communications, Special Issue on Recent Advances in Wireless Multimedia, vol. 21, no. 10, pp. 1721-1737, December 2003.

[9] S. Mao, Y. Wang, S. Lin, and S. S. Panwar, "Video transport over ad-hoc networks with path diversity," ACM Mobile Computing and Communications Review, vol. 7, no. 1, pp. 59-61, January 2003.

[10] S. Mao, S. Lin, S. S. Panwar, and Y. Wang, "An ad hoc multipath video transport testbed," in Proceedings of IEEE Vehicular Technology Conference 2003-Fall, Orlando, Florida, October 2003.

[11] S. Mao, D. Bushmitch, S. Narayanan, and S. S. Panwar, "MRTP: A multi-flow realtime transport protocol for ad hoc networks," in Proceedings of IEEE Vehicular Technology Conference 2003-Fall, Orlando, Florida, October 2003.

[12] S. Mao, S. Lin, D. Bushmitch, S. Narayanan, S. S. Panwar, Y. Wang, and R. Izmailov, "Real time transport with path diversity," presented at the 2nd New York Metro Area Networking Workshop, Columbia University, New York, September 2002.

[13] Y. Wang, S. S. Panwar, S. Lin, and S. Mao, "Wireless video transport using path diversity: Multiple description vs. layered coding," invited paper in the Proceedings of the IEEE 2002 International Conference on Image Processing, vol. 1, pp. 21-24, Rochester, NY, September 2002.

[14] S. Lin, Y. Wang, S. Mao, S. S. Panwar, "Video transport over ad hoc networks using multiple paths," invited paper in the Proceedings of the IEEE International Symposium on Circuits and Systems, vol. 1, pp. 57-60, May 2002.

[15] S. Mao and S. S. Panwar, "Analysis of Generalized Processor Sharing systems using Matrix Analytic Methods," in the Proceedings of the 36th Annual Conference on Information Sciences and Systems, Princeton, March 2002.

[16] S. Mao, M. Karol, and S. S. Panwar, "Simulation study of a 640 Gbps multistage switch using OPNET," Technical Report, Avaya Labs-Research, September 2001.

[17] S. Mao, S. S. Panwar, and G. D. Lapiotis, "The effective bandwidth of Markov Modulated Fluid Process sources with a Generalized Processor Sharing server," Proceedings of IEEE GLOBECOM, vol. 4, pp. 2341-2346, November 2001.

[18] S. Mao, S. Lin, S. S. Panwar, and Y. Wang, "Reliable transmission of video over ad-hoc networks using automatic repeat request and multipath transport," Proceedings of the IEEE Vehicular Technology Conference 2001-Fall, vol. 2, pp. 615-619, Atlantic City, NJ, October 2001.

[19] S. Lin, S. Mao, Y. Wang, and S. S. Panwar, "A reference picture selection scheme for video transmission over ad-hoc networks using multiple paths," Proceedings of IEEE International Conference on Multimedia and Expo, August 2001.

[20] G. D. Lapiotis, S. Mao, and S. S. Panwar, "GPS analysis of multiple sessions with application to admission control," Proceedings of the IEEE International Conference on Communications, vol. 6, pp. 1829-1833, June 2001.

Acronyms

ACK: Acknowledgement

AODV: Ad Hoc On-demand Distance Vector routing

ARQ: Automatic Repeat Request

BL: Base Layer

CDMA: Code Division Multiple Access

CDN: Content Delivery Network

CSMA/CA: Carrier Sense Multiple Access/Collision Avoidance

CTS: Critical Time Scale

DCF: Distributed Coordination Function

DCT: Discrete Cosine Transform

DSR: Dynamic Source Routing

EL: Enhancement Layer

FEC: Forward Error Correction

FCFS: First Come First Serve

FH: Frequency Hopping

FIFO: First In First Out

GOB: Group of Blocks

GPS: Generalized Processor Sharing

IETF: Internet Engineering Task Force

LRD: Long Range Dependence

MAC: Medium Access Control

MCP: Motion Compensated Prediction

MD: Multiple Descriptions

MDC: Multiple Description Coding

MDMC: Multiple Description Motion Compensation

MDSR: Multipath Dynamic Source Routing Protocol

MMFP: Markov Modulated Fluid Process

MPT: Multipath Transport

MRE: Maximum Real Eigenvalue

MRTP: Multi-flow Realtime Transport Protocol

MRTCP: Multi-flow Realtime Transport Control Protocol

NACK: Negative Acknowledgement

NIST: National Institute of Standards and Technology

ODE: Ordinary Differential Equation

P2P: Peer-to-Peer Network

PSNR: Peak Signal to Noise Ratio

QBD: Quasi Birth Death Process

QCIF: Quarter Common Intermediate Format

QoS: Quality of Service

RPS: Reference Picture Selection

RR: Receiver Report

RSVP: Resource Reservation Protocol

RTCP: Realtime Transport Control Protocol

RTP: Realtime Transport Protocol

RTT: Round Trip Time

SCTP: Stream Control Transmission Protocol

SE: Storage Element

SIP: Session Initiation Protocol

SNR: Signal to Noise Ratio

SPT: Single Path Transport

SR: Sender Report

TCP: Transmission Control Protocol

TDMA: Time Division Multiple Access

UDP: User Datagram Protocol

VBR: Variable Bit-Rate

VLC: Variable Length Coding

VoIP: Voice over IP

VRC: Video Redundancy Coding

ZRP: Zone Routing Protocol