A Study of Applications for Optical Circuit-Switched Networks
![Page 1: A Study of Applications for Optical Circuit-Switched Networks](https://reader036.fdocuments.in/reader036/viewer/2022062409/56814ea5550346895dbc50bb/html5/thumbnails/1.jpg)
1
A Study of Applications for
Optical Circuit-Switched Networks
Xiuduan Fang
May 1, 2006
Supported by NSF ITR-0312376, NSF EIN-0335190,
and DOE DE-FG02-04ER25640 grants
![Page 2: A Study of Applications for Optical Circuit-Switched Networks](https://reader036.fdocuments.in/reader036/viewer/2022062409/56814ea5550346895dbc50bb/html5/thumbnails/2.jpg)
2
Outline
Introduction
CHEETAH Background
― CHEETAH concept and network
― CHEETAH end-host software
Analytical Models of GMPLS Networks
Application (App) I: Web Transfer App
App II: Parallel File Transfers
Summary and Conclusions
![Page 3: A Study of Applications for Optical Circuit-Switched Networks](https://reader036.fdocuments.in/reader036/viewer/2022062409/56814ea5550346895dbc50bb/html5/thumbnails/3.jpg)
3
Introduction
Many optical connection-oriented (CO) testbeds
― E.g., CANARIE's CA*net 4, UKLight, and CHEETAH
― Primarily designed for e-Science apps
Use Generalized Multiprotocol Label Switching (GMPLS)
― Immediate request, call blocking
Motivation: extend these GMPLS networks to millions of users
Problem Statement
― What apps are well served by GMPLS networks?
― Design apps to use GMPLS networks efficiently
![Page 4: A Study of Applications for Optical Circuit-Switched Networks](https://reader036.fdocuments.in/reader036/viewer/2022062409/56814ea5550346895dbc50bb/html5/thumbnails/4.jpg)
4
Circuit-switched High-speed End-to-End Transport ArcHitecture (CHEETAH)
Designed as an “add-on” service to the Internet and leverages the services of the Internet
[Figure: CHEETAH concept. Two end hosts, each with two NICs (NIC I and NIC II), are connected both through IP routers over the packet-switched Internet and through Ethernet-SONET gateways over the optical circuit-switched CHEETAH network.]
![Page 5: A Study of Applications for Optical Circuit-Switched Networks](https://reader036.fdocuments.in/reader036/viewer/2022062409/56814ea5550346895dbc50bb/html5/thumbnails/5.jpg)
5
CHEETAH Network
[Figure: CHEETAH network topology. Labels: hosts zelda1 through zelda5, wukong, mvstu6, and a CUNY host; Sycamore SN16000 switches; sites ORNL (TN), Atlanta (GA), NC, UVa, and CUNY; OC-192 lambda and 1G links; MCNC Catalyst 7600, UVa Catalyst 4948, NCSU M20, Centuar FastIron FESX448, WASH Abilene T640, NYC HOPI Force10, WASH HOPI Force10, and CUNY Foundry. Connection types in the legend: direct fibers, VLANs, MPLS tunnels.]
![Page 6: A Study of Applications for Optical Circuit-Switched Networks](https://reader036.fdocuments.in/reader036/viewer/2022062409/56814ea5550346895dbc50bb/html5/thumbnails/6.jpg)
6
CHEETAH End-host Software
[Figure: CHEETAH end-host software on two end hosts. On each host, the application uses the CHEETAH software modules (OCS client, routing decision, RSVP-TE client, C-TCP) alongside TCP/IP. Each host has two NICs (NIC 1 and NIC 2), and the hosts are connected through both the Internet and the CHEETAH network.]
OCS: Optical Connectivity Service
RD: routing decision
RSVP-TE: ReSerVation Protocol-Traffic Engineering
C-TCP: Circuit-TCP
![Page 8: A Study of Applications for Optical Circuit-Switched Networks](https://reader036.fdocuments.in/reader036/viewer/2022062409/56814ea5550346895dbc50bb/html5/thumbnails/8.jpg)
8
Analytical Models of GMPLS Networks
Problem: what apps are suitable for GMPLS networks?
― Measure of suitability: call-blocking probability, Pb; link utilization, U
― App properties: per-circuit BW; call-holding time, 1/μ
Assumptions:
― Call arrival rate, λ (Poisson process)
― Single link
― Single class: all apps are of the same type
A link of capacity C; m circuits; per-circuit BW = C/m
m is a measure of high-throughput vs. moderate-throughput
For high-throughput (e.g., e-Science apps), m is small
![Page 9: A Study of Applications for Optical Circuit-Switched Networks](https://reader036.fdocuments.in/reader036/viewer/2022062409/56814ea5550346895dbc50bb/html5/thumbnails/9.jpg)
9
BW sharing models
Two kinds of apps, depending on whether 1/μ is dependent on C/m
[Figure: two panels, each showing sources 1, …, N sharing link L of capacity C. In one panel 1/μ is independent of C/m; in the other 1/μ is dependent on C/m.]
File size distribution, with a shape parameter and a scale parameter k
The Erlang-B formula
Crossover file size
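For reference, the Erlang-B formula named on this slide is usually written as below. This is the textbook form for Poisson call arrivals at rate λ, mean call-holding time 1/μ, and m circuits. The symbol ρ for the offered load and the utilization expression (carried load divided by the number of circuits) are assumptions of this sketch, since the slide's own equations did not survive extraction.

```latex
\rho = \frac{\lambda}{\mu}, \qquad
P_b = \frac{\rho^{m}/m!}{\sum_{k=0}^{m} \rho^{k}/k!}, \qquad
U = \frac{\rho\,(1 - P_b)}{m}
```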
![Page 10: A Study of Applications for Optical Circuit-Switched Networks](https://reader036.fdocuments.in/reader036/viewer/2022062409/56814ea5550346895dbc50bb/html5/thumbnails/10.jpg)
10
Numerical Results: 1/μ is independent of C/m
Two equations, four variables
Fix U and m, compute Pb and the offered load λ/μ
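The "fix U and m, compute the rest" step can be sketched numerically. The sketch below assumes the two equations are the standard Erlang-B formula and U = ρ(1 − Pb)/m, where ρ is the offered load in Erlangs; the function names and the bisection solver are illustrative choices, not the author's code. The target U = 0.8 is also an assumption. With m = 10 it yields a blocking probability close to the 23.62% annotated in the numerical results.

```python
def erlang_b(m: int, rho: float) -> float:
    """Blocking probability for m circuits and offered load rho (Erlangs).

    Uses the numerically stable recursion B(k) = rho*B(k-1) / (k + rho*B(k-1)).
    """
    b = 1.0
    for k in range(1, m + 1):
        b = rho * b / (k + rho * b)
    return b


def utilization(m: int, rho: float) -> float:
    """Link utilization: carried load divided by the number of circuits."""
    return rho * (1.0 - erlang_b(m, rho)) / m


def solve_for_load(m: int, target_u: float, tol: float = 1e-9):
    """Find rho such that utilization(m, rho) equals target_u, by bisection.

    Relies on utilization being increasing in rho. Returns (rho, Pb).
    """
    if not 0.0 < target_u < 1.0:
        raise ValueError("target_u must be strictly between 0 and 1")
    lo, hi = 0.0, float(m)
    while utilization(m, hi) < target_u:  # grow the bracket until it contains the root
        hi *= 2.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if utilization(m, mid) < target_u:
            lo = mid
        else:
            hi = mid
    rho = (lo + hi) / 2.0
    return rho, erlang_b(m, rho)


if __name__ == "__main__":
    for m in (10, 100, 1000):
        rho, pb = solve_for_load(m, 0.8)
        print(f"m={m}: offered load={rho:.2f} Erlangs, Pb={pb:.4f}")
```

Running the loop for m = 10, 100, and 1000 shows the trade-off discussed in the conclusions: at the same utilization, blocking falls as m grows while the required offered load rises.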
![Page 11: A Study of Applications for Optical Circuit-Switched Networks](https://reader036.fdocuments.in/reader036/viewer/2022062409/56814ea5550346895dbc50bb/html5/thumbnails/11.jpg)
11
Numerical Results: 1/μ is independent of C/m
[Figure: numerical results plots; one annotated point reads m=10, Pb=23.62%]
Conclusions: to get high U
― Small m (~10): high Pb, thus book-ahead or call queuing
― Large m (~1000): high call arrival rate λ, thus large N
― Intermediate m (~100): large 1/μ is preferred
![Page 12: A Study of Applications for Optical Circuit-Switched Networks](https://reader036.fdocuments.in/reader036/viewer/2022062409/56814ea5550346895dbc50bb/html5/thumbnails/12.jpg)
12
Numerical Results: 1/μ is dependent on C/m, when shape = 1.1 and k = 1.25 MB
Conclusions: to get high U
― Small m (~10): high Pb, thus book-ahead or call queuing
― As m increases, N does not increase
― m=100, to get U>80%, Pb<5%: 6 MB < crossover file size < 29 MB, thus 0.5 s < 1/μ < 2.3 s
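The step from file sizes to call-holding times is a unit conversion at the per-circuit rate C/m. The sketch below assumes a link capacity of C = 10 Gb/s (roughly an OC-192 lambda) and 1 MB = 10^6 bytes; neither value is stated on the slide. They are chosen because, with m = 100, they give holding times that round to the 0.5 s and 2.3 s bounds quoted on this slide. The function name is hypothetical.

```python
def holding_time_s(file_size_bytes: float, link_capacity_bps: float, m: int) -> float:
    """Time to send a file over one of m equal circuits on a link of the given capacity."""
    per_circuit_bps = link_capacity_bps / m  # per-circuit BW = C/m
    return file_size_bytes * 8 / per_circuit_bps


if __name__ == "__main__":
    C = 10e9   # assumed link capacity in bits per second
    MB = 1e6   # assumed definition of a megabyte
    for size_mb in (6, 29):
        t = holding_time_s(size_mb * MB, C, m=100)
        print(f"{size_mb} MB at C/m = {C / 100 / 1e6:.0f} Mb/s takes {t:.2f} s")
```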
![Page 13: A Study of Applications for Optical Circuit-Switched Networks](https://reader036.fdocuments.in/reader036/viewer/2022062409/56814ea5550346895dbc50bb/html5/thumbnails/13.jpg)
13
Conclusions for Analysis
Ideal apps require BW on the order of one-hundredth the link capacity as per-circuit rate
Apps where 1/μ is independent of C/m
― long call-holding time is preferred
Apps where 1/μ is dependent on C/m
― need short call-holding time
![Page 15: A Study of Applications for Optical Circuit-Switched Networks](https://reader036.fdocuments.in/reader036/viewer/2022062409/56814ea5550346895dbc50bb/html5/thumbnails/15.jpg)
15
APP I: Web Transfer App on CHEETAH
Why web transfer?
― Web-based apps are ubiquitous
― Based on the previous analysis, m=100 is suitable for CHEETAH
Consists of a software package, WebFT
― Leverages CGI for deployment without modifying web client and web server software
― Integrated with CHEETAH end-host software APIs to allow use of the CHEETAH network in a mode transparent to users
![Page 16: A Study of Applications for Optical Circuit-Switched Networks](https://reader036.fdocuments.in/reader036/viewer/2022062409/56814ea5550346895dbc50bb/html5/thumbnails/16.jpg)
16
WebFT Architecture
[Figure: WebFT architecture. On the web server host, the web server (e.g., Apache) runs CGI scripts (download.cgi and redirection.cgi) and the WebFT sender, which uses the OCS, RD, RSVP-TE, and C-TCP APIs together with the OCS, RD, and RSVP-TE daemons of the CHEETAH end-host software. On the web client host, the web browser (e.g., Mozilla) runs alongside the WebFT receiver, which uses the RSVP-TE and C-TCP APIs and the RSVP-TE daemon. The URL, the response, and control messages travel via the Internet; data transfers travel via a circuit.]
![Page 17: A Study of Applications for Optical Circuit-Switched Networks](https://reader036.fdocuments.in/reader036/viewer/2022062409/56814ea5550346895dbc50bb/html5/thumbnails/17.jpg)
17
Experimental Testbed for WebFT
zelda3 and wukong: Dell machines running Linux FC3 with ext2/3 file systems and RAID-0 SCSI disks
RTT between them: 24.7 ms on the Internet path, and 8.6 ms for the CHEETAH circuit
Apache HTTP server 2.0 is loaded on zelda3
[Figure: experimental testbed. zelda3 and wukong each have NIC I and NIC II and are connected through IP routers over the Internet and through Sycamore SN16000 switches (Atlanta, GA and MCNC, NC) over the CHEETAH network. Site labels: NCSU; Atlanta, GA; MCNC, NC.]
![Page 18: A Study of Applications for Optical Circuit-Switched Networks](https://reader036.fdocuments.in/reader036/viewer/2022062409/56814ea5550346895dbc50bb/html5/thumbnails/18.jpg)
18
Experimental Results for WebFT
The web page to test WebFT
Test parameters:
― Test.rm: 1.6 GB, circuit rate: 1 Gbps
Test results:
― throughput: 680 Mbps, delay: 19 s
![Page 20: A Study of Applications for Optical Circuit-Switched Networks](https://reader036.fdocuments.in/reader036/viewer/2022062409/56814ea5550346895dbc50bb/html5/thumbnails/20.jpg)
20
APP II: Parallel File Transfers on CHEETAH
Motivation: e-Science projects need to share large volumes of data (TB or PB)
Goal: achieve multi-Gb/s throughput
Two factors limit throughput
― TCP's congestion-control algorithm
― End-host limitations
Solutions to relieve end-host limitations
― Single-host solution
― Cluster solution, which has two variations
General case: non-split source file
Special case: split source file
![Page 21: A Study of Applications for Optical Circuit-Switched Networks](https://reader036.fdocuments.in/reader036/viewer/2022062409/56814ea5550346895dbc50bb/html5/thumbnails/21.jpg)
21
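General-Case Cluster Solution
[Figure: general-case cluster solution. The original source splits the file across Host 1, …, Host i, …, Host n; each Host i transfers its part to Host i'; Host 1', …, Host n' deliver their parts to the original sink, where the file is assembled.]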
![Page 22: A Study of Applications for Optical Circuit-Switched Networks](https://reader036.fdocuments.in/reader036/viewer/2022062409/56814ea5550346895dbc50bb/html5/thumbnails/22.jpg)
22
Software Tools: GridFTP and PVFS2
GridFTP: a data-transfer protocol on the Grid
― Extends FTP by adding features for partial file transfer, multi-streaming, and striping
― We mainly use the GridFTP striped transfer feature
PVFS: Parallel Virtual File System
― An open source implementation of a parallel file system
― Stripes a file across multiple I/O servers like RAID0
― A second version: PVFS2
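RAID0-like striping can be illustrated with a short sketch. It shows only the round-robin idea (with 1-based numbering, server 1 holds blocks 1, n+1, …) and is not PVFS2's implementation; the function names and the in-memory lists standing in for I/O servers are assumptions made for illustration.

```python
def split_round_robin(data: bytes, num_servers: int, stripe_size: int) -> list[list[bytes]]:
    """Distribute fixed-size blocks of data over servers in round-robin order."""
    servers: list[list[bytes]] = [[] for _ in range(num_servers)]
    for block_index, offset in enumerate(range(0, len(data), stripe_size)):
        servers[block_index % num_servers].append(data[offset:offset + stripe_size])
    return servers


def assemble(servers: list[list[bytes]]) -> bytes:
    """Rebuild the original data by reading one block per server, row by row."""
    out = bytearray()
    rows = max((len(blocks) for blocks in servers), default=0)
    for row in range(rows):
        for blocks in servers:
            if row < len(blocks):  # the last row may be only partly filled
                out += blocks[row]
    return bytes(out)


if __name__ == "__main__":
    payload = bytes(range(256)) * 3 + b"tail"
    striped = split_round_robin(payload, num_servers=4, stripe_size=64)
    print([len(blocks) for blocks in striped])
    print(assemble(striped) == payload)
```

The same split and assemble roles appear in the general-case cluster solution, where the split parts are moved by parallel transfers before being reassembled at the sink.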
![Page 23: A Study of Applications for Optical Circuit-Switched Networks](https://reader036.fdocuments.in/reader036/viewer/2022062409/56814ea5550346895dbc50bb/html5/thumbnails/23.jpg)
23
GridFTP striped transfer
[Figure: GridFTP striped transfer. globus-url-copy sends SPAS to the GridFTP server on the receiving front end, which returns a list of host-port pairs. It then sends SPOR <host-port pairs> to the GridFTP server on the sending front end and receives the response to SPOR. Behind each front end, data nodes (R1, …, Rn on the receiving side and S1, …, Sn on the sending side) hold blocks 1, n+1, … of the file in a parallel file system.]
Sending data nodes initiate data connections to receiving nodes
![Page 24: A Study of Applications for Optical Circuit-Switched Networks](https://reader036.fdocuments.in/reader036/viewer/2022062409/56814ea5550346895dbc50bb/html5/thumbnails/24.jpg)
24
General-Case Cluster Solution: Design

| Steps | Approach | Pros. | Cons. |
|---|---|---|---|
| Splitting & Assembling | GridFTP partial file transfer | | Wastes disk space, performance overhead |
| Splitting & Assembling | Socket program | Avoids wasting disk space | Performance overhead |
| Splitting & Assembling | pvfs2-cp | Avoids wasting disk space | |
| Transferring | GridFTP partial file transfer | | Many independent transfers incurring much overhead to set up and release connections |
| Transferring | GridFTP striped transfer | A single file transfer | |
![Page 25: A Study of Applications for Optical Circuit-Switched Networks](https://reader036.fdocuments.in/reader036/viewer/2022062409/56814ea5550346895dbc50bb/html5/thumbnails/25.jpg)
25
General-Case Cluster Solution: Implementation
To get a high throughput, we need to make data nodes responsible for data blocks in their local disks
― Make PVFS2 and GridFTP have the same stripe pattern
[Figure: sending data nodes S1, …, Sn and receiving data nodes R1, …, Rn, each group on its own PVFS2, with each node holding blocks 1, n+1, ….]
Problems:
― PVFS2 1.0.1 does not provide a utility to inspect data distribution
― Data connections between sending and receiving nodes are random
![Page 26: A Study of Applications for Optical Circuit-Switched Networks](https://reader036.fdocuments.in/reader036/viewer/2022062409/56814ea5550346895dbc50bb/html5/thumbnails/26.jpg)
26
Random data connections
[Figure: sending data nodes S1, …, Sn and receiving data nodes R1, …, Rn, each group on its own PVFS2 and holding blocks 1, n+1, …, with data connections between sending and receiving nodes.]
![Page 28: A Study of Applications for Optical Circuit-Switched Networks](https://reader036.fdocuments.in/reader036/viewer/2022062409/56814ea5550346895dbc50bb/html5/thumbnails/28.jpg)
28
Implementation - Modifications to PVFS2
Goal: know a priori how a file is striped in PVFS2
Use the strace command to trace system calls made by pvfs2-cp
― pvfs2-fs-dump gives the (non-deterministic) I/O server order of file distribution
― pvfs2-cp ignores the -s option for configuring stripe size
Modify PVFS2 code
― For load balance, PVFS2 stripes files starting with a random server: jitter = (rand() % num_io_servers);
― Set jitter = -1 to get a fixed order of data distribution
― Change the default stripe size (original: 64 KBytes)
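The effect of a random starting server can be illustrated with a small sketch. It is not PVFS2 code: `server_order` and its `fixed_start` parameter are hypothetical names, and the wrap-around round-robin order is an assumption about how the starting index is used. The point is only that a random start makes the block-to-server mapping unpredictable, while a fixed start makes it known in advance.

```python
import random


def server_order(num_io_servers: int, fixed_start=None, rng=random) -> list[int]:
    """Return the order in which I/O servers receive successive stripes.

    With fixed_start=None the first server is chosen at random (load balancing);
    with an integer fixed_start the order is deterministic.
    """
    if fixed_start is None:
        start = rng.randrange(num_io_servers)
    else:
        start = fixed_start % num_io_servers
    return [(start + i) % num_io_servers for i in range(num_io_servers)]


if __name__ == "__main__":
    print("random start:", server_order(5))
    print("fixed start: ", server_order(5, fixed_start=0))
```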
![Page 29: A Study of Applications for Optical Circuit-Switched Networks](https://reader036.fdocuments.in/reader036/viewer/2022062409/56814ea5550346895dbc50bb/html5/thumbnails/29.jpg)
29
Implementation - Modifications to GridFTP
Goal: use a deterministic matching sequence between sending and receiving data nodes
Method: modify the implementation of SPAS and SPOR commands
― SPAS: sort the list of host-port pairs based on the IP-address order for receiving data nodes
― SPOR: request sending data nodes to initiate data connections sequentially to receiving data nodes
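The deterministic matching described above can be sketched as follows. This is an illustration of the idea, not the modified GridFTP code: the function names, the example addresses, and the assumption that the i-th sending node connects to the i-th receiving pair are all hypothetical. Addresses are compared numerically with Python's `ipaddress` module, since plain string sorting would place 10.0.0.10 before 10.0.0.6.

```python
import ipaddress


def sort_host_port_pairs(pairs):
    """Sort (ip, port) pairs by numeric IP address, then by port.

    All addresses must be of the same IP version to be comparable.
    """
    return sorted(pairs, key=lambda hp: (ipaddress.ip_address(hp[0]), hp[1]))


def match_nodes(senders, receiver_pairs):
    """Pair the i-th sending node with the i-th receiving pair in sorted order."""
    if len(senders) != len(receiver_pairs):
        raise ValueError("need the same number of sending and receiving nodes")
    return list(zip(senders, sort_host_port_pairs(receiver_pairs)))


if __name__ == "__main__":
    receivers = [("10.0.0.10", 50002), ("10.0.0.9", 50002), ("10.0.0.6", 50002)]
    for sender, receiver in match_nodes(["S1", "S2", "S3"], receivers):
        print(sender, "->", receiver)
```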
![Page 30: A Study of Applications for Optical Circuit-Switched Networks](https://reader036.fdocuments.in/reader036/viewer/2022062409/56814ea5550346895dbc50bb/html5/thumbnails/30.jpg)
30
Experimental Results
Conducted on a 22-node cluster, sunfire
Reduced network-and-disk contention
Performance of PVFS2 implementation was poor
![Page 31: A Study of Applications for Optical Circuit-Switched Networks](https://reader036.fdocuments.in/reader036/viewer/2022062409/56814ea5550346895dbc50bb/html5/thumbnails/31.jpg)
31
Summary and Conclusions
Analytical Models of GMPLS Networks
― Ideal apps require BW on the order of one-hundredth the link capacity as per-circuit rate
Application I: Web Transfer Application
― Provided deterministic data services to CHEETAH clients on dedicated end-to-end circuits
― No modifications to the web client and web server software by leveraging CGI
Application II: Parallel File Transfers
― Implemented a general-case cluster solution by using PVFS2 and GridFTP striped transfer
― Modified PVFS2 and GridFTP code to reduce network-and-disk contention
![Page 32: A Study of Applications for Optical Circuit-Switched Networks](https://reader036.fdocuments.in/reader036/viewer/2022062409/56814ea5550346895dbc50bb/html5/thumbnails/32.jpg)
32
Publication Lists
M. Veeraraghavan, X. Fang, and X. Zheng, "On the suitability of applications for GMPLS networks," submitted to IEEE Globecom 2006
X. Fang, X. Zheng, and M. Veeraraghavan, "Improving web performance through new networking technologies," IEEE ICIW'06, February 23-25, 2006, Guadeloupe, French Caribbean
![Page 33: A Study of Applications for Optical Circuit-Switched Networks](https://reader036.fdocuments.in/reader036/viewer/2022062409/56814ea5550346895dbc50bb/html5/thumbnails/33.jpg)
33
Future Work
Analytical Models of GMPLS Networks
― Multi-class
― Multiple links and network models
Application I: Web Transfer Application
― Design a Web partial CO transfer to enable non-CHEETAH hosts to use CHEETAH
― Connect multiple CO networks to further reduce RTT
Application II: Parallel File Transfers
― Test the general-case cluster solution on CHEETAH
― Work on PVFS2 or try GPFS to get a high I/O throughput
![Page 34: A Study of Applications for Optical Circuit-Switched Networks](https://reader036.fdocuments.in/reader036/viewer/2022062409/56814ea5550346895dbc50bb/html5/thumbnails/34.jpg)
34
A Classification of Networks that Reflects Sharing Modes
![Page 35: A Study of Applications for Optical Circuit-Switched Networks](https://reader036.fdocuments.in/reader036/viewer/2022062409/56814ea5550346895dbc50bb/html5/thumbnails/35.jpg)
35
The flow chart for the WebFT sender
[Figure: flow chart. Can the client be reached via the CHEETAH network (OCS)? If no, return failure. If yes, request a CHEETAH circuit (Routing Decision). If the answer is no, return failure. If yes, set up a circuit (RSVP-TE client). If setup fails, return failure. If it succeeds, send the file via C-TCP, release the circuit (RSVP-TE client), and return success.]
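The sender's control flow can be sketched in a few lines. This is an illustration of the decision sequence only: the five callables are hypothetical stand-ins for the OCS, routing decision, RSVP-TE, and C-TCP modules, not the real CHEETAH APIs. Releasing the circuit in a `finally` clause is a design choice of this sketch, so that a circuit is not left up if the transfer raises an error.

```python
from typing import Callable


def webft_send(
    reachable_via_cheetah: Callable[[], bool],  # stand-in for the OCS query
    request_circuit: Callable[[], bool],        # stand-in for the routing decision
    setup_circuit: Callable[[], bool],          # stand-in for RSVP-TE circuit setup
    send_file: Callable[[], None],              # stand-in for the C-TCP transfer
    release_circuit: Callable[[], None],        # stand-in for RSVP-TE circuit release
) -> bool:
    """Return True on success and False if any check or the setup fails."""
    if not reachable_via_cheetah():
        return False
    if not request_circuit():
        return False
    if not setup_circuit():
        return False
    try:
        send_file()
    finally:
        release_circuit()  # always release a circuit that was set up
    return True


if __name__ == "__main__":
    log = []
    ok = webft_send(
        lambda: True,
        lambda: True,
        lambda: True,
        lambda: log.append("send"),
        lambda: log.append("release"),
    )
    print(ok, log)
```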
![Page 36: A Study of Applications for Optical Circuit-Switched Networks](https://reader036.fdocuments.in/reader036/viewer/2022062409/56814ea5550346895dbc50bb/html5/thumbnails/36.jpg)
36
The WebFT Receiver
Integrates with the CHEETAH end-host software modules, similar to the WebFT sender
Runs as a daemon in the background on the client host to avoid manual intervention
Also provides the WebFT sender a desired circuit rate
![Page 37: A Study of Applications for Optical Circuit-Switched Networks](https://reader036.fdocuments.in/reader036/viewer/2022062409/56814ea5550346895dbc50bb/html5/thumbnails/37.jpg)
37
Experimental Results for WebFT
![Page 38: A Study of Applications for Optical Circuit-Switched Networks](https://reader036.fdocuments.in/reader036/viewer/2022062409/56814ea5550346895dbc50bb/html5/thumbnails/38.jpg)
38
PVFS2 Architecture
![Page 39: A Study of Applications for Optical Circuit-Switched Networks](https://reader036.fdocuments.in/reader036/viewer/2022062409/56814ea5550346895dbc50bb/html5/thumbnails/39.jpg)
39
Experimental Configuration
Configuration of PVFS2 I/O servers
― The 1st PVFS2: sunfire1 through sunfire5
― The 2nd PVFS2: sunfire10, and sunfire6 through sunfire9
Configuration of GridFTP servers
― Sending front end: sunfire1 with data nodes sunfire1 through sunfire5
― Receiving front end: sunfire10 with data nodes sunfire10, sunfire6 through sunfire9
GridFTP striped transfer
globus-url-copy -vb -dbg -stripe ftp://sunfire1:50001/pvfs2/test_1G ftp://sunfire10:50002/pvfs2/test_1G1 2>dbg1.txt
![Page 40: A Study of Applications for Optical Circuit-Switched Networks](https://reader036.fdocuments.in/reader036/viewer/2022062409/56814ea5550346895dbc50bb/html5/thumbnails/40.jpg)
40
Four Conditions to Avoid Unnecessary Network-and-disk Contention
Know a priori how data are striped in PVFS2
PVFS2 I/O servers and GridFTP servers run on the same hosts
GridFTP stripes data across data nodes in the same sequence as PVFS2 does across PVFS2 I/O servers
GridFTP and PVFS2 have the same stripe size
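A small sketch can show why matching order and stripe size matter. It counts how many bytes a GridFTP data node would have to fetch from some other host's disk, under a simplified model in which both PVFS2 and GridFTP assign blocks to hosts in plain round-robin order. The function names and the model are assumptions for illustration, not either tool's actual layout code.

```python
from math import gcd


def node_for_offset(offset: int, stripe_size: int, order: list) -> str:
    """Host responsible for the byte at offset under round-robin striping."""
    return order[(offset // stripe_size) % len(order)]


def misplaced_bytes(file_size, pvfs2_stripe, pvfs2_order, gridftp_stripe, gridftp_order) -> int:
    """Bytes whose storing host (PVFS2) differs from the sending host (GridFTP)."""
    step = gcd(pvfs2_stripe, gridftp_stripe)  # both mappings are constant within a step
    misplaced = 0
    for offset in range(0, file_size, step):
        chunk = min(step, file_size - offset)
        stored_on = node_for_offset(offset, pvfs2_stripe, pvfs2_order)
        sent_by = node_for_offset(offset, gridftp_stripe, gridftp_order)
        if stored_on != sent_by:
            misplaced += chunk
    return misplaced


if __name__ == "__main__":
    hosts = ["sunfire1", "sunfire2", "sunfire3", "sunfire4", "sunfire5"]
    size = 1 << 20
    print("same order, same stripe:", misplaced_bytes(size, 65536, hosts, 65536, hosts))
    print("different order:        ", misplaced_bytes(size, 65536, hosts, 65536, hosts[::-1]))
    print("different stripe size:  ", misplaced_bytes(size, 65536, hosts, 131072, hosts))
```

Only the first case, where all four conditions hold, leaves every block on the host that sends it.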
![Page 42: A Study of Applications for Optical Circuit-Switched Networks](https://reader036.fdocuments.in/reader036/viewer/2022062409/56814ea5550346895dbc50bb/html5/thumbnails/42.jpg)
42
The Specific Cluster Solution for TSI
[Figure: the specific cluster solution for TSI. Labels: orbitty at NCSU; zelda at ORNL; X1E at ORNL; CHEETAH LAN; Dell 5424 and Dell 5224 switches; controller-0 (rudi) and controller-1 (orbitty); hosts zelda1 through zelda5; compute-0-0 through compute-0-19; disk-0-0 through disk-4-0; monitoring host.]
![Page 43: A Study of Applications for Optical Circuit-Switched Networks](https://reader036.fdocuments.in/reader036/viewer/2022062409/56814ea5550346895dbc50bb/html5/thumbnails/43.jpg)
43
Numerical Results: 1/μ is dependent on C/m
Conclusions:
― Large m (~1000): does not increase N