Post on 13-Dec-2015
TeraPaths: A QoS Enabled Collaborative Data Sharing Infrastructure for Petascale
Computing Research
The TeraPaths Project Team
CHEP 06
The TeraPaths Project Team
Scott Bradley, BNL
Frank Burstein, BNL
Les Cottrell, SLAC
Bruce Gibbard, BNL
Dimitrios Katramatos, BNL
Yee-Ting Li, SLAC
Shawn McKee, U. Michigan
Razvan Popescu, BNL
David Stampf, BNL
Dantong Yu, BNL
Outline
Introduction
The TeraPaths project
The TeraPaths system architecture
Experimental deployment and testing
Future work
Introduction
The problem: support efficient/reliable/predictable petascale data movement in modern high-speed networks
  Multiple data flows with varying priority
  Default "best effort" network behavior can cause performance and service disruption problems
Solution: enhance network functionality with QoS features to allow prioritization and protection of data flows
e.g. ATLAS Data Distribution
[Diagram: the tiered ATLAS data distribution hierarchy — from the ATLAS experiment and online system at CERN (Tier 0+1) to the Tier 1 site at BNL, on to Tier 2 sites, Tier 3 sites (e.g. UMich, muon calibration), and Tier 4 workstations; link rates range from ~PBps at the detector and ~GBps out of the online system, through ~10-40 Gbps, ~2.5-10 Gbps, and ~10 Gbps wide-area links, down to 100-1000 Mbps at workstations.]
The QoS Arsenal
IntServ
  RSVP: end-to-end, individual flow-based QoS
DiffServ
  Per-packet QoS marking
  IP precedence (6+2 classes of service)
  DSCP (64 classes of service)
MPLS/GMPLS
  Uses RSVP-TE
  QoS compatible
  Virtual tunnels, constraint-based routing, policy-based routing
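DiffServ marking can be done by an application itself through the standard socket API: the DSCP code point occupies the upper six bits of the IP TOS byte. A minimal sketch (standard Linux socket behavior and RFC-defined code points, not TeraPaths code):

```python
import socket

# DSCP occupies the upper 6 bits of the IP TOS byte, so the value
# passed to IP_TOS is the code point shifted left by 2.
DSCP_EF = 46    # Expedited Forwarding (RFC 3246)
DSCP_AF41 = 34  # Assured Forwarding, class 4, low drop precedence

def mark_socket(sock: socket.socket, dscp: int) -> None:
    """Set the DSCP code point carried by all packets sent on this socket."""
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp << 2)

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
mark_socket(sock, DSCP_EF)  # outgoing packets now carry TOS byte 0xB8
```

Downstream routers can then classify and police the flow purely on the DSCP bits, without per-flow state (in contrast to IntServ/RSVP).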
Prioritized vs. Best Effort Traffic
[Chart: network QoS with three classes — Best Effort, Class 4, and EF (Express Forwarding); utilized bandwidth (Mbit/second, 0-1200) over time (0-1000 seconds), with curves for each class, the total, and the wire speed.]
The TeraPaths Project
The TeraPaths project investigates the integration and use of LAN QoS and MPLS/GMPLS-based differentiated network services in the ATLAS data-intensive distributed computing environment in order to manage the network as a critical resource
DOE: The collaboration includes BNL and the University of Michigan, as well as OSCARS (ESnet), Lambda Station (FNAL), and DWMI (SLAC)
NSF: BNL participates in UltraLight to provide the network advances required in enabling petabyte-scale analysis of globally distributed data
NSF: BNL participates in a new network initiative: PLaNetS (Physics Lambda Network System), led by CalTech
BNL Site Infrastructure
[Diagram: the TeraPaths resource manager at BNL receives bandwidth requests and releases from data transfer management (GridFTP and dCache/SRM storage elements), consults grid AAA and network usage policy, identifies traffic by addresses, port numbers, and DSCP bits, applies LAN QoS within the campus network (via the M10 ingress/egress router), sends MPLS requests to OSCARS for ESnet paths, forwards remote LAN QoS requests to remote TeraPaths instances, and is supported by monitoring.]
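Traffic identification by addresses, port numbers, and DSCP bits amounts to matching packets against a table of admitted flows. A hypothetical sketch — the flow table, addresses, and names are invented for illustration, not taken from TeraPaths:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class FlowSpec:
    src: str                  # source IP address
    dst: str                  # destination IP address
    dst_port: Optional[int]   # None matches any destination port
    dscp: int                 # DSCP class the flow was admitted into

# Illustrative table of flows the resource manager has admitted
ADMITTED = [
    FlowSpec("192.168.1.10", "192.168.2.20", 2811, 46),  # GridFTP -> EF
    FlowSpec("192.168.1.11", "192.168.2.21", None, 34),  # SRM transfers -> AF41
]

def classify(src: str, dst: str, dst_port: int) -> int:
    """Return the DSCP class for a packet, or 0 (best effort) if unmatched."""
    for f in ADMITTED:
        if f.src == src and f.dst == dst and f.dst_port in (None, dst_port):
            return f.dscp
    return 0

classify("192.168.1.10", "192.168.2.20", 2811)  # matches the GridFTP flow
classify("10.0.0.1", "10.0.0.2", 80)            # unmatched: best effort
```

Classification like this is done once at the ingress switch; downstream devices can then trust the DSCP bits, as the bandwidth partitioning slide describes.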
Envisioned Overall Architecture
[Diagram: TeraPaths instances at sites A, B, C, and D, interconnected through WAN 1, WAN 2, and WAN 3; arrows show service invocation between TeraPaths instances, data flows between sites, and peering between WANs.]
Automate MPLS/LAN QoS Setup
QoS reservation and network configuration system for data flows
Access to QoS reservations:
  Manually, through interactive web interface
  From a program, through APIs
Compatible with a variety of networking components
Cooperation with WAN providers and remote LAN sites
Access control and accounting
System monitoring
Design goal: enable the reservation of end-to-end network resources to assure a specified "Quality of Service"
  User requests minimum bandwidth, start time, and duration
  System either grants the request or makes a "counter offer"
  Network is set up end-to-end with one user request
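The grant/counter-offer behavior above can be sketched as an admission check against remaining path capacity. Everything here is illustrative — the class names, the 1 Gbps capacity, and the counter-offer policy are assumptions, not the actual TeraPaths API:

```python
from dataclasses import dataclass
from typing import Optional

LINK_CAPACITY_MBPS = 1000  # illustrative capacity of the managed path

@dataclass
class Reservation:
    bandwidth_mbps: int
    start: int      # start time, seconds (illustrative units)
    duration: int   # seconds

class Scheduler:
    """Grants a request outright or returns a reduced 'counter offer'."""
    def __init__(self) -> None:
        self.accepted = []

    def _used(self, start: int, duration: int) -> int:
        # Sum bandwidth of accepted reservations overlapping the window
        return sum(r.bandwidth_mbps for r in self.accepted
                   if r.start < start + duration and start < r.start + r.duration)

    def request(self, req: Reservation) -> Optional[Reservation]:
        free = LINK_CAPACITY_MBPS - self._used(req.start, req.duration)
        if req.bandwidth_mbps <= free:
            self.accepted.append(req)  # grant the request as asked
            return req
        if free > 0:                   # counter offer: same window, less bandwidth
            return Reservation(free, req.start, req.duration)
        return None                    # nothing available in that window

s = Scheduler()
granted = s.request(Reservation(800, start=0, duration=600))  # granted in full
counter = s.request(Reservation(500, start=0, duration=600))  # only 200 Mbps left
```

A real system would also negotiate alternative start times; this sketch only reduces bandwidth, which is enough to show why a single user request can drive the whole end-to-end setup.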
TeraPaths System Architecture
[Diagram: Site A (initiator) and Site B (remote) each run a web services layer — user manager, scheduler, site monitor, router manager, etc. — on top of hardware drivers; QoS requests reach Site A via web page, APIs, or command line, and Site A invokes Site B's web services and the WAN provider's web services (with WAN monitoring) across the WAN.]
TeraPaths Web Services
TeraPaths modules implemented as "web services"
  Each network device (router/switch) is accessible/programmable from at least one management node
  Site management node maintains reservation etc. databases and distributes network programming by invoking web services on subordinate management nodes
  Remote requests to/from other sites invoke corresponding web services (destination site's TeraPaths or WAN provider's)
Web services benefits:
  Standardized, reliable, and robust environment
  Implemented in Java and completely portable
  Accessible via web clients and/or APIs
  Compatible and easily portable into Grid services and the Web Services Resource Framework (WSRF in GT4)
TeraPaths Web Services Structure
[Diagram of TeraPaths web services modules: AAA Module (AAA), Remote Negotiation Module (RNM), Advance Reservation Module (ARM), Remote Request Module (RRM), Network Programming Module (NPM) with per-device Hardware Programming Modules (HPM), and Network Configuration Module (NCM) comprising a DiffServ Module (DSM), an MPLS Module (MSM), and a Route Planning Module (RPM, future capability); access is via the web interface and APIs, with remote invocations to other TeraPaths instances.]
Site Bandwidth Partitioning Scheme
Minimum Best Effort traffic
Dynamic bandwidth allocation:
  Shared dynamic class(es), dynamic microflow policing
  Mark packets within a class using DSCP bits, police at ingress, trust DSCP bits downstream
Dedicated static classes: aggregate flow policing
Shared static classes: aggregate and microflow policing
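Policing of this kind is commonly implemented as a token bucket: a flow may burst up to the bucket size, but its sustained rate is capped by the refill rate. A minimal sketch of the mechanism (not actual switch/router configuration):

```python
class TokenBucket:
    """Token-bucket policer: conforming packets pass, excess is dropped."""

    def __init__(self, rate_bps: float, burst_bytes: float):
        self.rate = rate_bps / 8.0   # refill rate in bytes per second
        self.burst = burst_bytes     # maximum bucket (burst) size in bytes
        self.tokens = burst_bytes    # bucket starts full
        self.last = 0.0              # timestamp of the last refill

    def allow(self, size_bytes: int, now: float) -> bool:
        # Refill proportionally to elapsed time, capped at the burst size
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= size_bytes:
            self.tokens -= size_bytes
            return True   # packet conforms to the class's rate
        return False      # out of profile: drop (or remark to best effort)

tb = TokenBucket(rate_bps=8_000, burst_bytes=1_500)  # 1 kB/s, one MTU of burst
tb.allow(1_500, now=0.0)   # passes: burst absorbed
tb.allow(1_500, now=0.0)   # dropped: bucket empty
tb.allow(1_000, now=1.0)   # passes: one second refilled 1 000 bytes
```

Aggregate policing applies one bucket per class; microflow policing applies a bucket per individual flow within the class, matching the shared-class options above.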
Route Planning with MPLS (future capability)
[Diagram: two TeraPaths instances, each with site monitoring, connected across the WAN; WAN monitoring and WAN web services inform the selection among alternative MPLS paths.]
Experimental Setup
Full-featured LAN QoS simulation testbed using a private network environment:
  Two Cisco switches (same models as production hardware) interconnected with 1 Gb link
  Two managing nodes, one per switch
  Four host nodes, two per switch
  All nodes have dual 1 Gb Ethernet ports, also connected to BNL campus network
  Managing nodes run web services and database servers, and have exclusive access to switches
Demo of prototype TeraPaths functionality given at SC'05
Acquired Experience
Enabled, tested, and verified LAN QoS inside BNL campus network
Tested and verified MPLS paths between BNL and LBL, SLAC (Network Monitoring Project), and FNAL, as well as an MPLS/QoS path between BNL and UM for SC'05
Integrated LAN QoS with MPLS paths reserved with OSCARS
Installed DWMI network monitoring tools
Determined effectiveness of OSCARS in guaranteeing and policing bandwidth reservations on production ESnet paths and its effect on improving jitter for applications requiring predictable delays
  http://www-iepm.slac.stanford.edu/dwmi/oscars/
Examined impact of prioritized traffic on overall network performance and the effectiveness and efficiency of MPLS/LAN QoS
Simulated (Testbed) and Actual Traffic
[Plots: BNL to UMich — two bbcp disk-to-disk transfers with iperf background traffic through an ESnet MPLS tunnel; testbed demo — competing iperf streams.]
In Progress / Future Work
Develop and deploy remote negotiation/response, etc. services to fully automate end-to-end QoS establishment across multiple network domains
Dynamically configure and partition QoS-enabled paths to meet time-constrained network requirements
Develop site-level network resource manager for multiple VOs vying for limited WAN resources
Support dynamic bandwidth/routing adjustments based on resource usage policies and network monitoring data (provided by DWMI)
Integrate with software from other network projects: OSCARS, Lambda Station, and DWMI
Further goal: widen deployment of QoS capabilities to Tier 1 and Tier 2 sites and create services to be honored/adopted by CERN ATLAS/LHC Tier 0