ESnet Update
Transcript of ESnet Update
Supporting Advanced Scientific Computing Research • Basic Energy Sciences • Biological and Environmental Research • Fusion Energy
Sciences • High Energy Physics • Nuclear Physics
Steve Cotter, Chin Guok, Joe Metzger, Bill Johnston
ESnet Update
• Network Update • OSCARS • perfSONAR • Federated Trust
Agenda
LBL
SLAC JLAB
PPPL
Ames
ANL
StarLight
MAN LAN (32 A of A)
PNNL
BNL
ORNL
FNL LLNL
LANL
GA
Yucca
Bechtel-NV
IARC
INL
NSTEC
Pantex
SNLA
DOE-ALB Allied Signal
KCP
SRS
NREL
DOE
NETL NNSA
ARM ORAU
OSTI
NOAA
ESnet4 – Jan 2009
IP router
Lab
Optical node
SDN router Lab Link
MAN NLR 10G 20G SDN
SDN IP
Peering Link
SINet (Japan) Russia (BINP)
CERN/LHCOPN (USLHCnet:
DOE+CERN funded) GÉANT - France, Germany, Italy, UK, etc
CA*net4 France GLORIAD (Russia, China) Korea (Kreonet2
MREN StarTap Taiwan (TANet2, ASCGNet)
AMPATH CLARA
(S. America)
CUDI (S. America)
Japan (SINet) Australia (AARNet) Canada (CA*net4 Taiwan (TANet2) Singaren Transpac2 CUDI
KAREN/REANNZ ODN Japan Telecom America NLR-Packetnet Internet2 Korea (Kreonet2)
KAREN / REANNZ Transpac2 Internet2 Korea (kreonet2) SINGAREN Japan (SINet) ODN Japan Telecom America
CA*net4
GÉANT in Vienna (via USLHCNet circuit)
ESnet Confidential
2008 Hub & Wave Install Timeline
JUL AUG SEP OCT NOV DEC
CHIC STAR
NASH ATLA
DENV
PNWG
WASH
NEWY AOFA
CLEV BOST
HOUS KANS
ALBU ELPA
LOSA SDSC
DENV BOIS
Created by Mike O’Connor Mod by JimG
1st set of Juniper MX’s arrived at
LBNL in Mid June
2nd set of Juniper MX’s arrived at
LBNL in Mid Sept
MX480 IP MX960 SDN MX960 SDN
MX960 IP
MX960 SDN
MX960
MX960
MX960
MX960
MX960
MX480
MX480&MX960
MX960 x2
MX480 MX960
MX960 x2
MX960 x2
MX480
MX480
MX960
MX960
2008 HUB Installs 6 MX480’s
19 MX960’s SITE Installs 1 M120 PPPL
1 M10i LASV-HUB
PPPL M120
LASV M10i
NEW HUB INSTALL
EXISTING HUB UPGRADE
1 OC12 LANV-SUNN 1 10GE Internet2 STAR-CHICHIC
14 10GE Internet2 wave installed/split & accepted
19 10GE Internet2 waves installed/split & accepted
6 10GE Internet2 waves installed, split & accepted
1 10G Framenet XC in WASH
2 10GE Internet2 waves installed, split & accepted
1 10GE NLR AOFA-WASH 1 ORNL-NASH 10G IP
1 10GE MAN-LAN #2 1 10GE NRL Temp WASH-STAR 1 10GE CIC-OMNIPop at STAR
Current Hub Count: • 21 Completed: 32 AofA, NEWY*, WASH, ATLA, NASH, CLEV*, BOST*,
CHIC, STAR, KANS*, HOUS*, ELPA*, DENV, ALBU, BOIS*, PNWG, SUNN, SNV(Qwest), LOSA*, SDSC, LASV(SwitchNap)* *9 New Hubs since July 2008
Current Backbone Wave Count: • Internet2 / Level3 Waves:
– IP Waves: 17 new/split for a total of 25 – SDN Waves: 25 new/split for a total of 30
• NLR Waves: – 1 new wave for a total of 5 – 1 temp wave (STAR-WASH) for used during NLR northern path upgrade
Hub & Wave Count
MAN Upgrades Timeline
ESnet Confidential
JAN FEB MAR
MX960
SNV
Created by Mike O’Connor Mod by JimG
MX960
LBNL
MX480
SNLL MX480
LLNL MX480
SLAC
MX960
NERSC
MX960
FNAL MX480
FNAL
MX960
LBNL
MX480
JGI MX960
BNL
MX960
ANL
MX480
BNL
1 10GE FRPG (upgrade from 1GE) DENV
LBL-MR2, SNV-MR1, SNLL-MR2,
LLNL-MR2 & SLAC-MR2 (Completed on or before Jan 27th)
FNAL’s MX Shipping Feb 3rd
Final BAMAN 6509 replacements mid Feb. BNL & ANL install TBD
1 10GE LIMAN#3 AofA-BNL IP up Feb 2nd 1 DF circuit between AofA-NEWY up on Feb 2nd 1 LIMAN#4 NEWY-BNL (Waiting on Lightower to complete early Feb.)
Link speed Description Count 10 GE National Core Waves (inter-hub) 61
10 GE Metropolitan Area Network Circuits (SF Bay Area MAN, Chicago MAN, Long Island MAN)
33
10 GE Circuits to ESnet sites 24
10 GE Circuits to R&E peering points 24
Total 10G WAN circuits 125
10 GE Intra-hub connections (interconnecting ESnet equipment at the network hubs)
78
5 GE GÉANT peering in Vienna, Austria (via USLHCNet and GÉANT circuits)
1
OC-192 SONET special 1
OC-48 SONET ORNL backup circuit
1 GE Mostly small site and commercial peering connections
83
Misc. slower Mostly non-SC sites 64
Active ESnet Links as of 12/2008
ESnet Confidential
Future Installs
• Replace site 6509s (FNAL, ANL & BNL) with MXs – FNAL (MX960 & MX480) shipped on Feb 3rd for site to install – BNL (MX960 & MX480) shipping & Install TBD – ANL (MX960) shipping & Install TBD
• Replace BAMAN 6509s with MXs – LBNL-MR3 (MX960), SNV-MR2 (MX960), LLNL-MR2 (MX480) & SNLL-MR2 (MX480)
completed prior to Jan 22nd – SLAC-MR2 (MX480) Completed on Jan 27th – NERSC-MR2 & JGI-MR2 installs scheduled for Mid Feb.
• Future Circuits installs – New 10 G LIMAN wave & DF AOFA-NEWY End-2-end on Feb 2nd & #4 wave to BNL (Feb) – OC-12 between LASV hub and General Atomic (Feb) – 10 GE between BOST hub to MIT (Feb) – OC-12 between DENV hub and Pantex (TBD) – 1 GE wave in BOIS to INL via IRON (TBD) – 10 GE SDN wave between PNWG hub to PNNL (TBD) – 10 GE SDN wave between NASH hub to ORNL (TBD)
ESnet4 Metro Area Rings
Newport News - Elite
JLab ELITE
ODU
MATP
Wash., DC
Atlanta MAN
180 Peachtree
56 Marietta (SOX)
ORNL (backup)
Wash., DC
Houston
Nashville
San Francisco Bay Area MAN
LBNL
SLAC
JGI
LLNL
SNLL
NERSC
SU
NN
USLHCNet
FNAL
600 W. Chicago
Starlight
ANL
West Chicago MAN Long Island MAN
BNL
111- 8th (NEWY)
32 AoA, NYC
USLHCNet
• LI MAN expansion, BNL diverse entry • FNAL and BNL dual ESnet connection • Upgraded Bay Area MAN switches
To Atlanta
To C
hica
go
Clev.
Wash. DC
MAN LAN (A of A)
To Seattle
111 8th, NYC
BNL
Boston / MIT Tier1 Redundancy: the Northeast
32 A of A, NYC
Tier1 Redundancy: Long Island
To Washington
111 8th, NYC
32 A of A, NYC
BNL
To Boston
IP core node
SDN core node
Notes: 1) There are physically independent paths from R1 to Boston and from R2 to Washington 2) Only fiber paths are shown, wave counts are not indicated 3) The engineering and procurement for this configuration are complete, implementation is underway 4) An architecturally similarly situation is also being implemented for FNAL / Chicago
ESnet IP core ESnet Science Data Network core (N X 10G) ESnet SDN core, NLR links (backup paths) Other R&E supplied link LHC related link MAN link International IP Connections
USLHCNet
Long Island
To CERN
To Chicago
R1
R2
12 Month Circuit Availability 1/2009
0
1000
2000
3000
4000
5000
6000
7000
8000
Out
age
Min
utes
Dec
Nov
Oct
Sep
Aug
Jul
Jun
May
Apr
Mar
Feb
Jan
0
500
1000
1500
2000
LIG
O
100.
000
MS
RI
100.
000
NG
A 10
0.00
0%
NS
TEC
10
0.00
0%
OR
NL
100.
000
GA
99.9
99
JGI
99.9
99
LBL
99.9
99
PN
NL
99.9
99
SLA
C
99.9
99
SN
LL
99.9
99
NE
RS
C
99.9
98
BN
L 99
.996
B
JC
99.9
95
FNA
L 99
.995
LL
NL
99.9
95
DO
E-
NN
SA
SR
S
99.9
94
Y12
99
.994
P
PP
L 99
.992
Yu
cca
99.9
92
MIT
99
.990
A
NL
99.9
89
IAR
C
99.9
89
DO
E-A
LB
99.9
78
LAN
L 99
.978
S
NLA
99
.978
P
ante
x 99
.977
LA
NL-
DC
99
.976
LL
NL-
DC
99
.976
K
CP
99.9
73
JLab
99
.964
N
RE
L 99
.961
IN
L 99
.892
D
OE
-GTN
99
.885
N
OA
A 99
.855
O
STI
99
.852
B
echt
el
99.8
08
Am
es-L
ab
99.7
67
Lam
ont
99.7
18
OR
AU
99
.660
Out
age
Min
utes
Site Availability 2/2006 to 1/2007 Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec Jan
0
500
1000
1500
2000
AN
L 10
0.00
FN
AL
100.
00
Lam
ont
100.
00
LIG
O
100.
00
OR
NL
100.
00
PN
NL
100.
00
SN
LL
100.
00
DO
E-A
LB
99.9
99
JGI
99.9
99
LBL
99.9
99
MS
RI
99.9
99
NG
A 99
.999
S
LAC
99
.999
Yu
cca
99.9
99
NN
SA
99.9
98
SN
LA
99.9
98
LAN
L 99
.997
B
NL
99.9
94
MIT
99
.993
S
RS
99
.993
A
mes
-Lab
99
.991
P
PP
L 99
.991
IN
L 99
.989
N
ER
SC
99
.989
N
STE
C
99.9
89
NR
EL
99.9
85
LLN
L 99
.984
K
CP
99.9
81
IAR
C
99.9
73
GA
99.9
72
LAN
L-D
C
99.9
70
LLN
L-D
C
99.9
70
Pan
tex
99.9
55
DO
E-G
TN
99.9
20
JLab
99
.913
B
echt
el
99.8
96
Y12
99
.865
B
JC
99.8
64
NO
AA
99.8
64
OR
AU
99
.854
O
STI
99
.852
Out
age
Min
utes
Site Availability 2/2008 to 1/2009 Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec Jan
5 9s 41%
4 9s 22%
3 9s 37%
Improved Site Availability
5 9s 39%
4 9s 19%
3 9s 32%
2 9s 10%
5 9s 41%
4 9s 22%
3 9s 37%
ESnet Accepted Traffic (Tby/mo)
0
1000
2000
3000
4000
5000
6000
7000
Jan,
00
Jan,
01
Jan,
02
Jan,
03
Jan,
04
Jan,
05
Jan,
06
Jan,
07
Jan,
08
Top 1000 Flows
Historical ESnet Traffic Patterns
Log Plot of ESnet Monthly Accepted Traffic, January 1990 – December 2008
Tera
byte
s / m
onth
Oct 1993 1 TBy/mo.
Aug 1990 100 MBy/mo.
Jul 1998 10 TBy/mo.
38 months
57 months
40 months
Nov 2001 100 TBy/mo.
Apr 2006 1 PBy/mo.
53 months
Traffic Increases 10X Every 47 Months
July 2010 10 PBy/mo.
Network Traffic, Science Data, and Network Capacity
Historical Projection
All
Four
Dat
a S
erie
s ar
e N
orm
aliz
ed to
“1” a
t Jan
. 199
0
(HEP data courtesy of Harvey Newman, Caltech, and Richard Mount, SLAC. Climate data courtesy Dean Williams, LLNL, and the Earth Systems Grid Development Team.)
4 Pb/y 2010
40 Pb/y 2010
ESnet4 – 2010
StarLight
MAN LAN (32 A of A)
PNNL
FNL
ORNL
LLNL
GA
BNL
LANL
IP router
Lab
Optical node
SDN router Lab Link
MAN NLR 10G
30/40/50G SDN IP
50
50
30
30
50
40
40
40
40
40
40
• ESnet4 planning assumes technology advances will provide 100 Gb/s optical waves (they are 10 Gb/s now)
• The ESnet4 SDN switching/routing platform (Juniper MX960) is designed to support new 100 Gb/s network interfaces
• With capacity planning based on the ESnet 2010 wave count, we can probably assume some fraction of the core network capacity by 2012 will require 100 Gb/s interfaces
• ESnet is involved in a collaboration with Internet2, Juniper Networks (core routers), Infinera (DWDM), and Level3 (network support) to accelerate its deployment and help drive down the cost of 100G components
Beyond 2010: 100 G
• Advances in security at ESnet over the last 6 months: – Implemented Two-factor authentication for ESnet network
engineers requesting privileged access to the network management plane. Reviewed and re-defined access to network management plane.
– Upgraded Bro Intrusion Detection System • ESnet Security Peer Review – Feb 11-12
– Fed/R&E/Commercial experts reviewing ESnet security practices and procedures
• Disaster recovery improvements – Deployed Government Emergency Telecommunications Service
(GETS) numbers to key personnel – Deploying full replication of the NOC databases and servers and
Science Services databases in the NYC Qwest carrier hub
ESnet Security & Disaster Recovery
Website Redesign
• Goals – Better organization of information, easier
navigation, searchable (not everything in pdfs) but don’t want it to all be ‘push’
– Collaborative tool – upload best practices, video from conference, community calendar, staff pages
– Integration of business processes into site • “My ESnet” portal for site coordinators /
users • Exploring Google Earth or similar
network visualization – IP / SDN / MAN representation – perfSONAR performance data – OSCARS virtual circuit status
– Looking for ideas/input/suggestions.
• Network Update • OSCARS • perfSONAR • Federated Trust
Agenda
Multi-Domain Virtual Circuit Service OSCARS
The OSCARS service requirements: • Guaranteed bandwidth with resiliency
– User specified bandwidth - requested and managed in a Web Services framework
– Explicit backup paths can be requested • Traffic isolation
– Allows for high-performance, non-standard transport mechanisms that cannot co-exist with commodity TCP-based transport
• Traffic engineering (for ESnet operations) – Enables the engineering of explicit paths to meet specific requirements
• e.g. bypass congested links; using higher bandwidth, lower latency paths; etc. • Secure connections
– The circuits are “secure” to the edges of the network (the site boundary) because they are managed by the control plane of the network which is highly secure and isolated from general traffic
• End-to-end, cross-domain connections between Labs and collaborating institutions
OSCARS Current (v0.5) Implementation
User User App
AAAS Authentication Authorization Auditing Subsystem
Notification Broker API Resv API
WBUI Web Based User Interface
OSCARS Core - Reservation Management - Path Computation - Scheduling - Inter-Domain Communications
PSS Path Setup Subsystem - Network Element Interface
NS NotificationSubsystem
HTTPS HTTPS (SOAP) RMI SSHv2
ESnet Public WebServer (Proxy)
WS Interface
IDC InterDomain Controller
Source
Sink
SDN
SDN
IP
IP
IP
IP Link
ESnet WAN
SDN
Notification Call-back Event API
• Well defined inter-module interfaces • Exchange of static topology information • PCE integrated into OSCARS Core
ESnet IDC (OSCARS)
OSCARS Future Implementation
User User App
AAAS Authentication Authorization Auditing Subsystem
Notification Broker API Resv API
WBUI Web Based User Interface
OSCARS Core - Reservation Management - Scheduling - Inter-Domain Communications
PSS Path Setup Subsystem - Network Element Interface
NS NotificationSubsystem
HTTPS HTTPS (SOAP) RMI SSHv2
ESnet Public WebServer (Proxy)
WS Interface
IDC InterDomain Controller
PCE Path Computation Engine
ESnet IDC (OSCARS)
Notification Call-back Event API
Source
Sink
SDN
SDN
IP
IP
IP
IP Link
ESnet WAN
SDN
• Exchange of dynamic topology information • includes time dimension
• PCE separated from OSCARS Core • PCEs can be daisy changed • allows PCE to be pluggable • facilitates a research framework for collaboration
• Modifications needed by FNAL and BNL – Changed the reservation workflow, added a notification callback system, and added some
parameters to the OSCARS API to improve interoperability with automated provisioning agents such as LambdaStation, Terapaths and Phoebus.
• Operational VC support – As of 12/2/08, there were 16 long-term production VCs instantiated, all of which support HEP
• 4 VCs terminate at BNL • 2 VCs support LHC T0-T1 (primary and backup) • 12 VCs terminate at FNAL • 2 VCs support LHC T0-T1 (primary and backup) • For BNL and FNAL LHC T0-T1 VCs, except for the ESnet PE router at BNL (bnl-mr1.es.net) and
FNAL (fnal-mr1-es.net), there are no other common nodes (router), ports (interfaces), or links between the primary and backup VC.
• Short-term dynamic VCs – Between 1/1/08 and 12/2/08, there were roughly 2650 successful HEP centric VCs reservations
• 1950 reservations initiated by BNL using Terapaths • 1700 reservations initiated by FNAL using LambdaStation
Production OSCARS
OSCARS is a Production Service
Site VLANS ESnet PE
ESnet Core
USLHCnet (LHC OPN)
VLAN
USLHCnet VLANS
USLHCnet VLANS
USLHCnet VLANS
USLHCnet VLANS
Tier2 LHC VLANS
T2 LHC VLAN
Tier2 LHC VLANS
OSCARS setup all VLANs
OSCARS generated and managed virtual circuits at FNAL – one of the US LHC Tier 1 data centers. This circuit map (minus the yellow callouts that explain the diagram) is automatically generated by an OSCARS tool and assists the connected sites with keeping track of what circuits exist and where they terminate.
Spectrum Now Monitors OSCARS Circuits
• Network Update • OSCARS • perfSONAR • Federated Trust
Agenda
perfSONAR Services
• End-to-end monitoring service: providing useful, comprehensive, and meaningful information on the state of end-to-end paths. Supports regularly scheduled tests & archiving of results, acting as an intermediate layer, between the performance measurement tools and the diagnostic or visualization applications.
• Tools in the perfSONAR software suite: – SNMP Measurement Archive – Lookup Service – Topology Service – Circuit Status Measurement Archive – Status Measurement Archive – perfSONAR-BUOY – PingER Services
• Visualization – Allow ESnet user community to better understand our network & its capabilities. – Allow ESnet users to understand how their use impacts the backbone.
• Alarming – Automated analysis of regularly scheduled measurements to raise alerts.
ESnet Deployment Activities
• Currently deploying the hardware across the network to support adhoc measurements for debugging – OWAMP Servers – BWCTL Servers – Topology Service – Utilization Service
• perfSONAR Buoy Deployment – Between ESnet systems – To Internet2 & GEANT – To/From ESnet Sites
• Hardens the infrastructure – Continuous monitoring of servers & services – Centralized management of OS & Services configuration – Performance tuning & verifying everything is working as designed
perfSONAR R&D Activities
• Scaling & robustness enhancements • Visualization Tools
– Single Domain Tools • Utilization Browser • Topology Browser • Latency & Bandwidth Browser
– Advanced Tools • Looking across multiple domains • Looking at correlations between different types of measurements • Application or user community specific views
• Alarming • Integrating OSCARS circuits
– Topology – Utilization – Active measurements across them
• Network Update • OSCARS • perfSONAR • Federated Trust Services
Agenda
Federated Trust Services
• DOEGrids Certification Authority – New Logo and ID Mark – Operations – DOEGrids Audit progress – Cloning and Geographical Dispersion
• OpenID and Shibboleth • Authorization Services Profile Document
• Vista – IE browser support in development – Also beginning testing IE 8 browser
• ESnet 2-factor – Support ESnet 2-factor authentication token project – Add ESnet RA to list of official RAs in DOEGrids CA
• Recent problems – Dec 2008 – CA not reading own Cert Revocation Lists – CA automatically certifying customers from a peer, highly trusted
CA (CERN CA) These problems have been corrected
• All certifications since June 2007 were audited • No fraudulent certifications were discovered • By agreement with registration authorities, affected subscribers will
undergo direct reverification at next renewal (RA’s are free to require this at any time)
• (See auditing slide)
DOEGrids CA - Operations
DOEGrids CA (one of several CAs) Usage Statistics
User Certificates 9259 Total No. of Revoked Certificates 2056
Host & Service Certificates 21043 Total No. of Expired Certificates 19452
Total No. of Requests 35629 Total No. of Certificates Issued 30331
Total No. of Active Certificates 8823
ESnet SSL Server CA Certificates 50
FusionGRID CA certificates 113 * Report as of Jan 29, 2009
DOEGrids CA (Active Certificates) Usage Statistics
* Report as of Jan 29, 2009
Active DOEGrids CA Breakdown
** OSG Includes (BNL, CDF, CIGI,CMS, CompBioGrid, DES, DOSAR, DZero, Engage, Fermilab, fMRI, GADU, geant4, GLOW, GPN, GRASE, GridEx, GUGrid, i2u2, ILC, JLAB, LIGO, mariachi, MIS, nanoHUB, NWICG,
NYSGrid, OSG, OSGEDU, SBGrid, SDSS, SLAC, STAR & USATLAS)
• The Certification Practices Statement (CPS) is being “translated” to the RFC 3647 format – Audit finding – requirement – Appropriate format for interoperation
• Next step will be to correct all documentation errors identified in the audit
• Scheduling an audit of configurations, modules, and operational scripts (see Dec 2008 problems) – Feb/Mar 2009
DOEGrids CA - Audits
• DOEGrids CA and its key management hardware will be cloned and dispersed around the US – Improve Continuity of Operations and disaster
recovery issues (ESnet requirements) – Improve availability to customers – Provision for future, robust services – Current status: Testing and configuration of netHSM
hardware, and project planning
DOEGrids CA Cloning & Geographical Dispersion
• Continue efforts to promote this technology in DOE Laboratory community – won’t you join us?
• OpenID: Summer project testing OpenID provider (mostly) with simple server – Objective: Use DOEGrids CA as source of identity – Objective: Test simple application (custom, and later simplified
wiki) – See http://www.doegrids.org/OpenID/ – Roadmap for phase 2: Robust version of summer project, with
more SSO and addition of other OpenID consumers as opportunities appear
• Shibboleth: Similar roadmap as for OpenID • Many security issues to consider • WAYF/Discovery a problem for both services – perhaps
this is an opportunity for a 3rd service, CardSpace
OpenID and Shibboleth
• Shibboleth – SAML 2.0 – InCommon Federation
• OpenID – OP and demo Consumer
Identity and Federation Technology
Graphics from SWITCH