Internet2 R&D Update
Eric Boyd, Deputy Technology Officer • 28 April 2009 • Internet2 Spring Member Meeting
Overview
• Strategic Planning • Network Cyberinfrastructure • Information Services • DCN • perfSONAR
• Support for Network Research • Next Generation Networking
Strategic Plan Implementation
• Strategic Plan (Summer 2008) • Strategic Plan Task White Papers (April 2009) • Each written independently • Discuss current effort, environment, current tensions, future opportunities, metrics for success
• Strategic Plan Prioritization (Next Step)
R&D Strategic Plan Focus
• Cyberinfrastructure (all councils) • Environment for Research (RAC) • Middleware (AMSAC) • Security (AMSAC) • Architecture and Operations (AOAC)
• We welcome and encourage community suggestions and revisions
Overview
• Strategic Planning • Network Cyberinfrastructure • Information Services • DCN • perfSONAR
• Support for Network Research • Next Generation Networking
Cyberinfrastructure and the Internet2 Community
• Operating advanced services by and for the community • e.g. Networks, Observatories, Federations
• Experimenting with developmental services • e.g. Dynamic Circuits, Distributed Monitoring, Hybrid Networking
• Adopting new technologies • e.g. Workshops, Targeted Communities
• Partnering with like-minded organizations
Integrated Systems Approach
• What does “Integrated” mean? • Interoperable • Widely Deployed • Community Best Practices • Extensible
• Observation: Building distributed systems that operate as a larger distributed system
Distributed System Design Goals
• Take existing scientific applications, without recompilation or awareness of circuits, e.g. • Bulk File Transfer • Real Time • Video
• Exploit performance possibilities of new networking technologies
• Preserve “current politics of business” (donʼt upset the apple cart)
• Improve efficiency of problem diagnosis (eliminate reliance on “old boy network”)
Distributed System Requirements
• These distributed systems share common requirements: • Heterogeneous network architecture • Multiple administrative entities; no central authority • Local customization of operational environment • Applications driven by orthogonal virtual organizations
• Suggests parallel design approach • Toolkit approach • Web services / defined APIs
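The toolkit-plus-defined-API approach suggested above can be sketched in code. This is an illustrative pattern only (the class and capability names are invented, not an actual Internet2 API): each administrative domain deploys the same toolkit interface but customizes what it offers locally, so no central authority is needed.

```python
# Illustrative sketch of the "toolkit approach" with defined APIs:
# every domain runs the same interface, customized locally.
from dataclasses import dataclass

@dataclass
class Capability:
    name: str          # e.g. "owamp", "bwctl", "circuit-setup"
    endpoint: str      # where this domain exposes the capability

class DomainToolkit:
    """Common API that each administrative domain implements."""
    def __init__(self, domain, capabilities):
        self.domain = domain
        self.capabilities = {c.name: c for c in capabilities}

    def discover(self):
        """Defined API call: list what this domain offers."""
        return sorted(self.capabilities)

    def lookup(self, name):
        """Defined API call: resolve a capability to an endpoint."""
        cap = self.capabilities.get(name)
        return cap.endpoint if cap else None

# Two independently administered domains running the same toolkit,
# each with its own local customization:
net_a = DomainToolkit("net-a.example.edu",
                      [Capability("owamp", "https://ps.net-a.example.edu/owamp")])
net_b = DomainToolkit("net-b.example.edu",
                      [Capability("circuit-setup", "https://idc.net-b.example.edu")])

# A client can walk heterogeneous domains through one interface:
for net in (net_a, net_b):
    print(net.domain, net.discover())
```

In a real deployment the calls would go over web services rather than local objects, but the design point is the same: parallel systems built from a shared toolkit with defined APIs.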
Distributed Systems for Networks
• To build next generation networks, we need distributed software systems on top of the network hardware • Session-Application (session-layer tools [e.g. Phoebus], community-specific abstraction applications [e.g. Lambda Station, Terapaths], true applications)
• Dynamic Circuit Networks (DCN, e.g. Internet2 DCN, ESnet SDN, GÉANT2 Autobahn)
• Performance Measurement Framework (e.g. perfSONAR) • Information Services (IS)
• Discovery • Topology
• Authentication, Authorization, and Accounting (AAA, e.g. Shibboleth, etc.)
Multi-Layer Distributed System
• Design is “parallel” for each system • Hierarchical dependency relationship • Suggests abstracting common components, publication/polling architecture across boundaries
Multi-Layer Distributed System
[Diagram: Session-Layer Abstraction, Control Plane Framework, Performance Monitoring, and Information Services / Federated Trust stacked across Layer 1, Layer 2, and Layer 3, spanning hardware and software/servers, supporting scientific and collaborative applications and diagnostic analysis and visualization tools]
• Design is “parallel” for each system • Hierarchical dependency relationship • Suggests abstracting common components, publication/polling architecture across boundaries • Creates a common network abstraction toolkit to present to applications
Overview
• Strategic Planning • Network Cyberinfrastructure • Information Services • DCN • perfSONAR
• Support for Network Research • Next Generation Networking
Unified Network Information Services
• Network cyberinfrastructure services must be discoverable • In the DCN world – “Who can I talk to?” • In the perfSONAR world – “What network measurements can I find or make?” • Both DCN and perfSONAR use common technology for this functionality
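The two discovery questions above can be served by one shared registry. The sketch below is purely illustrative (the record layout and endpoints are invented, not the actual IS-WG schema): a single query interface answers both the DCN question and the perfSONAR question.

```python
# Hypothetical shared registry: one lookup interface serves both DCN and
# perfSONAR discovery. Records and endpoints are illustrative only.
REGISTRY = [
    {"type": "idc",   "domain": "internet2.edu", "endpoint": "https://idc.internet2.edu"},
    {"type": "idc",   "domain": "es.net",        "endpoint": "https://oscars.es.net"},
    {"type": "owamp", "domain": "internet2.edu", "endpoint": "tcp://owamp.wash.internet2.edu:861"},
]

def discover(service_type, registry=REGISTRY):
    """DCN asks 'Who can I talk to?'            -> discover('idc')
    perfSONAR asks 'What measurements exist?'   -> discover('owamp')"""
    return [r["endpoint"] for r in registry if r["type"] == service_type]

print(discover("idc"))     # circuit-capable peers
print(discover("owamp"))   # available latency measurement points
```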
Unified Network Information Services
• Internet2 Information Services Working Group (IS-WG) • Working to unify and extend these services
• This effort will result in a common network cyberinfrastructure discovery and information service • The “nameservers” of dynamic, visible networks
Overview
• Strategic Planning • Network Cyberinfrastructure • Information Services • DCN • perfSONAR
• Support for Network Research • Next Generation Networking
DCN Summary
• Provides short-term dedicated bandwidth • Similar and complementary to IP networking:
• Protocol-based connections
• Connect to anyone else on the network
• Supports high-bandwidth and real-time applications • Being developed and deployed by a number of R&E networks • More flexible (and potentially more cost-effective) than long-term dedicated circuits
Example DCN Applications
• LHC CMS (UNL -> FNAL) • LHC ATLAS (BU, UMich -> BNL) • LIGO (Syracuse -> U Wisc Milwaukee -> Caltech) • Ultragrid (LSU -> Masaryk) • Weather Simulation (Northrop Grumman/Utah -> NG/MAX)
Potential DCN Applications
• Distributed backup service • Automated data distribution service (“warm” local disk repositories) • Telepresence • etc.
What is DCN?
• Dynamic Circuit Network • Generic term • Internet2 DCN is an example
• Physical network • Proto-duction Service
• DCN Software Suite (DCN SS) • One implementation of a DCN
• Combination of DRAGON and OSCARS • Jointly developed with ESnet, MAX, USC ISI East
• Interoperable with other implementations speaking the InterDomain Control Protocol (IDCP)
What is IDCP?
• InterDomain Control Protocol • Developed jointly with ESnet and GÉANT2 • Enables creation of point-to-point circuits between users across multiple intermediate networks • Parallel to the telephone network
• How it works • End-user or application sends connection request to control plane • Control plane software automates authorization, reservation, set up, and tear down of circuits • Working with GLIF and OGF; expect IDCP to evolve as the GLIF and OGF community develops a common protocol
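The request → authorize → reserve → set up → tear down lifecycle described above can be sketched as follows. This is a toy model of the control-plane behavior, not the actual IDCP message schema; the class name, states, and fields are all invented for illustration.

```python
# Illustrative circuit lifecycle: automated authorization, reservation,
# setup, and teardown, as the slide describes. Not real IDCP.
import itertools

class InterDomainController:
    _ids = itertools.count(1)          # simple circuit-id generator

    def __init__(self, authorized_users):
        self.authorized = set(authorized_users)
        self.circuits = {}             # circuit id -> lifecycle state

    def request(self, user, src, dst, bandwidth_mbps, duration_s):
        """End-user or application submits a connection request."""
        if user not in self.authorized:        # automated authorization
            raise PermissionError(user)
        cid = next(self._ids)
        self.circuits[cid] = "RESERVED"        # bandwidth held for the window
        return cid

    def setup(self, cid):
        self.circuits[cid] = "ACTIVE"          # cross-domain path provisioned

    def teardown(self, cid):
        self.circuits[cid] = "RELEASED"        # resources returned to the pool

idc = InterDomainController(authorized_users={"alice"})
cid = idc.request("alice", "harvard.edu", "caltech.edu", 1000, 3600)
idc.setup(cid)
idc.teardown(cid)
print(cid, idc.circuits[cid])
```

In the real system the request would cross multiple intermediate domains, each running its own controller and honoring the same protocol.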
What is DCN? DCN Software Suite
• DCN SS is a matched pair of open software services
• OSCARS (IDC) • Open source project maintained by Internet2 and ESnet
• DRAGON (DC) • NSF-funded • Open source project maintained by Internet2, MAX, USC ISI East, and George Mason University
• Version 0.5 is deployed
DCN Software Suite
• Policy Service (May) • Define classes of users • Options to limit provisioning based on parameters such as endpoints
• Usage Statistics (May) • Enhanced reporting of circuit activity for the Internet2 backbone
• DCN Software Suite 0.6 (October) • Modularization of functional components
[Diagram: hosts at Institution A and Institution B connect through Layer 3 routers to a Regional IP Network, the Internet2 IP Network, and a Peer IP Network (shared IP transport), and through Layer 2 switches to a Regional DC Network, the Internet2 DC Network, and a Peer DC Network (Dynamic Circuit Network)]
Why Dynamic Circuits?
Dynamic | Static
Can select different end-points, bandwidths, and durations | End-point parameters canʼt be changed
Resources are used only when needed | Resources are allocated even when not in use
Efficient – unused resources can be used by others | Inefficient – unused resources are unavailable
No charge when not in use | Charged even when not in use
Green! | Not green
Collaboration is inherently dynamic.
DCN and SCinet
• 10GE to USLHCnet/Caltech – Tier2 on floor to Caltech, UNL, Fermi, CERN
• 10GE to Northrop Grumman – weather simulation
• 10GE to Pionier – eVLBI
• 10GE to Internet2 – LONI – Ultragrid – LEARN – Texas A&M
• 1GE to Dutch booth – Phosphorus/IDC demo
• 1GE to JGN2
Use of the DCN Software Suite
Connector | Running IDC | Using DCN SS
CENIC | No | No
CIC OmniPoP | No | No
GPN | Planned | Planned
LEARN | Yes | Yes
LONI | Yes | Yes
MAX | Yes | Yes
Merit | Planned | Planned
NOX | No | No
NYSERNet | Yes | Yes
PNWGP | No | No
Use of the DCN Software Suite (cont.)
Network | Running IDC | Using DCN SS
ESnet | Yes | Yes
AutoBAHN/GÉANT | Yes | No
NetherLight | Planned | No
JGN | Yes | Yes
USLHCnet | Yes | Yes

Local/Campus | Running IDC | Using DCN SS
Northrop Grumman | Yes | Yes
University of Amsterdam | Yes | Yes
Caltech | Yes | Yes
University of Houston | Planned | Planned
Where can you learn more?
• Internet2 DCN Working Group • https://spaces.internet2.edu/display/DCN/Home
• DCN Software Suite • https://wiki.internet2.edu/confluence/display/DCNSS/Home
• Java Client API • https://wiki.internet2.edu/confluence/display/DCNSS/Java+Client+API
• Test IDC Guide • https://wiki.internet2.edu/confluence/display/DCNSS/Internet2%27s+Test+IDC
• Obtaining a Test Certificate • https://wiki.internet2.edu/confluence/display/CPD/How+to+Request+an+IDC+User+Certificate
Moving toward production….
• Internet2 has been tasked, by its governance, to migrate the DCN from R&D towards a production service.
• As a first step in that process, a staff team has begun to work with the community to define a loose set of goals for a Pilot Service to be instituted in July 2009.
• DCN Pilot is intended to be a lightweight implementation that will allow the community to gain experience before defining the final DCN service offering at a later date.
DCN Pilot Service Status Update
• Discussions underway for 8 months with AOAC, RAC, NTAC, and DCN WG about a proposed DCN Pilot Service
• Decision made to delay fee recovery until July 2010
• Moving towards roll-out of DCN pilot (focus on operations) by July 2009
• Open Issues • DCN Fee Recovery Model • How to make this an end-to-end community service, not just a backbone service • How to build demand and usage while rolling it out • Best practices guide for campus roll-out • How does this fit in the context of network cyberinfrastructure and true hybrid networking? • Which domain research communities should we work with?
• Request connectors to try it over the next few months
Overview
• Strategic Planning • Network Cyberinfrastructure • Information Services • DCN • perfSONAR
• Support for Network Research • Next Generation Networking
perfSONAR Motivation
• Performance problems become actionable when performance characteristics can be seen by all interested parties
• Multi-domain paths are the most difficult to diagnose
perfSONAR
• perfSONAR is a multi-institution project to define a suite of services and tools to support multi-domain network measurements and monitoring activities
• Tools in the perfSONAR-PS implementation: BWCTL/OWAMP/pS-B/PingER/SNMP/NDT/NPAD
• Multiple independent implementations • Built on standards being defined in the OGF
perfSONAR Roadmap
• Service Software Releases (May/June) • perfSONAR-PS v3.1 • Service Discovery • Topology Sharing • Interface and Circuit Status • Meshes of active latency and throughput tests (OWAMP/BWCTL) • Alarms on exceptional results
• pS-NPToolkit v3.1 • Easily deployed single perfSONAR Performance Node (ISO) (Exists as v2) • Bug fixes/active latency grids/enhanced admin/alarms (v3.1 June/July)
• perfSONAR Software Distribution Service (June/July) • Performance Analysis GUIs (Fall 09) • perfSONAR Performance Grids
• Manage a set of perfSONAR Performance Nodes
Demo Monitoring
• Spring Member Meeting Demo • Cisco Telepresence
• Endpoints • Harvard • Crystal Gateway Hotel
• Goals • Measure delay/jitter/loss between these points • Be able to fix any issues that come up
Regular Monitoring for Latency Sensitive Applications
• Cisco Telepresence Limits • 10 ms jitter • 160 ms delay • 0.05% loss
• Polycom Limits • 30–35 ms jitter • 300 ms delay • <1% loss
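The limits above are exactly the kind of thresholds regular monitoring checks against. The sketch below encodes them as a table and flags any measured metric that exceeds an application's limit; the function and sample values are illustrative, not part of any perfSONAR tool.

```python
# The slide's application limits, expressed as a threshold check an
# operator might run against regular OWAMP results (illustrative code).
LIMITS = {
    "cisco_telepresence": {"jitter_ms": 10, "delay_ms": 160, "loss_pct": 0.05},
    "polycom":            {"jitter_ms": 35, "delay_ms": 300, "loss_pct": 1.0},
}

def violations(measured, application):
    """Return the metrics that exceed the given application's limits."""
    limit = LIMITS[application]
    return [m for m, v in measured.items() if v > limit[m]]

sample = {"jitter_ms": 12.0, "delay_ms": 80.0, "loss_pct": 0.0}
print(violations(sample, "cisco_telepresence"))  # jitter exceeds the 10 ms limit
print(violations(sample, "polycom"))             # within Polycom's looser limits
```

The same measurement stream can thus be judged against each application's tolerance: 12 ms of jitter breaks Telepresence but not Polycom.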
Demo Monitoring – Enabling Debugging
• Path Decomposition • Deploy more hosts and run regular latency tests on smaller segments of the path between end hosts • Shows where on the path to look for the problemʼs cause
• Path Measurements • Obtain utilization statistics from routers along the end-to-end path • Allow drilling down to better understand why problems are occurring
Demo Deployment
[Diagram: measurement hosts along the path Harvard – Northern Crossroads – Internet2 POP – Internet2 POP – Mid-Atlantic Crossroads – Hotel]
Analysis Software
• Software was written or modified to make it easy to view and understand the data. • Provides a variety of views • Status of the entire network • Status of a given host • Status of a given path • Alerting mechanism when problems are seen
Network Health
• A grid view of the network describing the latency, jitter and loss between all hosts
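The grid view described above can be sketched as an all-pairs mesh of latency-test results with a per-cell status. The hosts, thresholds, and function below are illustrative only, not the actual analysis software.

```python
# Minimal sketch of a network-health grid: every (src, dst) pair gets a
# status cell based on its latest jitter result. Hosts are illustrative.
HOSTS = ["harvard", "nox", "wash-pop", "newy-pop", "max", "hotel"]

def mesh_status(results, jitter_limit_ms=10.0):
    """results maps (src, dst) -> measured jitter in ms.
    Returns a grid of 'OK' / 'BAD' / '?' (no data) / '-' (diagonal)."""
    grid = {}
    for src in HOSTS:
        for dst in HOSTS:
            if src == dst:
                grid[(src, dst)] = "-"
            else:
                j = results.get((src, dst))
                grid[(src, dst)] = ("?" if j is None
                                    else "OK" if j <= jitter_limit_ms
                                    else "BAD")
    return grid

# One bad direction stands out immediately in the mesh:
grid = mesh_status({("harvard", "hotel"): 12.0, ("hotel", "harvard"): 3.0})
print(grid[("harvard", "hotel")], grid[("hotel", "harvard")])
```

A rendered version of this grid is what lets an operator spot, at a glance, which host pairs are out of spec.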
Path Status
• Shows graphs of jitter and loss between hosts along with interface utilization for the path.
Nagios Alarms
• Alerts administrators when problems are seen • Easy integration into NOC reporting systems.
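The "easy integration into NOC reporting systems" works because Nagios plugins follow a simple convention: a one-line status message plus an exit code (0 = OK, 1 = WARNING, 2 = CRITICAL). The check below follows that convention as a sketch; the thresholds are illustrative, not the ones used in the demo.

```python
# Hedged sketch of a Nagios-style jitter check, following the standard
# plugin convention (exit 0 = OK, 1 = WARNING, 2 = CRITICAL).
def check_jitter(jitter_ms, warn=8.0, crit=10.0):
    """Return (exit_code, status line) for a measured jitter value."""
    if jitter_ms >= crit:
        return 2, f"CRITICAL - jitter {jitter_ms:.1f} ms >= {crit} ms"
    if jitter_ms >= warn:
        return 1, f"WARNING - jitter {jitter_ms:.1f} ms >= {warn} ms"
    return 0, f"OK - jitter {jitter_ms:.1f} ms"

code, message = check_jitter(12.0)
print(message)   # Nagios keys purely off the exit code and this one line
```

A real plugin would finish with `sys.exit(code)`; since Nagios only inspects the exit status and the first output line, any measurement source (OWAMP results, SNMP counters) can feed the same alerting path.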
Network Performance Analysis
• Several Potential Issues Identified • Highly utilized link in the path • Cross traffic • Test machine capability/quality • Other software running on the hosts • NTP Drift
• All were solved and verified through diagnostics and monitoring
Highly Utilized Link
• Initial observation: High jitter values observed between Hotel and Harvard.
• Process: Isolate where the jitter is happening.
High Jitter Seen
[Path diagram: high jitter observed end-to-end between Harvard and Hotel, across Northern Crossroads, two Internet2 POPs, and Mid-Atlantic Crossroads]
Jitter still on shorter path
[Path diagram: successive tests on progressively shorter segments of the Harvard–Hotel path still show the jitter]
Clean between Hotel/MAX
• Jitter issue seems to be between MAX/Internet2 (WASH)
[Path diagram: the Hotel–Mid-Atlantic Crossroads segment tests clean; jitter isolated to the MAX–Internet2 WASH POP segment]
Highly Utilized Link
• What we know via OWAMP • Jitter is not between Hotel and MAX • Jitter is somewhere between MAX and New York
• Next Steps • Drill down using alternate data sources • What do we have access to? • SNMP on MAX, Internet2 Backbone (via perfSONAR, of course!) • Can inquire about NOX/Harvard if necessary
Highly Utilized Link
• Examine each leg of the path: • Hotel to College Park • College Park core • College Park to Level3 (McLean, VA) • Internet2 uplink
• Identify points of congestion • 1G uplink from Hotel to College Park • 10G MAX core • 2.5G Internet2 uplink
Highly Utilized Link - Results
• Potential Solutions • Identify flows, re-engineer traffic • Re-plumb the demo path • Increase Capacity
• Results • Increase MAX Headroom to 10G
Overview
• Strategic Planning • Network Cyberinfrastructure • Information Services • DCN • perfSONAR
• Support for Network Research • Next Generation Networking
General Support: Internet2 Observatory
• Data Collections • Routing, Netflow, Latency, Bandwidth, Utilization, Router, Logging • Interactive via proxy; also trouble tickets • More being exported via perfSONAR • Starting to expand to L2/DCN
• Collocation for Network Research Projects • PlanetLab / VINI Nodes at Router Nodes • 100x100 (NetFPGA) Nodes
Making Support Less Ad-Hoc
• The Research Advisory Council has commissioned a Network Research Review Committee • Examine policies on passive data collection, potentially recommend changes • Look at Internet2 network research priorities • Review ad-hoc requests that require resources
• http://www.internet2.edu/networkresearch/nrrc.html
100x100 Project
• Original clean-slate project: get 100 Mbps to 100M homes
• In the last year, deployed NetFPGA PCs for Nick McKeown of Stanford • NEWY, HOUS, LOSA, (WASH)
NetFPGA possibilities
• Programmable routers: demonstrated reduced need for buffers in highly aggregated paths at SIGCOMM 08
• Switches: demonstrated OpenFlow at GEC3 and SC|08
GENI: Internet2 Backbone Contribution
• A dedicated 10 Gbps wave on the Internet2 DWDM system • Participated with Emulab / University of Utah in first (Spiral 1) solicitation • Potential use by other projects whenever possible
• Have an MoU with the GPO for use of the Wave • GPO controls how wave is used
• Access is through regional and campus networks, except for collocated equipment • IP Network • Dynamic Circuit Network (DCN) • Direct Connectivity for Control Framework
Current GPO projects
• ProtoGENI [ProtoGENI] R. Ricci/Utah • Internet-Scale Overlay Hosting
[PlanetLab] J. Turner/WUSTL
ProtoGENI
• Take Utahʼs Emulab software (computers + network in a room) and GENI-ize it. The work to do that is funded by NSF.
• Deploy this nationwide, as a substrate example for GENI. Include control of switches to slice network.
Internet-Scale Overlay Hosting
• Similar to ProtoGENI: create a programmable node out of off-the-shelf parts. Use parts that would make for a better network element – focus on routing/switching, not on complete experiments.
• Part of the PlanetLab framework (as opposed to ProtoGENI, so it exercises a different control plane)
Year 1 (near term) deployment
• Looking to deploy 3 nodes • KANS, SALT, {WASH} • First two due by 1 July • Two projects have agreed to collocate and share equipment
Potential Future Projects
• At the last GENI Engineering Conference, emphasis on “Layer 2 connectivity” to enable non-IP projects • Specific need unclear; would require more equipment to multiplex onto 10GE; DCN possible
• Unknown: results of Spiral 2 solicitation (perhaps by end of June?)
Parting Thought
• Utilize some of these testbeds to try out new network architecture ideas?
• Partner with some research groups? • For example, perhaps use 100x100 NetFPGA nodes configured as OpenFlow switches (or GENI) to look at hybrid or different IP network strategies
Overview
• Strategic Planning • Network Cyberinfrastructure • Information Services • DCN • perfSONAR
• Support for Network Research • Next Generation Networking
What should the next Internet2 Network Look Like?
• Beginning to be considered by Councils and WGs
• Questions around how to fund it • Sufficient depreciation? • How to fund new, advanced services?
• Evolution (series of small decisions) vs. a brand new network
Role of R&D in Network Design
• We have a range of options available • Some are well understood • Some are not
• Role of R&D effort is to explore the network design space
• What might we consider?
Next Generation Network
• Many axes of differentiation • Heterogeneous vendors • Heterogeneous transport technologies • Application- vs. network-driven transport technology selection • API (e.g. Phoebus, Lambda Station, Terapaths) • Flow analysis (e.g. FNAL flow analysis)
• Decouple control plane from data plane • e.g. OpenFlow
• Run multiple virtual networks (some experimental, some production) on the same infrastructure • e.g. OpenFlow
Design Cycle
• General Design Cycle • New Ideas (Chief Scientist, Network Research community, Internet2 WGs) • Implementation (CTO, Internet2 WGs) • Operationalization (ED of Network Services, IU NOC)
• Not every idea makes it through all 3 phases
• Aspects of particular technologies may exist concurrently in different phases
Conclusion
• Strategic planning prioritization and implementation underway
• Development of Network Cyberinfrastructure well underway • Calling on regionals and campuses to deploy advanced services
• Support for network research • Need for research in many arenas to inform next generation network design • Current funding paradigm does not map well to design cycle