OCC-Executive-Summary-20150323

© OpenCloud Connect 2015. Any reproduction of this document, or any portion thereof, shall contain the following statement: "Reproduced with permission of OpenCloud Connect." No user of this document is authorized to modify any of the information contained herein.

OpenCloud Connect Technical White Paper

The Benefits and Challenges Ahead for Cloud Operators

Executive Summary

Contributing Organizations and Authors

Alcatel-Lucent

Sunil Khandekar, President & CEO, Nuage Networks

Avaya

Dan Romascanu, Director

Ciena

John Hawkins, Senior Product Marketing Manager

Citrix

Tom Davies, Senior Director, Cloud Platforms

Coresite

Ted Chamberlin, Vice President Market Development, Cloud

Cyan

Abel Tong, Director of Product Management

Ericsson

Scott Mansfield, Principal Engineer

Equinix

Lane Patterson, Chief Technologist

HP

Dave Larson, VP & CTO, HP Networking

Huawei

Justin Dustzadeh, CTO & VP, Technology Strategy, Networks

Juniper Networks

Doug Wills, Senior Director, Product Marketing & Product Management

PCCW Global

Shahar Steiff, AVP Business Operations

Spirent Communications

Jeffrey Schmitz, CMO

Tata Communications

James Walker, Vice President, Managed Network Services

Telx

Les Williams, Director, Product Development & Strategic Partnerships

Verizon

Mike Bencheck, Verizon Fellow

Additional contributors:

Sam Youn, Director Network Architecture, Equinix;

Lucy Yong, Principal Engineer, Huawei Communications;

Haruki Sonehara, Senior Product Manager, Juniper Networks;

Paul To, Director Product Marketing – SDN & Data Center, Spirent Communications;

Henry Bohannon, Senior Director Head of Ethernet Product Management, Tata Communications;

Technical writers: John O'Shaughnessy, Lionel Snell and Daniel Bar-Lev

Abstract

This whitepaper explores the benefits and challenges confronting service providers, large businesses and network vendors in the deployment of Ethernet-based cloud services. It outlines potential areas for collaboration, focused on open standards, data center interoperability requirements and cloud service use cases. It also identifies core technologies such as OpenFlow, Network Functions Virtualization (NFV) and Software Defined Networking (SDN) that could be used by cloud operators to improve workflow automation, network programmability and cloud service delivery over time.

At the center of these changes is Ethernet. As a transport protocol, Ethernet has emerged over the last 30 years as the network of networks. Its unique ability to support a flexible and dynamic cloud infrastructure will be explored in some detail in this whitepaper. The paper concludes with a summary of the specific initiatives that OpenCloud Connect* is evaluating to advance the use of open and standard technologies that may lower the cost of delivering cloud services in the future and accelerate their uptake.

* OCC member organizations include Data Center Providers, WAN Service Providers, System Integrators, Network Equipment Manufacturers, and Cloud Service Providers (CSPs). Activities include identifying required technical approaches and use cases, developing specifications and implementation agreements, promoting solutions developed by the OCC, identifying required business processes, and developing recommended best practices. For more information, please visit http://www.opencloudconnect.org.

Introduction

Fifteen years ago, Carrier Ethernet fundamentally changed the way service providers deliver Internet, phone and TV services. About 10 years ago, virtualization began to reshape the server and storage business for large enterprises, lowering capital and operating costs by 30-50 percent. Today, Cloud Ethernet infrastructures are poised to lower the development costs and improve the delivery times of both cloud and mobile services.

The cloud is expected to reshape the futures of many businesses for the next 10+ years. Most cloud experts

forecast three waves of change:

1. Data centers are consolidating. Fortune 500 businesses are spending billions of dollars per year to consolidate smaller data centers into large cloud infrastructures, allowing them to shed enormous IT operating costs (e.g. Amazon, Boeing, Walmart).

2. Service providers are consolidating. Traditional telecom providers are shedding fixed-line assets to reinvest in mobile and cloud infrastructures (e.g. SingTel, Vodafone, and Verizon). This telecom activity is driving a new wave of service provider consolidation and is shifting capital spending priorities across all geographies.

3. Software and hardware vendors are reinventing themselves, investing in IT or network services as a hedge to first protect and then grow their traditional software and hardware businesses (e.g. IBM, Microsoft, Oracle).

One of the primary goals of OpenCloud Connect is to advance technology standards, vendor interoperability and best design practices. These initiatives will help businesses create a flexible and dynamic cloud infrastructure. They will also reduce the cost and time-to-market of developing new cloud and mobile applications.

Ethernet

Ethernet is known as ‘The Network for Networks’ for good reason. Created in 1973 and established as an IEEE standard in 1983, Ethernet is the network of choice for campus, wireless and mobile networks. In data centers, it is used for physical connectivity of Virtual Machines (VMs). Ethernet is the transport protocol that connects VMs within servers, across racks of servers, and between data centers. Carrier Ethernet and data center bridging are used to connect data centers across metro, regional, and broadly distributed geographic areas. Simply put, Ethernet has become the primary building block for cloud network connectivity.

Ethernet offers four main benefits for cloud operators:

Automation

Programmability

Interoperability

Cost-effectiveness

The Cloud

Cloud data centers represent the power grids of the 21st century. Users can tap into any IP network and dynamically provision compute and storage assets to help run their business. Businesses are no longer required to make large capital investments in IT infrastructure and IT applications – instead, they can "rent" IT resources using the cloud.

Five essential characteristics of the cloud are as follows:

1. On-demand self-service – Users can easily provision and release resources when and where they need them, without requiring human interaction with the service provider.

2. Broad network access – Cloud resources and applications are readily accessible over a broad network. A variety of technologies and services can be used, such as Carrier Ethernet and Cloud

3. Resource pooling – Resources are pooled to serve multiple users in a "rent-what-you-use" model.

4. Rapid elasticity – Compute, storage and network assets are elastic for cloud users, increasing service agility.

5. Measured services – Cloud usage can be monitored and recorded to support a usage-based billing model. Big data analytics can also be used to monetize network and user traffic.

Challenges for Cloud Service Providers

While it is clear that the majority of IT decision makers plan to deploy a hybrid cloud strategy, technical

challenges remain. These challenges include:

Consistent security and network policy enforcement – Enterprises usually have their own set of network and security policies enforced in their on-premise infrastructure, using tools like firewalls, load balancers, IDS/IPS, and traffic monitoring. They seek to extend their own policies and enforcement architectures into the hybrid cloud rather than adopting a different set of standards imposed by a cloud provider.

Security & Privacy – Recent headline-garnering security breaches in the defense industry have significantly heightened cloud security and privacy concerns. In some instances, security breaches have tempered enterprises' willingness to put sensitive data and applications on the cloud. Cloud providers need to prepare for data residency and data privacy rules and regulations that will be imposed by governments worldwide, and find creative ways to assure security and privacy of data.

Layer 2 scale & VM migration between on-premise data centers and cloud data centers – Since the whole point of hybrid cloud is elastic scaling, the Virtual Machine to Physical Machine ratio and the rate of VM movement are going to grow dramatically during the next few years. Additionally, the number of MACs and VLANs is also skyrocketing, as data centers grow bigger to support large-scale multi-tenancy requirements. The combination of these factors presents significant challenges to current data center network architectures.

Compatibility and complexity – On-premise data centers use a wide variety of hypervisors, VM

templates, virtual switches and network virtualization. To support elastic scaling and dynamic

movement of VMs, cloud providers require an open and secure “data center bridging” approach to

improve the interoperability of compute, storage, and networking assets.

Slow network provisioning compared to compute/storage – Today's network services are typically

provisioned in days or weeks, while VMs and Virtual Storage are provisioned in minutes or hours. To

compensate, providers and users need to over-provision the network pipes to handle peak loads, and

VM architectures rely on overlay tunnels to decouple VM provisioning from the underlying network.

Rigidity in network billing / business models – The elastic scaling of cloud computing means that users should be able to "rent" computing and storage resources on demand and pay only for what they use. Usage-based billing is still uncommon for most network services. Cloud providers need to implement metering mechanisms for compute, storage and network resources. This will create new business models to support usage-based billing.

Lack of bandwidth guarantees – Together with CoS and network changes due to topology change or switch failure, the lack of bandwidth guarantees can cause critical service interruptions, because path determination algorithms can take up to 30 seconds. On the other hand, transport networks (SONET, Carrier Ethernet, MPLS-TP, etc.) do provide a means to guarantee bandwidth and to ensure carrier-class resiliency. Unfortunately, some of these approaches take time to set up and may require manual configuration by the network operator. In recent years, additional capabilities have been added to Ethernet under the IEEE Data Center Bridging (DCB) work group; however, there is still a gap between the capabilities offered by transport networks and the network capabilities available inside the data center.
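
The metering requirement described under the billing challenge above can be illustrated with a small sketch. The Python below is not drawn from any provider's system: the resource names, units and per-unit rates are hypothetical, chosen only to show how usage records for compute, storage and network resources might be aggregated into one usage-based charge per tenant.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical per-unit rates; real tariffs are set by each provider.
RATES = {
    "compute_vm_hours": 0.05,    # per VM-hour
    "storage_gb_hours": 0.0001,  # per GB-hour
    "network_gb": 0.02,          # per GB transferred
}

@dataclass(frozen=True)
class UsageRecord:
    tenant: str
    resource: str    # must be a key of RATES
    quantity: float  # metered amount in the unit named by the resource

def bill(records):
    """Sum metered usage per tenant and price it with RATES."""
    totals = defaultdict(float)
    for record in records:
        totals[record.tenant] += record.quantity * RATES[record.resource]
    return {tenant: round(amount, 2) for tenant, amount in totals.items()}

records = [
    UsageRecord("tenant-a", "compute_vm_hours", 100),
    UsageRecord("tenant-a", "network_gb", 250),
    UsageRecord("tenant-b", "storage_gb_hours", 50_000),
]
print(bill(records))
```

The point of the sketch is that network usage is metered and priced in the same record format as compute and storage, which is what a usage-based network business model would require.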

The gains to be made from solving the above challenges have driven industry stakeholders to develop several enabling technologies that focus on increasing the agility and lowering the cost of cloud computing infrastructures. None of these enabling technologies provides a complete solution for cloud service providers. Even where a combination of these and other technologies would be appropriate, there is very little in the way of implementation agreements between network operators or industry best practices to improve deployments. This is a key area to which OpenCloud Connect can and will contribute.

The following is a summary of key technology layers proposed:

Carrier Ethernet Services – Challenges posed in a very large-scale cloud infrastructure environment when using the currently defined Carrier Ethernet services (CE 2.0) include a lack of support for the "on-demand self-service" and "rapid elasticity" characteristics of cloud services.

Network Virtualization – Network virtualization separates control and forwarding tasks and leverages the capabilities of x86 servers, allowing services to elastically scale based on the needs of the cloud service. This dramatically reduces the time-to-market for these applications and allows the introduction of a lower cost, usage-based business model.

Network Functions Virtualization – NFV can improve the agility and speed of provisioning of certain network functions between campus, hybrid and cloud networks. This allows enterprises to mirror their own network and security policies using a service chain of virtual firewalls, load balancers, and other network applications of their own choice. The challenges of implementing NFV are:

Performance and scale vary widely in virtual switches and virtual network appliances.

An intricate mix of hardware and software tuning factors can greatly impact performance.

It is difficult to benchmark performance and protect QoS, QoE and SLA in a virtual service chain.

Virtual Overlay Networks – The challenges that Virtual Overlay Networks pose for the cloud service

providers include:

WAN networks are not aware of the existence of virtual overlay networks, which creates traffic engineering challenges in the underlying network for WAN Service Providers

Tunnels often carry aggregated traffic from different virtual overlay networks, which complicates QoS treatment and reporting

Difficulty in trouble isolation between the overlay tunnel and the WAN due to lack of coordination

or overlap of their respective OAMs

VM movements in the overlay virtual network without consideration of underlying network resources may create unexpected consequences for cloud applications

In the case of VXLAN, the maximum payload size is 1464 bytes, which means a full-size 802.3/802.1Q frame cannot be carried in one packet and therefore has to be fragmented. This fragmentation can degrade the performance of a CPU-based switch. In a hyperscale cloud environment, it is not feasible to manually change and monitor every interface.

In the case of NVGRE, the user is restricted to a Microsoft Hyper-V environment. Use of NVGRE in other environments is problematic when there are packet fragments. This is sometimes due to the lack of checksum flags in the IP payload.

In the case of STT, a TCP-like header (SEQ/window size) and behavior (SYN/ACK) are required, which reduces traffic throughput unless the customer invests in TCP offload engines (NIC). Also, if intermediate devices such as firewalls, intrusion prevention systems, load balancers or proxies receive this flow, there is a risk of STT packets being dropped.
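
The 1464-byte figure quoted for VXLAN above is consistent with a 1500-byte underlay IP MTU once the outer headers are subtracted. The arithmetic below is a sketch under assumptions the text does not state: an outer IPv4 header without options (20 bytes), an outer UDP header (8 bytes) and the 8-byte VXLAN header defined in RFC 7348. Frame sizes exclude the 4-byte Ethernet FCS.

```python
# Header sizes in bytes (assumptions: outer IPv4 without options, VXLAN per RFC 7348).
OUTER_IPV4 = 20
OUTER_UDP = 8
VXLAN_HEADER = 8
ETHERNET_HEADER = 14  # destination MAC + source MAC + EtherType, no FCS
DOT1Q_TAG = 4         # 802.1Q VLAN tag

ENCAP_OVERHEAD = OUTER_IPV4 + OUTER_UDP + VXLAN_HEADER

def inner_frame_budget(underlay_ip_mtu):
    """Bytes left for the encapsulated Ethernet frame in one underlay IP packet."""
    return underlay_ip_mtu - ENCAP_OVERHEAD

def required_underlay_mtu(inner_ip_mtu, tagged=False):
    """Underlay IP MTU needed to carry a full-size inner frame unfragmented."""
    inner_frame = inner_ip_mtu + ETHERNET_HEADER + (DOT1Q_TAG if tagged else 0)
    return inner_frame + ENCAP_OVERHEAD

print(inner_frame_budget(1500))                  # 1464
print(required_underlay_mtu(1500))               # 1550
print(required_underlay_mtu(1500, tagged=True))  # 1554
```

Under these assumptions a full-size inner frame (1514 bytes untagged, 1518 bytes with an 802.1Q tag) exceeds the 1464-byte budget, which is the fragmentation problem described above.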

Software Defined Networking (SDN) and OpenFlow – The challenges posed for service providers

in the cloud services market include:

Centralization – The centralization paradigm in OpenFlow means that OpenFlow controllers may have to cope with exponential workload/resource growth. In other words, when a cloud provider is managing millions or tens of millions of VMs that need to be created, terminated and transferred at high rates, the OpenFlow controller has to ensure that it can handle these thousands or tens of thousands of events in short time intervals without any disruption.

Packet Punting – The OpenFlow paradigm also requires the forwarding of the first one or more

packets in a flow to the controller. This model can break down quickly in an environment with

tens of millions of VM flows.

Security & Scalability – The centralized nature of an OpenFlow controller also raises scalability and security issues for some large cloud operators. These challenges can be addressed, but more work is required between the ONF, ETSI and the newly formed OpenCloud Connect. Today, most large cloud operators (e.g. Amazon, Facebook, Google) are considering a multi-protocol hybrid use case, which combines the programmability of OpenFlow with the scalability of secure network protocols such as MPLS and BGP.

Forwarding protocol – OpenFlow defines a unique packet format that differs from the ubiquitous Ethernet standard. Having both OpenFlow and Ethernet-based cloud architectures may well mean simultaneously managing two different architectures in the data center. An example of this situation is the one currently faced with Fibre Channel and Ethernet, which reduces efficiency and increases complexity.
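
The packet punting behavior described above can be sketched as a toy simulation. This is not the OpenFlow protocol or any real controller API: the `Controller` and `Switch` classes, the flow key and the flow counts are all hypothetical, and serve only to show that a reactive model generates one controller event per new flow.

```python
class Controller:
    """Toy controller: counts punted packets and returns a forwarding action."""

    def __init__(self):
        self.packet_in_events = 0

    def packet_in(self, flow):
        self.packet_in_events += 1
        return "forward"  # placeholder action; a real controller computes a path

class Switch:
    """Toy switch with a reactive flow table keyed by (source, destination)."""

    def __init__(self, controller):
        self.controller = controller
        self.flow_table = {}

    def receive(self, src, dst):
        flow = (src, dst)
        if flow not in self.flow_table:
            # Table miss: punt the first packet of the flow to the controller.
            self.flow_table[flow] = self.controller.packet_in(flow)
        return self.flow_table[flow]

controller = Controller()
switch = Switch(controller)

# 1,000 hypothetical VM-to-VM flows with 10 packets each.
for flow_id in range(1000):
    for _ in range(10):
        switch.receive(f"vm-{flow_id}", f"vm-{flow_id + 1}")

print(controller.packet_in_events)  # 1000: one controller event per new flow
```

Controller load in this model grows with the number of new flows rather than the number of packets, so an environment with tens of millions of VM flows translates directly into tens of millions of controller events.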

Cloud Orchestration Framework – The challenges posed when using Cloud Orchestration

Frameworks include:

The cloud orchestration ecosystem is complex, with several competing platforms. Interoperability is a concern, and cloud providers often have to support multiple standards in order to on-ramp enterprise customers.

Deployment of these orchestration platforms can be complex. Some open source components

require a large amount of system integration and/or customization efforts.

Recommended OpenCloud Connect Initiatives

Carrier Ethernet Meets Network Virtualization

Just as SDN and NFV technologies bring agility and programmability to the Ethernet fabric within data

center networks, opportunities exist to apply the same or similar solution to Carrier Ethernet for inter-cloud

connections.

For example, SDN technologies like NFV and OpenFlow can greatly improve the agility and programmability of Ethernet-based cloud services by providing a single controller or management dashboard that makes it easier to program networks. NFV technologies that can dynamically scale and deliver service chaining for security, traffic management and QoS are getting a lot of attention. NFV can also support the rapid elasticity of cloud services. OpenFlow can play a complementary role as a network device driver, lowering certain operating costs.

Another example is cloud orchestration for Carrier Ethernet: the already well-defined Ethernet service types, UNI/ENNI interfaces and service attributes can be exposed via API to cloud orchestration platforms like OpenStack or CloudStack, thus enabling true end-to-end provisioning of services across inter- and intra-data center compute and network infrastructure.

End to End Provisioning

End to End Provisioning of a cloud service is agnostic to the devices and various technologies that make up the network. Carrier Ethernet enabled devices in a network provide for greater flexibility in the types of services that can be provided. It is the intention of OpenCloud Connect to work towards defining the following with regard to cloud services using Ethernet transport.

APIs

Cloud Service Provider (CSP) to Telecom Service Provider (TSP) and End User to CSP – the APIs must be secure and must provide asynchronous notification of provisioning status, transactional awareness and rollback. The requirements are for low to zero human intervention during provisioning, on-demand delivery, and possibly fault and performance monitoring as well.
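
The API requirements above (asynchronous status notification, transactional awareness and rollback) can be sketched in a few lines. The code below is illustrative only and does not represent an OpenCloud Connect API: the `provision` function, the step names and the `notify` callback are hypothetical, and the callback stands in for whatever notification channel a real API would use.

```python
import asyncio

async def provision(steps, notify):
    """Run steps in order; on failure, undo completed steps in reverse order."""
    done = []
    for name, apply_step, undo_step in steps:
        try:
            await apply_step()
            done.append((name, undo_step))
            await notify(f"{name}: provisioned")
        except Exception as exc:
            await notify(f"{name}: failed ({exc}); rolling back")
            for prev_name, prev_undo in reversed(done):
                await prev_undo()
                await notify(f"{prev_name}: rolled back")
            return False
    await notify("service active")
    return True

async def main():
    events = []

    async def notify(message):  # stand-in for a webhook or message queue
        events.append(message)

    async def ok():             # hypothetical step that succeeds
        await asyncio.sleep(0)

    async def fail():           # hypothetical step that fails
        raise RuntimeError("no capacity")

    steps = [
        ("reserve-bandwidth", ok, ok),
        ("create-evc", ok, ok),
        ("attach-vm-network", fail, ok),
    ]
    result = await provision(steps, notify)
    return result, events

result, events = asyncio.run(main())
print(result)
for event in events:
    print(event)
```

In this run the third step fails, so the two completed steps are undone in reverse order and the caller is notified at each stage without any human intervention.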

Cloud Service Provisioning

Despite the popularity of cloud services, there is a significant gap between the leading Cloud Providers and the rest of the industry when it comes to provisioning cloud resources. OpenCloud Connect will consider how network operators, data center operators and carriers can plan and provision infrastructure resources according to statistical usage and forecasts, rather than for individual cloud applications. Decoupling cloud applications from the physical infrastructure will enable this.

Deterministic Traffic Performance and Behavior

When cloud service providers (CSPs) purchase services from telecom service providers (TSPs) to interconnect two or more data centers and/or end users, they have traffic performance and behavior expectations. These may be related to a level of performance required for the application, to meeting legal requirements for keeping data within specified geographic boundaries at all times, or to the telecom service ensuring a redundant path between the two end points.

Service Performance for Application Fulfillment

An application being provided to an end user by a cloud service provider may have demanding performance requirements in terms of latency or jitter, for example for real-time financial trading, storage, VM migration, video and some VoIP services. Service performance is not only a critical consideration where it relates to inter-DC connectivity; it may also extend to the end user of the application or service, who may have other limitations relating to response time or user experience for particular applications.

Legal Requirements for Data Management

Several governments have already passed legislation on areas like the transfer and storage of financial or

personal data, and more countries are considering implementing such rules. For a TSP to ensure that they

comply with the requirements of the CSP, the TSP must have a way to provide a deterministic path for any

telecom service that they provide, and to maintain both its performance and geographic constraints at all

times. Any circuit which does not comply with the CSP's needs should be marked as “down” – this is a

significant change as enterprises have previously wanted a service to remain “up” as long as traffic was

able to be passed end to end.
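
The change in what counts as "up" can be expressed as a simple rule. The sketch below is illustrative only: the `Constraints` and `CircuitStatus` types, their field names and the example region codes and latency values are hypothetical. It shows a circuit being reported as "down" when it still passes traffic but violates a performance or geographic constraint.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Constraints:
    max_latency_ms: float
    allowed_regions: frozenset  # regions the data is permitted to transit

@dataclass(frozen=True)
class CircuitStatus:
    passing_traffic: bool
    latency_ms: float
    regions_on_path: frozenset

def operational_state(status, constraints):
    """Return "up" only if traffic flows AND every CSP constraint is met."""
    if not status.passing_traffic:
        return "down"
    if status.latency_ms > constraints.max_latency_ms:
        return "down"
    if not status.regions_on_path <= constraints.allowed_regions:
        return "down"  # the path has left the permitted geography
    return "up"

constraints = Constraints(max_latency_ms=20.0, allowed_regions=frozenset({"DE", "FR"}))
compliant = CircuitStatus(True, 12.0, frozenset({"DE", "FR"}))
rerouted = CircuitStatus(True, 12.0, frozenset({"DE", "CH"}))

print(operational_state(compliant, constraints))  # up
print(operational_state(rerouted, constraints))   # down
```

The rerouted circuit would have been considered "up" under the traditional definition, since traffic still passes end to end.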

Redundant/Diverse Telecom Facilities

CSPs may have services sufficiently important to demand redundant or diverse telecom facilities between two or more data centers. To maintain the level of protection that the CSP intended, the TSP will need to define one or more deterministic service paths for the CSP, each of which stays on a specific fiber route distinct from other services. If the CSP has more than one TSP, this deterministic redundancy is critical.
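
A diversity check of this kind reduces to confirming that no fiber segment is shared between paths. The sketch below assumes each path can be described as a list of segment identifiers that are comparable across providers; the identifiers shown are hypothetical, and obtaining such comparable identifiers is itself an assumption.

```python
def shared_segments(path_a, path_b):
    """Return the fiber segments two paths have in common (empty set if diverse)."""
    return set(path_a) & set(path_b)

def is_diverse(paths):
    """True if no fiber segment appears on more than one path."""
    seen = set()
    for path in paths:
        segments = set(path)
        if seen & segments:
            return False
        seen |= segments
    return True

# Hypothetical segment identifiers for two circuits bought from different TSPs.
primary = ["seg-101", "seg-102", "seg-103"]
backup = ["seg-201", "seg-102", "seg-203"]

print(is_diverse([primary, backup]))     # False
print(shared_segments(primary, backup))  # {'seg-102'}
```

In this example the two circuits share one segment, so a single fiber cut could take down both even though they come from different providers.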

OpenCloud Connect Involvement

OpenCloud Connect provides the bridge between CSPs and TSPs. As a part of this bridge, OpenCloud Connect clearly defines the requirements of CSPs in terms that TSPs can understand. The Forum plans to develop requirements for service definitions of Deterministic Traffic Service that can be used by both CSPs and TSPs. This common set of definitions will allow CSPs to clearly articulate their requirements and allow TSPs to state the parameters offered by their telecom services, which could be a wide range of attributes such as physical route, link capacity (which may be variable or asymmetric), latency, resiliency, and perhaps even cost. In turn, these attributes need to be signaled to, and understood by, not only the networking equipment impacted but also the virtualization and orchestration layers.

Overlay Tunnel

An overlay tunnel is used to encapsulate one packet in another packet and forward the encapsulated packet

to the endpoint where it is de-encapsulated. Network overlays leverage this "packet in a packet" technique

to provide secure multi-tenancy services over the same underlying network. The important point is that

these overlay tunnels are not visible to service providers.
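
The "packet in a packet" technique can be shown with VXLAN, one of the overlay formats mentioned earlier, as an example. The sketch below builds only the 8-byte VXLAN header from RFC 7348 (an 8-bit flags field with the VNI-valid bit, a 24-bit VNI and reserved bits) and omits the outer Ethernet, IP and UDP headers that a real tunnel endpoint would add. The function names and the sample frame are hypothetical.

```python
import struct

VXLAN_FLAG_VNI_VALID = 0x08  # "I" flag from RFC 7348

def vxlan_encapsulate(vni, inner_frame):
    """Prefix an inner Ethernet frame with an 8-byte VXLAN header."""
    if not 0 <= vni < 2**24:
        raise ValueError("VNI must fit in 24 bits")
    header = struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)
    return header + inner_frame

def vxlan_decapsulate(packet):
    """Return (vni, inner_frame) from a VXLAN payload."""
    word0, word1 = struct.unpack("!II", packet[:8])
    if not (word0 >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VNI flag not set")
    return word1 >> 8, packet[8:]

# Sample inner frame: broadcast destination, made-up source MAC, IPv4 EtherType.
inner = (bytes.fromhex("ffffffffffff") + bytes.fromhex("020000000001")
         + b"\x08\x00" + b"payload")
packet = vxlan_encapsulate(5001, inner)
vni, recovered = vxlan_decapsulate(packet)

print(vni, recovered == inner, len(packet) - len(inner))  # 5001 True 8
```

The tenant identifier (the VNI) travels inside the tunnel payload, so a WAN device that forwards only on the outer headers never sees it, which is why such tunnels are not visible to service providers.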

Overlay tunnels will be used in data centers but pose serious scaling challenges in the WAN. OpenCloud Connect will address this by focusing on how a WAN can best connect two overlay tunnels in different DCs, or carry a set of overlay tunnels between two DCs.

VPNs over IP/MPLS

Today, SP networks also use layer technologies to support VPNs over one IP/MPLS network. The difference between virtual overlay networks and the SP VPN implementation is that the latter is a physical network and requires the service provider to provision the service before the VPN sites are interconnected. This characteristic gives the service provider the opportunity to plan the infrastructure network resources ahead, but is a problem for cloud applications.

Collaboration with other SDOs

An important part of the work of OpenCloud Connect will be to understand the work of other SDOs and to

encourage the adoption and inclusion of relevant standards into the framework and implementation a

Conclusion

The networking industry is currently facing a series of inflection points. Advances across several disciplines are disrupting traditional network and compute business models. The cloud is reshaping the future of service providers and large enterprises as much as the Internet, VoIP and mobile devices have done in the past. The exponential growth in mobile and cloud applications is also creating new market dynamics that force incumbent businesses to evaluate their business models and product futures.

Simply put: adapt or die.

The need to efficiently deliver differentiated cloud services at lower costs is shared by service providers and network vendors alike. The next step is to reach consensus and a shared vision on cloud Ethernet architectures and technology priorities. The industry must unite on open standards, interoperability and easy deployability of cloud services. Inaction will usher in a future shaped by proprietary technologies and disruptive new businesses that will leave incumbent vendors and telecommunications companies stranded on islands of old world infrastructure.

Industry stakeholders would do well to consider recent comparable successes – such as in the delivery of

Carrier Ethernet services – where disparate perspectives were aligned via the MEF around the goals of

widespread interoperability and Ethernet service certification. This laid the groundwork for IT vendors and

service providers to create a $50B Carrier Ethernet services market today that is still growing. The launch

of OpenCloud Connect offers an equally exciting opportunity to shape the future of cloud services.

For further information please contact OpenCloud Connect:

OCC President: James Walker, VP Managed Network Services, Tata Communications. Email: [email protected]

OCC Chairman: Jeff Schmitz, CMO, Spirent Communications. Email: [email protected]

OCC Technical co-chairs:

Sam Youn, Director Network Architecture, Equinix. Email: [email protected]

Mike Bencheck, Verizon Fellow, Verizon. Email: [email protected]

OCC Marketing co-chairs:

Doug Wills, Senior Director, Product Marketing & Product Management, Juniper Networks. Email: [email protected]

Henry Bohannon, Senior Director, Head of Ethernet Product Management, Tata Communications. Email: [email protected]

Membership enquiries:

Mark Fox

Tel: USA: +1 408 504 8665 or UK: +44 (0) 7836 248110

Email: [email protected]

OCC administrative contact: Alysia Bennett, Tel: +1 310 632 2880 Email: [email protected]

Press/Analyst Enquiries: Kate Innes Tel: +44 (0) 7825 395038 or +44 (0) 1672 550123 Email: [email protected]

OpenCloud Connect, 6033 W. Century Boulevard, Suite 1107, Los Angeles, CA 90045 USA