Mellanox Storage Solutions


Description: Presented by Yaron Haviv, VP Datacenter Solutions, Mellanox, at VMworld 2013, San Francisco, CA.

Transcript of Mellanox Storage Solutions

Page 1: Mellanox Storage Solutions

Mellanox Storage Solutions

Yaron Haviv, VP Datacenter Solutions

VMworld 2013 – San Francisco, CA

Page 2: Mellanox Storage Solutions


Maximizing Data Center Return on Investment

97% reduction in database recovery time

• Case: Tier-1 Fortune 100 Web 2.0 company

Database performance improved up to 10X

• Cases: Oracle, Teradata, IBM, Microsoft

Big Data needs big pipes

• High-throughput, low-latency server and storage interconnect

2X faster data analytics = expose your data value!

• With Mellanox 56Gb/s RDMA interconnect solutions

3x more VMs per physical server, 33% lower application cost

• Cases: Microsoft, Oracle, Atlantic, ProfitBricks and more

Consolidation of network and storage I/O for lower OPEX

More than 10X higher storage performance

Millions of IOPS

60% Lower TCO, 50% Lower CAPEX

From 7 Days to 4 Hours!

3X the Virtual Machines at 33% Lower Cost

Page 3: Mellanox Storage Solutions


SSD Adoption Grows Significantly, Driving the Need for Faster I/O

Source: IT Brand Pulse

SSDs are 100x Faster, Require Faster Networks and RDMA

Page 4: Mellanox Storage Solutions


The Storage Delivery Bottleneck

Diagram: a server with 24 x 2.5" SATA 3 SSDs adds up to roughly 12 GB/s of storage bandwidth. Delivering that to the network takes 15 x 8Gb/s Fibre Channel ports, OR 10 x 10Gb/s iSCSI ports (with offload), OR just 2 x 40-56Gb/s IB/Eth ports (with RDMA).

SSD & Flash Mandate High-Speed Interconnect
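The arithmetic behind the diagram is worth making explicit. A quick sketch (assuming roughly 500 MB/s per SATA 3 SSD and the usual usable payload rates per port type; both are assumptions consistent with the deck, not numbers it states):

```python
import math

# Aggregate bandwidth of the SSD shelf (assuming ~500 MB/s per SATA 3 SSD).
ssd_count = 24
ssd_mb_per_s = 500
server_gb_per_s = ssd_count * ssd_mb_per_s / 1000
print(f"Server storage bandwidth: ~{server_gb_per_s:.0f} GB/s")  # ~12 GB/s

# Approximate usable payload per port, in GB/s, after encoding overhead.
port_gb_per_s = {
    "8Gb/s Fibre Channel":     0.8,
    "10Gb/s iSCSI":            1.25,
    "56Gb/s IB/Eth with RDMA": 7.0,
}
for name, bw in port_gb_per_s.items():
    ports = math.ceil(server_gb_per_s / bw)
    print(f"{name}: {ports} ports")
# -> 15 x FC, 10 x 10GbE iSCSI, 2 x 56Gb/s, matching the slide's port counts
```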

Page 5: Mellanox Storage Solutions


Solving the Storage (Synchronous) IOPs Bottleneck With RDMA

Chart: round-trip latency, broken into software, network, and disk components, and the synchronous (back-to-back) IOPS each stage of the evolution allows:

Configuration                                   Approx. latency   Synchronous IOPS
The Old Days (disk ~6,000 µs)                   ~6 msec           180
With SSDs (disk drops to ~25 µs)                ~0.5 msec         3,000
With Fast Network (network ~10-20 µs)           ~0.2 msec         4,300
With RDMA (network ~1 µs)                       ~0.05 msec        20,000
With Full OS Bypass & Cache (software ~3-6 µs)  ~0.007 msec       100,000
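The chart's arithmetic is simple: a synchronous workload keeps only one I/O in flight, so the rate is the reciprocal of the round-trip latency. A minimal sketch (latency figures approximated from the slide):

```python
# Synchronous (back-to-back) I/O keeps one request in flight at a time,
# so the achievable rate is the reciprocal of the round-trip latency.
stages_us = {                      # latencies approximated from the slide
    "The Old Days":                 6000,
    "With SSDs":                     330,
    "With Fast Network":             230,
    "With RDMA":                      50,
    "With Full OS Bypass & Cache":    10,
}
for name, latency_us in stages_us.items():
    iops = 1_000_000 / latency_us  # one second divided by one round trip
    print(f"{name:30s} ~{latency_us:5d} us -> ~{iops:,.0f} IOPS")
# The progression tracks the slide: ~180, ~3,000, ~4,300, 20,000, 100,000
```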

Page 6: Mellanox Storage Solutions


Compare Interconnect Technologies

                       FC        10GbE/TCP  10GbE/RoCE  40GbE/RoCE  InfiniBand
Bandwidth [GB/s]       0.8/1.6   1.25       1.25        5           7
$/GBps [NIC/Switch]*   500/500   200/150    200/150     120/90      80/50

Technology features compared: credit-based (lossless) flow control**, built-in L2 multi-path, and latency. Storage application/protocol support compared: Block, NAS and Object, Big Data (Hadoop), Storage Backplane/Clustering, and Messaging.

* Based on Google Product Search
** Mellanox end-to-end can be configured as truly lossless
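Where the bandwidth row comes from: payload rate is the raw link rate minus line-coding overhead. A quick sketch of the conversion (the signaling rates and encoding factors below are the standard ones for each link type, not numbers stated on the slide):

```python
# Payload bandwidth ~= signaling rate x line-code efficiency / 8 bits per byte.
links = [
    # (name, signaling rate in Gb/s, line-code efficiency)
    ("8Gb Fibre Channel", 8.5,     8 / 10),   # 8b/10b encoding
    ("10GbE",             10.3125, 64 / 66),  # 64b/66b encoding
    ("40GbE",             41.25,   64 / 66),
    ("FDR InfiniBand",    56.25,   64 / 66),
]
for name, rate_gbps, efficiency in links:
    gb_per_s = rate_gbps * efficiency / 8
    print(f"{name:18s} ~{gb_per_s:.2f} GB/s")
# ~0.85, ~1.25, ~5.00, ~6.82 GB/s: in line with the 0.8 / 1.25 / 5 / 7 row
```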

Page 7: Mellanox Storage Solutions


Standard iSCSI over TCP/IP

iSCSI’s main performance deficiencies stem from TCP/IP:

• TCP is a complex protocol requiring significant processing

• Stream based, making it hard to separate data and headers

• Requires copies that increase latency and CPU overhead

• Weak TCP checksums require additional CRCs (digests) in the ULP

Diagram: the iSCSI PDU, carried inside TCP/IP protocol frames, consists of the Basic Header Segment (BHS), an optional Additional Header Segment (AHS), an optional Header Digest (HD), the Data, and an optional Data Digest (DD).
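For concreteness, here is a minimal sketch of parsing the fixed 48-byte Basic Header Segment, the only mandatory part of every PDU (field offsets per RFC 3720; illustrative only, not a working iSCSI stack). Locating the optional digests and the data requires first decoding these length fields out of the TCP byte stream, which is exactly the header/data separation burden described above:

```python
import struct

def parse_bhs(bhs: bytes) -> dict:
    """Parse the mandatory 48-byte iSCSI Basic Header Segment (RFC 3720)."""
    assert len(bhs) == 48
    opcode = bhs[0] & 0x3F                         # low 6 bits of byte 0
    immediate = bool(bhs[0] & 0x40)                # 'I' (immediate) flag
    ahs_words = bhs[4]                             # AHS length, 4-byte words
    data_len = int.from_bytes(bhs[5:8], "big")     # 24-bit DataSegmentLength
    task_tag = struct.unpack_from(">I", bhs, 16)[0]
    return {"opcode": hex(opcode), "immediate": immediate,
            "ahs_bytes": ahs_words * 4, "data_bytes": data_len,
            "initiator_task_tag": task_tag}

# Example: a zeroed BHS carrying a SCSI Command opcode (0x01) and 4 KB of data.
hdr = bytearray(48)
hdr[0] = 0x01
hdr[5:8] = (4096).to_bytes(3, "big")
print(parse_bhs(bytes(hdr)))
```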

Page 8: Mellanox Storage Solutions


iSCSI Mapping to iSER / RDMA Transport

iSER eliminates these bottlenecks through:

• Zero copy using RDMA

• CRC calculated by hardware

• Working with message boundaries instead of byte streams

• Transport protocol implemented in hardware (minimal CPU cycles per IO)

Enabling unparalleled performance

Diagram: the same iSCSI PDU mapped onto RDMA protocol frames. Command PDUs (BHS/AHS) travel in RC Send messages, while data moves via RC RDMA Read/Write; the Header Digest and Data Digest are crossed out because CRC protection is done in hardware.

Page 9: Mellanox Storage Solutions


iSER protocol overview (Read)

SCSI Reads

• Initiator sends a Command PDU (Protocol Data Unit) to the Target

• Target returns data using RDMA Write

• Target sends a Response PDU back when the transaction completes

• Initiator receives the Response and completes the SCSI operation

Sequence diagram, between the iSCSI Initiator, the iSER layer and HCA on each side, the iSER Target, and Target Storage:

1. Initiator: Send_Control (SCSI Read Cmd) with buffer advertisement; Control_Notify is raised at the target.

2. Target: Data_Put (Data-In PDU) for the Read, executed as an RDMA Write of the data into the initiator's advertised buffer.

3. Target: Send_Control (SCSI Response); Control_Notify is raised at the initiator, which completes the operation.
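A toy model of this exchange is sketched below: Send_Control carries the PDUs, while Data_Put becomes an RDMA Write straight into the buffer the initiator advertised (here the advertisement is reduced to passing a Python buffer reference, standing in for the STag). All names are illustrative, not a real iSER API:

```python
# Toy model of an iSER SCSI Read. Control PDUs travel as Sends; read data
# is placed by RDMA Write directly into the initiator's advertised buffer.
class Target:
    def __init__(self, storage: dict):
        self.storage = storage                  # lba -> bytes

    def on_command(self, cmd: dict, remote_buf: bytearray, respond):
        """Send_Control received (Control_Notify): serve the read."""
        data = self.storage[cmd["lba"]]
        remote_buf[: len(data)] = data          # Data_Put -> RDMA Write
        respond("GOOD")                         # Send_Control (SCSI Response)

class Initiator:
    def __init__(self):
        self.buffer = bytearray(4096)           # advertised read buffer

    def read(self, target: Target, lba: int):
        # Send_Control: SCSI Read Cmd plus buffer advertisement ("STag").
        target.on_command({"op": "READ", "lba": lba}, self.buffer,
                          self.on_response)

    def on_response(self, status: str):
        # Control_Notify: the response arrived; data is already in place.
        print("SCSI Read complete:", status, bytes(self.buffer[:16]))

target = Target({0: b"hello from LBA 0".ljust(4096, b"\0")})
Initiator().read(target, 0)
```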

Page 10: Mellanox Storage Solutions


Mellanox Unbeatable Storage Performance

Chart: IO latency at 4K IO size [microsec], iSCSI/TCP vs. iSCSI/RDMA. iSCSI/RDMA shows 5-10% the latency under 20x the workload: 2,300K IOPs, versus iSCSI/TCP at only 131K IOPs.

Chart: KIOPs at 4K IO size, per transport configuration:

Transport                                  KIOPs
iSCSI (TCP/IP)                             130
1 x FC 8Gb port                            200
4 x FC 8Gb port                            800
iSER 1 x 40GbE/IB port                     1,100
iSER 2 x 40GbE/IB port (+Acceleration)     2,300

We Deliver Significantly Faster IO Rates and Lower Access Times!

Page 11: Mellanox Storage Solutions


Accelerating IO Performance (Accessing a Single LUN)

Charts: bandwidth [MB/s] and IOPs [K/s] versus IO size [1-256 KB] on a single LUN with 3 threads, comparing iSCSI/TCP, iSER std, and iSER BD (bypass). Both iSER configurations run into the PCIe limit. Peak rates: only 80K IOPs for iSCSI/TCP, 235K IOPs for iSER std, and 624K IOPs for iSER BD (bypass).

Charts: IO latency [microseconds] and % of CPU in I/O wait at 4KB IO size and max IOPs, for Read, Write, and mixed R/W, across the same three configurations.

Page 12: Mellanox Storage Solutions


Mellanox & LSI Accelerate VDI, Enable 2.5x More VMs

Mellanox & LSI address the critical storage latency and IOPs bottlenecks in Virtual Desktop Infrastructure (VDI):

• LSI Nytro MegaRAID accelerates disk access through SSD-based caching

• Mellanox ConnectX®-3 10/40GbE adapters with RDMA accelerate access from hypervisors to fast shared storage over Ethernet, and enable zero-overhead replication

When tested with Login VSI's VDI load generator, the solution delivered an unprecedented VM density of 150 VMs per ESX server:

• Using iSCSI/RDMA (iSER) enabled 2.5x more VMs than iSCSI over TCP/IP on the exact same setup

Diagram: benchmark configuration. A redundant storage cluster (primary and secondary nodes), each running an iSCSI/RDMA (iSER) target over Software RAID (MD) on an LSI caching Flash/RAID controller, with replication between the nodes through a Mellanox SX1012 10/40GbE switch. Each node: 2 x Xeon E5-2650 processors, Mellanox ConnectX®-3 Pro 40GbE/RoCE, LSI Nytro MegaRAID NMR 8110-4i.

Chart: number of Virtual Desktop VMs per server for Intel 10GbE with iSCSI/TCP, ConnectX3 10GbE with iSCSI/RDMA (iSER), and ConnectX3 40GbE with iSCSI/RDMA (iSER): 2.5x more VMs with iSER.

Page 13: Mellanox Storage Solutions


Native Integration Into OpenStack Cinder

Using OpenStack's built-in components and management (Open-iSCSI, the tgt target, Cinder), no additional software is required; RDMA is already inbox and used by our OpenStack customers!

Mellanox enables faster performance with much lower CPU utilization. The next step is to bypass the hypervisor layers and to add NAS & object storage.

Diagram: OpenStack (Cinder) provisions volumes across the switching fabric, using RDMA to accelerate iSCSI storage. Compute servers run the KVM hypervisor with guest VMs and Open-iSCSI with iSER on the adapter; storage servers run the iSCSI/iSER target (tgt) over local disks with an RDMA cache.

Chart: write bandwidth [MB/s] versus I/O size [1-256 KB], with 4, 8, and 16 VMs over iSER and 8 and 16 VMs over iSCSI/TCP. iSER reaches the PCIe limit and delivers up to 6X the bandwidth of iSCSI/TCP.
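To make "no additional software" concrete, below is a minimal sketch of the cinder.conf settings involved, generated with Python's configparser. The option names (volume_driver, iscsi_helper, iscsi_protocol = iser) match how the LVM/tgt backend was commonly documented in OpenStack releases of this era; treat them as assumptions, since option names changed across releases:

```python
import configparser

# Sketch of the cinder.conf fragment that switches Cinder's LVM/tgt backend
# from plain iSCSI to iSER (iSCSI over RDMA). Option names as documented for
# OpenStack releases of this era; verify against your release's docs.
conf = configparser.ConfigParser()
conf["DEFAULT"] = {"enabled_backends": "lvm-iser"}
conf["lvm-iser"] = {
    "volume_driver": "cinder.volume.drivers.lvm.LVMVolumeDriver",
    "volume_group": "cinder-volumes",
    "iscsi_helper": "tgtadm",          # the built-in tgt target
    "iscsi_protocol": "iser",          # RDMA transport instead of TCP
    "iscsi_ip_address": "192.0.2.10",  # hypothetical storage-server address
}
with open("cinder.conf.sample", "w") as f:
    conf.write(f)
# Compute nodes need nothing beyond inbox Open-iSCSI: logging in with the
# iSER transport (iscsiadm ... -I iser) makes the initiator use RDMA too.
```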