Best practices for using flash in hyperscale software storage architectures

13
Best practices for using flash in hyperscale software storage architectures Brandon Hoang Solutions Architect Flash Memory Summit 2015 Santa Clara, CA 1

Transcript of Best practices for using flash in hyperscale software storage architectures

Page 1: Best practices for using flash in hyperscale software storage architectures

Best practices for using flash in hyperscale software storage architectures

Brandon HoangSolutions Architect

Flash Memory Summit 2015Santa Clara, CA 1

Page 2: Best practices for using flash in hyperscale software storage architectures

Flash Memory Summit 2015Santa Clara, CA 2

Page 3: Best practices for using flash in hyperscale software storage architectures

Software-Defined Storage (SDS)

Flash Memory Summit 2015Santa Clara, CA 3

Software Software-Defined Storage

+ =

Double-digit growthA $2.8 billion market by 2017

Commodity Servers

IDC 2015

Page 4: Best practices for using flash in hyperscale software storage architectures

Who is Hedvig?

Flash Memory Summit 2015Santa Clara, CA 4

• Founded in 2012 by Avinash Lakshman – Co-inventor of Amazon Dynamo

and inventor of Apache Cassandra

• Develop the Hedvig Distributed Storage Platform – A software-defined storage solution

Page 5: Best practices for using flash in hyperscale software storage architectures

Anatomy of a distributed, hyperscale storage system

Each node hosts:• Metadata• Data

Flash Memory Summit 2015Santa Clara, CA 5

Hosts access storage tier via storage proxies

Scale-out storage tier with

commodity servers

Applicationtier

Page 6: Best practices for using flash in hyperscale software storage architectures

Taking advantage of flash w/ SDS

Flash Memory Summit 2015Santa Clara, CA 6

• At the storage server node– Store metadata on SSD: fast lookups and tracking– Write-optimization: sequentialize random I/O– Auto-tier and cache active data on SSDs: speed access

to hot data and buffer HDD capacity tier– Provision volumes on “all-flash” persistent storage (aka

“pin to flash”): dedicated, consistent performance for latency sensitive apps

• At the application host– Cache hot data on local SSD or PCIe flash to

accelerate access and avoid network latencies

Page 7: Best practices for using flash in hyperscale software storage architectures

Biggest flash benefit to hyperscale: Sequentializing random I/O

Flash Memory Summit 2015Santa Clara, CA 7

Application writes data in random blocks, and gets immediate ack from cluster

1

Storage cluster sequentializes incoming blocks (in RAM+SSD) into larger chunks

2

Larger sequentialized data chunks written to underlying disks according to policy

3

Hyperscale client

Hyperscale nodes

Application Server

Page 8: Best practices for using flash in hyperscale software storage architectures

Three ways flash is used in hyperscale systems

Read/write cache on storage nodes

“Pin to flash” dedicated primary storage volume Client side read cache

Page 9: Best practices for using flash in hyperscale software storage architectures

Option #1: Node OS storage

Read/write cache on storage nodes

“Pin to flash” dedicated primary storage volume Client side read cache

Type of flash:SLC/MLC SSDs or PCIe Flash

Use of flash:-- Store metadata for fastoperations – dedupe, compression, snaps, clones-- Autotiering to ensure hot data is migrated to flash-- Write logs for metadata and data

Typical configuration:2x 300GB MLC SAS/SATA SSDs

Page 10: Best practices for using flash in hyperscale software storage architectures

Option #2: Node volume storage

Read/write cache on storage nodes

“Pin to flash” dedicated primary storage volume Client side read cache

Type of flash:SLC/MLC SSDs or PCIe Flash

Use of flash:-- All-flash virtual volumesfor dedicated, consistentperformance on a per-appbasis-- Flash performance for read and write operations

Typical configuration:2x 800GB MLC SAS/SATA SSDs

Page 11: Best practices for using flash in hyperscale software storage architectures

Option #3: Client-side cache

Read/write cache on storage nodes

“Pin to flash” dedicated primary storage volume Client side read cache

Type of flash:SLC/MLC SSDs or PCIe Flash

Use of flash:-- Write-through cache to store hot blocks-- Local metadata storage

Typical configuration:800GB MLC SAS/SATA SSDs

Typical application server can deliver 65K IOPs vs. 15K

without flash

Page 12: Best practices for using flash in hyperscale software storage architectures

Results: Law Firm• Challenge:

– Needed quick, reliable indexing and lookups of massive 100 million active client legal docs

– Traditional NAS underperformed required access time – Standalone servers with flash performed well, but

predictably ran out of space• Solution/Result:

– Hedvig software-defined storage with SSD/HDD and client-side flash caching

– ~9x faster performance with flash– Scale-out architecture simplifies growth and

expansion

Flash Memory Summit 2015Santa Clara, CA 12

Page 13: Best practices for using flash in hyperscale software storage architectures

Flash Memory Summit 2015Santa Clara, CA 13

Thank you!

For more on Hedvig visit: • Web: hedviginc.com• Twitter: @hedviginc