AWS re:Invent 2016 - Scality's Open Source AWS S3 Server

AWS re:Invent 2016 Scality’s Open Source AWS S3 Server Giorgio Regni, CTO @GiorgioRegni

Transcript of AWS re:Invent 2016 - Scality's Open Source AWS S3 Server

Page 1: AWS re:Invent 2016 - Scality's Open Source AWS S3 Server

AWS re:Invent 2016
Scality’s Open Source AWS S3 Server

Giorgio Regni, CTO
@GiorgioRegni

Page 2: AWS re:Invent 2016 - Scality's Open Source AWS S3 Server

Cloud Field Days #1 © Scality 2016

Scality RING: Automate storage for Digital Business

The Scality RING is object-based software-defined storage for the cloud.

We run on standard x86 servers and create a giant pool of storage.

We protect the data and provide 100% reliable, high-performance access for any capacity-driven application.

[Diagram labels: File, Object, OpenStack]

Page 3: AWS re:Invent 2016 - Scality's Open Source AWS S3 Server

Disrupting storage – unlimited & everywhere

Page 4: AWS re:Invent 2016 - Scality's Open Source AWS S3 Server

What is Scality AWS S3 Server?

Open source object storage server: https://github.com/scality/s3

• Written in Node.js
• Single instance running in a Docker container
• Uses Docker volumes for persistent storage
• Same code as Scality’s RING S3 interface

Page 5: AWS re:Invent 2016 - Scality's Open Source AWS S3 Server

Scality AWS S3 Server was released under an Apache-2.0 license in July -> more than 15K downloads on Docker Hub!

Page 6: AWS re:Invent 2016 - Scality's Open Source AWS S3 Server

AWS S3 adoption keeps rising

Growing demand:
• Customer & partner push for AWS S3 has swelled in the last 18 months
• AWS S3 has become the de facto interface standard

Page 7: AWS re:Invent 2016 - Scality's Open Source AWS S3 Server

Why should you use AWS S3?

Page 8: AWS re:Invent 2016 - Scality's Open Source AWS S3 Server

Object vs File

Object/S3:
• Flat data model with collections called “buckets”
• Objects are written and overwritten, not byte-wise modified
• Scales to billions of objects

File/POSIX:
• Hierarchical data model
• Files are randomly writable in byte-wise fashion
• Scales to hundreds of thousands of files
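The object side of this comparison can be illustrated with a minimal in-memory sketch. It is not the S3 Server code: `FlatObjectStore` and its method names are hypothetical, chosen only to show a flat key space per bucket and whole-object overwrites.

```javascript
// Hypothetical in-memory model of a flat object namespace.
// FlatObjectStore and its methods are illustrative names, not the S3 Server API.
class FlatObjectStore {
    constructor() {
        // bucket name -> Map(key -> Buffer); there are no directories
        this.buckets = new Map();
    }

    createBucket(name) {
        if (!this.buckets.has(name)) {
            this.buckets.set(name, new Map());
        }
    }

    // A PUT always stores the complete object. An existing key is replaced
    // as a whole, never patched at a byte offset.
    putObject(bucket, key, body) {
        this.buckets.get(bucket).set(key, Buffer.from(body));
    }

    getObject(bucket, key) {
        return this.buckets.get(bucket).get(key);
    }

    // Listing is a prefix filter over a flat key space. The "/" in a key is
    // an ordinary character, not a directory separator.
    listObjects(bucket, prefix = '') {
        return [...this.buckets.get(bucket).keys()]
            .filter(key => key.startsWith(prefix))
            .sort();
    }
}

const store = new FlatObjectStore();
store.createBucket('videos');
store.putObject('videos', '2016/reinvent/talk.mp4', 'first version');
store.putObject('videos', '2016/reinvent/talk.mp4', 'second version');
store.putObject('videos', '2016/notes.txt', 'hello');

const listing = store.listObjects('videos', '2016/reinvent/');
const current = store.getObject('videos', '2016/reinvent/talk.mp4').toString();
console.log(listing, current);
```

A POSIX file, by contrast, would allow a write at an arbitrary offset inside `talk.mp4`; here the second PUT simply replaces the first.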

Page 9: AWS re:Invent 2016 - Scality's Open Source AWS S3 Server

Why object scales better

Separation of Metadata and Data:
• Bucket listing and object locations are stored separately from the Data
• Objects can be spread out anywhere
• Direct access to data - no need to traverse a tree structure

Shared nothing:
• Consistency rules only apply to one object or bucket at a time
• Clients are stateless - Last writer wins
• No relationships exist between objects

[Diagram: SDS/Object Service, with metadata (MD) stores shown separately from Data stores]
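The separation of metadata and data described on this slide can be sketched as follows. This is an assumption-laden toy model, not Scality's implementation: `dataNodes`, `metadata`, `putObject`, and `getObject` are hypothetical names, and placement by random id is just one simple way to show that object location does not depend on the key.

```javascript
const crypto = require('crypto');

// Hypothetical data nodes: each one only maps an opaque id to bytes.
const dataNodes = [new Map(), new Map(), new Map()];

// Hypothetical metadata index: "bucket/key" -> { node, id, size }.
const metadata = new Map();

function putObject(bucket, key, body) {
    const data = Buffer.from(body);
    const id = crypto.randomBytes(8).toString('hex');
    // Placement does not depend on the key name, so objects can land anywhere.
    const node = parseInt(id.slice(0, 8), 16) % dataNodes.length;
    dataNodes[node].set(id, data);
    // Last writer wins: the metadata entry is replaced as a unit and no other
    // object is touched. Cleanup of the superseded data is omitted here.
    metadata.set(`${bucket}/${key}`, { node, id, size: data.length });
}

function getObject(bucket, key) {
    // One metadata lookup, then direct access to the data. No tree traversal.
    const entry = metadata.get(`${bucket}/${key}`);
    if (!entry) {
        return undefined;
    }
    return dataNodes[entry.node].get(entry.id);
}

putObject('videos', 'talk.mp4', 'v1');
putObject('videos', 'talk.mp4', 'v2');
const latest = getObject('videos', 'talk.mp4').toString();
console.log(latest, metadata.size);
```

Because each write touches exactly one metadata entry, consistency only has to be enforced per object, which is what lets the data nodes share nothing.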

Page 10: AWS re:Invent 2016 - Scality's Open Source AWS S3 Server

Scality AWS S3 Server architecture

[Architecture diagram. Labels: S3 API; S3 Server; Protocol Stack: Buckets, Objects, MPU (REST API); components: AUTH, BUCKETS, DATA; backends shown: LevelDB*, Scality RING v6, Scality RING v6 Vault, Kinetic IP drives, Docker Volume, Public Cloud]

* Persisted in Docker Volume

Page 11: AWS re:Invent 2016 - Scality's Open Source AWS S3 Server

Scality AWS S3 Server: From 0 to S3 in Under 5 Minutes

• Developers can install and develop S3-based apps locally
• Enterprises can host a local test/dev environment to learn about object storage
• Enterprises can host a small, local object storage system in production

[Diagram labels: S3 Server (x3), Backup Application, S3]

Page 12: AWS re:Invent 2016 - Scality's Open Source AWS S3 Server

Quick, Live AWS S3 Server Demo!

Steps:
1. Launch Kitematic UI to access Docker Hub
2. Pull the S3 Server container
3. Start the container
4. Use the Cyberduck UI to create Buckets, PUT, GET & DELETE Objects

Page 13: AWS re:Invent 2016 - Scality's Open Source AWS S3 Server

AWS S3 Server for RING - Scale-out & Performance

• Scalable Bucket Namespace with simple access point
• Scales out in simple uniform building blocks
• Optimized for low latency, high bandwidth, and fast listing
• Multi-site deployment preserves availability during site or network failure (initial support for two sites)

Why it matters:
• Supports many simultaneous clients to the same bucket, even across sites
• Simplifies access
• Linear performance scaling

[Diagram: Scalable S3 Bucket Namespace: any-to-any access, with APP A, APP B, APP C, APP D]

Page 14: AWS re:Invent 2016 - Scality's Open Source AWS S3 Server

Challenges & how to overcome them

Scality Open Source Amazon S3

Page 15: AWS re:Invent 2016 - Scality's Open Source AWS S3 Server

Logging is hard

• Challenges
  • Logging is expensive as it taxes the Node.js process
  • UDP datagrams have expensive DNS lookups
  • Redundant transformations by bunyan and bunyan-logstash

• Solution: Werelogs
  • Produces raw JSON logs along the path of least resistance
  • Forward logs to ELK using Filebeat for indexing
  • Avoids expensive and redundant transformations
  • Ability to track requests across the components with UIDs
  • Dump log history on errors

Open source -> http://github.com/scality/werelogs
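Three of the ideas above (raw JSON lines, a per-request UID, and dumping buffered history on error) can be sketched in a few lines. This is not the werelogs API: `RequestLogger`, its methods, and the `req_id` field name are hypothetical and only illustrate the technique.

```javascript
const crypto = require('crypto');

// Hypothetical request-scoped logger. This is NOT the werelogs API; the class,
// method, and field names are made up to illustrate the ideas on the slide.
class RequestLogger {
    constructor(write = line => process.stdout.write(`${line}\n`)) {
        this.write = write;
        // One UID per request, attached to every entry so a request can be
        // followed across components once the logs are indexed.
        this.uid = crypto.randomBytes(10).toString('hex');
        // Low-level entries are kept in memory instead of being written.
        this.history = [];
    }

    entry(level, message, fields) {
        return Object.assign(
            { time: Date.now(), level, req_id: this.uid, message }, fields);
    }

    debug(message, fields) {
        this.history.push(this.entry('debug', message, fields));
    }

    info(message, fields) {
        // Raw JSON, one object per line, with no further transformation.
        this.write(JSON.stringify(this.entry('info', message, fields)));
    }

    error(message, fields) {
        // On error, dump the buffered history first, then the error itself.
        for (const past of this.history) {
            this.write(JSON.stringify(past));
        }
        this.history = [];
        this.write(JSON.stringify(this.entry('error', message, fields)));
    }
}

const lines = [];
const log = new RequestLogger(line => lines.push(line));
log.debug('parsed request', { method: 'PUT' });
log.info('bucket found', { bucket: 'videos' });
log.error('data backend unreachable');
console.log(lines.join('\n'));
```

In the happy path only the `info` line is written, so the cost of debug logging is a cheap in-memory push. The JSON lines can be written to a file and shipped by a separate process, which keeps formatting and network work out of the Node.js event loop.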

Page 16: AWS re:Invent 2016 - Scality's Open Source AWS S3 Server

Performance, performance & performance

Page 17: AWS re:Invent 2016 - Scality's Open Source AWS S3 Server

The performance cycle: Code, Benchmark, … Repeat

• Socket & Nagle algorithm on by default -> very high latencies
• The event loop can get backed up quickly -> hunt for all CPU-intensive tasks in the main loop
• Buffers are much more efficient when writing server response
• Micro optimizations: Date.now() > new Date()
• Beware of libraries doing way too many things for you
• ES6 support, Babel5 was killing performance -> Babel6
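A minimal sketch of how the first few points might look in a Node.js HTTP server follows. The helper names `buildBody` and `createServer`, the response content, and the port in the usage note are assumptions for illustration; this is not taken from the S3 Server source.

```javascript
const http = require('http');

// Assemble the whole response body as a single Buffer so it can be sent in
// one write instead of many small string writes.
function buildBody(parts) {
    return Buffer.concat(parts.map(part => Buffer.from(part)));
}

function createServer() {
    const server = http.createServer((req, res) => {
        // Date.now() returns a plain number and allocates no Date object.
        const start = Date.now();
        const body = buildBody(['<Result>', 'ok', '</Result>']);
        res.writeHead(200, {
            'Content-Type': 'application/xml',
            'Content-Length': body.length,
        });
        res.end(body);
        res.on('finish', () => {
            console.log(`${req.method} ${req.url} ${Date.now() - start}ms`);
        });
    });
    // Disable Nagle's algorithm on every accepted socket so small responses
    // are not held back waiting to be coalesced.
    server.on('connection', socket => socket.setNoDelay(true));
    return server;
}

const server = createServer();
```

To try it, call `server.listen(8000)` (the port is an arbitrary choice here) and benchmark with and without the `setNoDelay` line, in the spirit of the code, benchmark, repeat cycle.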

Page 18: AWS re:Invent 2016 - Scality's Open Source AWS S3 Server
Page 19: AWS re:Invent 2016 - Scality's Open Source AWS S3 Server

Download Scality AWS S3 Server!

http://s3.scality.com/ & Come to our booth #420

Page 20: AWS re:Invent 2016 - Scality's Open Source AWS S3 Server

Customer use cases

Scality Open Source Amazon S3

Page 21: AWS re:Invent 2016 - Scality's Open Source AWS S3 Server

VOD/Live Streaming: Deluxe

RING was deployed as an origin server to store and distribute all of Deluxe OnDemand’s transcoded titles (including Theatrical and TV) to large cable providers, retailers, and end consumers.

The initial deployment was 1.5PB, to which the customer expects to add 10-15PB for different applications.

Highest valued features:
• Performance – serves video to millions of users at 100Gb/s.
• Hardware-freedom – allows for competitive hardware pricing, and utilization of newer disks and hardware over time to stay in the sweet spot of cost.
• Scalability – meet 10X expansion with no data migration

About the customer: Deluxe OnDemand is a cloud-based multiscreen VOD catalog service that simplifies the access and delivery of content to any device.

What was the challenge? Customer needed to transcode and store up to 10TB of content daily. Needed a platform that could scale to petabytes and remain cost-effective.

Video Testimonial: https://vimeo.com/134065438

Page 22: AWS re:Invent 2016 - Scality's Open Source AWS S3 Server

Web & Cloud: Daisy Group (fka Phoenix IT)

GEO-stretched RING across 3 UK-based data centers. Highly resilient configuration. Running on HPE SL4540 servers with DL360 servers used for connectors.

Highest valued features:
• Reliability – Reduce risk with data resiliency
• Scalability – Easily grow to 1 PB and beyond
• Application support – Integration with archiving app.
• Future proof – Can use the same platform to host other services

Video Testimonial: https://vimeo.com/131127269

About the customer: Leading provider of business continuity and managed services in the UK.

What was the challenge? Launching new Archive as a Service. Needed the solution to be significantly lower cost than existing infrastructure.

Page 23: AWS re:Invent 2016 - Scality's Open Source AWS S3 Server

Long Term Archive: AB TV France

AB TV deployed an HP SL4540-based 700TB active/passive RING at two fiber-connected data centers. Initially, they migrated their 700TB LTO archive to the RING and recently added 300TB on both sites to grow to a PB. They are currently using SGT for Media Asset Management.

Highest valued features:
• Geo-distribution and 100% reliability – second site for disaster recovery ensures data availability
• Performance – Superior to LTO-based access performance; simplified management and automation.
• Scalability – ability to meet future storage needs

About the customer: The AB Group, a French broadcasting company, produces content for both TV and the Web. The company owns 14 French language channels such as AB Moteurs, RTL9 and Ciné FX.

What was the challenge? First, AB TV needed to migrate its tape-based assets. In addition, they were frustrated with their hardware-dependent RAID system.

Page 24: AWS re:Invent 2016 - Scality's Open Source AWS S3 Server

Download Scality AWS S3 Server!

http://s3.scality.com/ & Come to our booth #420