Architecting An Enterprise Storage Platform Using Object Stores
© mekuria getinet / www.mekuriageti.net
Niraj Tolia
Chief Architect, Maginatics
@nirajtolia
A Whirlwind Tour
Awesome Questions == Awesome T-shirts
80% YoY Growth in Unstructured Data
41% Growth in IaaS Systems through 2016
Sources:
Gartner, IT Market Clock for Storage, Sep 2011
Gartner, Forecast Overview: Public Cloud Services, Worldwide, 2011-2016, Feb 2013
MagFS – The File System for the Cloud
Consistent, Elastic, Secure, Mobile-Enabled
Layered on Object Stores
“Software-Defined”
No (Initial) Legacy Support (NFS/CIFS)
Native Clients: Push Intelligence to Edges
Strong Consistency w/ Full-Spectrum Caching
File System Design Goals
Low Cost, High Scale
Intelligent Clients
Span Devices and Networks
Support Rapid Iteration
Use Cases
In-Cloud File System
NAS Replacement and Consolidation
Enterprise File Sharing
10,000 Foot View
Object Storage (public, on-premises, or hybrid)
Data
Metadata
Metadata Servers
Clients
Koukouvaya / flickr.com/photos/jackoughton/6535137981/
Architecture
Heavy (Data) Lifting via Clients
• Encryption
• Inline Deduplication
• Compression
• Persistent Data Caching
• Bulk Data Transfers
Cloud Object Storage
• Scale Out, Low Cost
• Handles Placement + Replication
• Tolerates Failures
• High Aggregate Performance
Virtualized Metadata Servers
• Enforce Strong Consistency
• Enforce Authentication and Integrity
• Runtime Performance Optimization
• Share-level Deduplication
• Data Scrubbing & Garbage Collection
Client Architecture
[Diagram: client software stack, spanning Userspace and Kernel]
Application
Redirector (e.g., FUSE)
File System
OS Glue
Data Manager
Metadata Transport Layer
Local, Remote
Deduplication, Encryption, Compression
Locking, Leases
Simplified Write: Deduplication + Encryption
[Diagram: write path through the File System Layer and the Data Manager]
Write Request
Plaintext
Variable-Length Chunking
Encryption Key (K): SHA-256
Encrypted Text (E): AES-256 (K)
Object Name (N): SHA-256
File System Layer: <File, Offset, N, K>, Optional(<URI>)
Local Cache: <N, E>
Remote Transfer: <URI, E>, <E>
No Encryption Keys in the Cloud
No Encryption Keys in Local Cache
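The write path above can be sketched in a few lines. This is a minimal illustration, not the MagFS implementation. It assumes that K is the SHA-256 of the plaintext chunk and that N is the SHA-256 of the encrypted text; the slide labels both as SHA-256 but does not spell out the inputs. `toy_cipher` is a hypothetical stand-in for AES-256 so that the sketch runs on the standard library alone, chunking is omitted, and the remote store is modeled as a dictionary keyed by N rather than by URI.

```python
import hashlib
from typing import Callable, NamedTuple


class SealedChunk(NamedTuple):
    name: str          # N: hex SHA-256, assumed here to cover the ciphertext
    key: bytes         # K: 32 bytes, assumed here to be SHA-256 of the plaintext
    ciphertext: bytes  # E: encrypted text


def toy_cipher(key: bytes, data: bytes) -> bytes:
    """Hypothetical stand-in for AES-256. NOT secure and NOT AES.

    XORs the data with a keystream derived from SHA-256, so applying it
    twice with the same key returns the original bytes.
    """
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def seal_chunk(plaintext: bytes,
               encrypt: Callable[[bytes, bytes], bytes] = toy_cipher) -> SealedChunk:
    key = hashlib.sha256(plaintext).digest()       # K
    ciphertext = encrypt(key, plaintext)           # E
    name = hashlib.sha256(ciphertext).hexdigest()  # N
    return SealedChunk(name, key, ciphertext)


def write_chunk(file: str, offset: int, plaintext: bytes,
                local_cache: dict, metadata: list, remote: dict) -> str:
    chunk = seal_chunk(plaintext)
    metadata.append((file, offset, chunk.name, chunk.key))  # <File, Offset, N, K>
    local_cache[chunk.name] = chunk.ciphertext              # <N, E>, no key stored
    if chunk.name not in remote:                            # dedup: skip known objects
        remote[chunk.name] = chunk.ciphertext               # only E leaves the client
    return chunk.name
```

Because the key and the name are both derived from content, two clients writing the same chunk produce the same object name, which is what makes deduplication of encrypted data possible.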
Simplified Read: Deduplication + Encryption
[Diagram: read path through the File System Layer and the Data Manager]
Read Request: <File, Offset, Range>
File System Layer: <N, K, URI>
Local Cache: <N, URI>, <E>
Remote Transfer: <URI>, <E>
Encryption Key (K)
Encrypted Text (E)
AES-256 (K)
Plaintext
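The read path reverses the write path: look up the encrypted text by name in the local cache, fall back to a remote fetch by URI, then decrypt with K. The sketch below is illustrative only. `toy_cipher` is a hypothetical stand-in for AES-256, `fetch` is a hypothetical callable that returns an object's bytes for a URI, and the check of the fetched bytes against N is an assumption (it presumes N is the SHA-256 of the encrypted text), not something the slide shows.

```python
import hashlib
from typing import Callable, Dict


def toy_cipher(key: bytes, data: bytes) -> bytes:
    """Hypothetical stand-in for AES-256. NOT secure and NOT AES."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def read_chunk(name: str, key: bytes, uri: str,
               local_cache: Dict[str, bytes],
               fetch: Callable[[str], bytes],
               decrypt: Callable[[bytes, bytes], bytes] = toy_cipher) -> bytes:
    ciphertext = local_cache.get(name)       # cache holds <N, E> only
    if ciphertext is None:
        ciphertext = fetch(uri)              # remote transfer: <URI> -> <E>
        # Assumed integrity check: the name is the hash of the encrypted text.
        if hashlib.sha256(ciphertext).hexdigest() != name:
            raise ValueError("object does not match its content-based name")
        local_cache[name] = ciphertext
    return decrypt(key, ciphertext)          # K never touches the cache or cloud
```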
The Client in Real Life Does a Lot More!
• File and Directory Leases (data and metadata caching)
• Asynchronous Operations (including writes)
• Operation Compounding
• Runtime Optimizations (e.g., read ahead)
• Optimizing for High Bandwidth Delay Product (BDP)
• …
Communication Details
Object Storage (public, on-premises, or hybrid)
Data
Metadata
Metadata Servers
Clients
Thrift (HTTPS)
REST (HTTPS)
Server Architecture
Metadata Server Internals
[Diagram: metadata server components]
Legend: Production, Development
Metadata Storage Layer
Storage Core
Backups
GC
Scrubbing
Quotas, Dedup, Leases, Security
HA
MagFS
Ext. Sharing
Multi-Cloud, Versioning, Offline Mode
Cloud Abstraction Layer
Bootstrapping: Virtualized Namespaces
Legacy: \\server.example.com\share
HOST FQDN, FOLDER
MagFS: \\server.example.com\share
Dynamic mapping to host:port
[Diagram: discovery service]
Discovery Service
Metadata Server
Metadata Server (HA)
Metadata Server
ZooKeeper, ZooKeeper, ZooKeeper
Monitoring
Management Console
Config + Scheduler
Virtual Filer Host:Port Mapping
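The dynamic mapping can be pictured as a lookup from the host part of a UNC-style path to whichever host:port currently serves that virtual filer. The sketch below is a guess at the shape of such a lookup, not the actual discovery service: `DiscoveryService`, `register`, and `resolve` are hypothetical names, and an in-memory dictionary stands in for whatever coordination store (the slide shows ZooKeeper) holds the mapping.

```python
from typing import Dict, Tuple


def parse_unc(path: str) -> Tuple[str, str]:
    r"""Split a \\host\share path into (host, share)."""
    if not path.startswith("\\\\"):
        raise ValueError("not a UNC path")
    host, _, share = path[2:].partition("\\")
    if not host or not share:
        raise ValueError("UNC path needs a host and a share")
    return host.lower(), share


class DiscoveryService:
    """Hypothetical in-memory stand-in for the virtual filer mapping."""

    def __init__(self) -> None:
        self._mapping: Dict[str, Tuple[str, int]] = {}

    def register(self, virtual_filer: str, host: str, port: int) -> None:
        # Re-registering the same name repoints it, e.g. after a failover.
        self._mapping[virtual_filer.lower()] = (host, port)

    def resolve(self, unc_path: str) -> Tuple[str, int, str]:
        virtual_filer, share = parse_unc(unc_path)
        host, port = self._mapping[virtual_filer]
        return host, port, share
```

The point of the indirection is that the name users type stays fixed while the metadata server behind it can move.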
Leases: Performance and Strong Consistency
Lease Types: Read, Write, Handle
Lease States: Read; Read + Handle; Read + Write + Handle
Valid File Leases
Valid Directory Leases
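The three lease types combine into the lease states listed above, which maps naturally onto a flag enumeration. The sketch below is only a model of that list. The `can_coexist` rule (a state that includes Write is exclusive, read-only states can be shared) is an assumption about how strong consistency would be enforced; the slide does not state the conflict rules, and it does not say which states apply to directories.

```python
from enum import Flag, auto


class Lease(Flag):
    READ = auto()
    WRITE = auto()
    HANDLE = auto()


# The three lease states named on the slide.
LEASE_STATES = (
    Lease.READ,
    Lease.READ | Lease.HANDLE,
    Lease.READ | Lease.WRITE | Lease.HANDLE,
)


def is_valid_state(lease: Lease) -> bool:
    return lease in LEASE_STATES


def can_coexist(held: Lease, requested: Lease) -> bool:
    """Assumed rule: any state that includes WRITE must be exclusive."""
    return not ((held | requested) & Lease.WRITE)
```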
Cloud Storage Interaction
Object Storage (public, on-premises, or hybrid)
Object Storage systems are like snowflakes!
Object Store API Compatibility
Q: Has anyone come across a near 100% Amazon S3 API compatible object storage system?
A: It is hard to find a near-100% compatible product…
- Vendor w/ S3 Compatible Product
Direct Client Access: Security Problem?
Object Storage (public, on-premises, or hybrid)
Data
Metadata
Metadata Servers
Clients
Request Signing
Server-Driven Request Signing
SignString = HTTP-Verb + "\n"
+ Content-MD5 + "\n"
+ Content-Type + "\n"
+ Date + "\n"
+ Resource + "\n"
+ ...
Server-Driven Request Signing
SignString = PUT + "\n"
+ 07BzhNET7exJ6qYjitX/AA== + "\n"
+ image/jpeg + "\n"
+ Tue, 11 Jun 2013 00:27:41 + "\n"
+ /container/example.jpeg + "\n"
+ ...
Signature = Base64(HMAC-SHA1(Key, SignString))
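The signing step can be reproduced with Python's standard `hmac`, `hashlib`, and `base64` modules. This is a minimal sketch that follows the layout shown on the slide; whatever the trailing "..." stands for is left out, and the exact canonical string differs between object stores. `build_sign_string` and `sign` are hypothetical helper names, and the key in the usage example is a placeholder. In a server-driven scheme the key would stay on the metadata server, which hands only the resulting signature to the client.

```python
import base64
import hashlib
import hmac


def build_sign_string(verb: str, content_md5: str, content_type: str,
                      date: str, resource: str) -> str:
    # Each field is followed by a newline, as on the slide.
    fields = (verb, content_md5, content_type, date, resource)
    return "".join(field + "\n" for field in fields)


def sign(secret_key: bytes, sign_string: str) -> str:
    digest = hmac.new(secret_key, sign_string.encode("utf-8"), hashlib.sha1).digest()
    return base64.b64encode(digest).decode("ascii")


if __name__ == "__main__":
    s = build_sign_string("PUT", "07BzhNET7exJ6qYjitX/AA==", "image/jpeg",
                          "Tue, 11 Jun 2013 00:27:41", "/container/example.jpeg")
    print(sign(b"placeholder-secret-key", s))
```

Because the signature covers the verb, the content hash, and the resource, a client holding it can perform only that one request against that one object.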
Safe Direct Client Access via Request Signing
Object Storage (public, on-premises, or hybrid)
Data
Metadata
Metadata Servers
Clients
1. Read/Write Request
2. HTTP Request + Signature
3. HTTP Request + Signature + Encrypted Data
Dealing with Lost Client Writes
• Clients can lose connectivity or, in the worst case, be malicious
• Naïvely trusting client writes can “corrupt” w/ global dedup
• MagFS server scrubs all writes:
  • Client acknowledges write
  • Server verifies object existence (object store performed MD5 at PUT)
  • Server can also read and verify object data (stronger SHA-256 check)
  • The object will be available for deduplication only after scrubbing
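The scrubbing steps above amount to a gate in front of the deduplication index. The sketch below is illustrative only: the object store is a plain dictionary, `scrub_write` and `dedup_index` are hypothetical names, and it assumes object names are the hex SHA-256 of the stored bytes.

```python
import hashlib
from typing import Dict, Set


def scrub_write(name: str, object_store: Dict[str, bytes],
                dedup_index: Set[str], deep: bool = True) -> bool:
    """Admit an acknowledged write to the dedup index only if it checks out."""
    data = object_store.get(name)
    if data is None:
        return False  # client acknowledged, but the object never arrived
    if deep and hashlib.sha256(data).hexdigest() != name:
        return False  # stored bytes do not match the content-based name
    dedup_index.add(name)  # other clients may now deduplicate against it
    return True
```

Until `scrub_write` succeeds, a lost or forged write can affect only the client that made it, not every file that would otherwise have deduplicated against the object.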
Handling Object Store Eventual Consistency
• Treat objects as immutable (even if modifications are allowed)
• Use content-based names (generated using cryptographic hashes)
• Tombstone names after Garbage Collection
  • Suffix generation number to content-based names in case of resurrection
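One way to picture the tombstone and generation-number rule is a small name allocator. This is a sketch under assumptions: the `digest.generation` name format and the `NameAllocator` class are hypothetical, and the tombstone set is held in memory, whereas a real system would persist it with the rest of the metadata.

```python
import hashlib
from typing import Dict, Set


class NameAllocator:
    """Hands out immutable, content-based object names."""

    def __init__(self) -> None:
        self._generation: Dict[str, int] = {}  # digest -> current generation
        self._tombstones: Set[str] = set()     # names retired by garbage collection

    def name_for(self, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        generation = self._generation.get(digest, 0)
        # Never reuse a tombstoned name: an eventually consistent store may
        # still be processing the delete issued for it.
        while f"{digest}.{generation}" in self._tombstones:
            generation += 1
        self._generation[digest] = generation
        return f"{digest}.{generation}"

    def tombstone(self, name: str) -> None:
        self._tombstones.add(name)
```

If the same content reappears after garbage collection, it gets a fresh name, so a delayed delete of the old object cannot remove the new one.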
Security Architecture
Recap: On-Premises Security Model
• User authentication and permissions derived from native Active Directory setup
• Encryption keys are never exposed to the cloud
• Data and metadata are always encrypted: at rest and in flight
Slides (with speaker notes) at http://tolia.org
Try MagFS at http://maginatics.com