Post on 16-Apr-2017
Gluster: Where We've Been
AB Periasamy, Office of the CTO, Red Hat
John Mark Walker, Gluster Community Guy
Topics
The Big Idea
Humble beginnings: from Bangalore to Milpitas
Scale-out + open source == WINNING: user-space, no metadata server, stackable
Cloud and commoditization
A Data Explosion!
74% == Unstructured data annual growth
63,000 PB == Scale-out storage in 2015
40% == storage-related expense for cloud
44x == Unstructured data volume growth by 2020
Bengaluru Office
Conference Room
US Head Office
Bengaluru Office
Gluster Community Deployments
Gluster Production Deployments
What Can You Store?
- Media: docs, photos, video
- VM filesystem: VM disk images
- Big data: log files, RFID data
- Objects: long-tail data
The big idea: storage should be simple
Simple, scalable, low-cost
Add examples where complexity has been bad:
- EMC, Cisco, Brocade et al. made a business out of complexity via certification
- if it's too complicated, it doesn't scale
What is GlusterFS, Really?
Gluster is a unified, distributed storage system: DHT, stackable, POSIX, Swift, HDFS
Discuss how GlusterFS's approach is unique and differs from other approaches
- Lessons from GNU Hurd
- user-space distributed storage operating system
- reimplements parts of the OS: scheduler, POSIX locking, RDMA, memory management (cf. JVM, Python, etc.)
- no metadata separation
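The "no metadata separation" point above is worth unpacking: because placement is computed from a hash of the file path, any client can locate a file without asking a metadata server. The sketch below is a simplified stand-in (GlusterFS's DHT actually uses a Davies-Meyer hash over per-directory ranges); the brick names are made up for illustration.

```python
import hashlib

def brick_for(path, bricks):
    """Pick a brick purely from a hash of the file path.

    Simplified illustration of elastic hashing: no central
    metadata lookup is needed, any client computes placement.
    """
    h = int(hashlib.md5(path.encode()).hexdigest(), 16)
    return bricks[h % len(bricks)]

# hypothetical brick list for a three-node cluster
bricks = ["server1:/export/brick1",
          "server2:/export/brick1",
          "server3:/export/brick1"]

print(brick_for("/photos/cat.jpg", bricks))
```

Every client running this function agrees on the same answer for the same path, which is exactly what removes the metadata server from the data path.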
Phase 1: Lego Kit for Storage
"People who think that userspace filesystems are realistic for anything but toys are just misguided" (Linus Torvalds)
Goal: create a global namespace
If you have a bunch of files, serving them should be as simple as running an FTP server. Doing this in user space required FUSE, a POSIX translator, a NAS protocol, and a cluster translator.
volume testvol-posix
    type storage/posix
    option directory /media/datastore
    option volume-id 329e31c1-04cc-4386-8bb8-xxxx
end-volume

volume testvol-access-control
    type features/access-control
    subvolumes testvol-posix
end-volume

volume testvol-locks
    type features/locks
    subvolumes testvol-access-control
end-volume

volume testvol-io-threads
    type performance/io-threads
    subvolumes testvol-locks
end-volume
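The volfiles above chain translators, each one wrapping the subvolume below it. A minimal Python sketch of that stacking pattern (class names mirror the volfile types, but the bodies are toy stand-ins, not the real translator logic):

```python
class Translator:
    """Base translator: pass every call down to the subvolume."""
    def __init__(self, subvolume=None):
        self.subvolume = subvolume

    def write(self, path, data):
        return self.subvolume.write(path, data)

class PosixStorage(Translator):
    """Bottom of the stack (storage/posix): a dict stands in for
    the on-disk directory in this sketch."""
    def __init__(self):
        super().__init__()
        self.store = {}

    def write(self, path, data):
        self.store[path] = data
        return len(data)

class Locks(Translator):
    """Stand-in for features/locks."""
    def write(self, path, data):
        # a real translator would take a POSIX lock around this call
        return self.subvolume.write(path, data)

class IoThreads(Translator):
    """Stand-in for performance/io-threads (no real threading here)."""
    pass

# Assemble the same stack the volfiles describe:
# posix -> locks -> io-threads
posix = PosixStorage()
vol = IoThreads(Locks(posix))
vol.write("/media/datastore/hello.txt", b"hi")
```

The point is the shape: new behavior is added by inserting another wrapper into the chain, which is what makes the design a "Lego kit."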
Versions 1.x and 2.x
Hand-crafted volume definition files (see examples)
Simple configuration files
Faster than tape? It's good!
Phase 2: Repeatability of Use Cases
Community-led
Learned from the community:
- Desired features
- Usage profiles
All about scalable storage of unstructured data
Learned about missing features
Found the largest problem and wanted to solve it:
- patterns emerged
- scalable unstructured-data storage was the #1 problem people wanted to solve
Had a clearer idea of where we wanted to go: a clear direction
GlusterFS 3.0: Putting it all together
Adding, removing features
Templates and recipes for common use cases
- Standalone NFS replacement
- Active-active replicated storage
- Scalable, distributed storage
And then scalable, replicated, distributed storage
+ other combos
GlusterFS 3.1 - 2010
Elasticity: add and remove volumes w/ glusterd
Automation: CLI, scriptable
Elastic features driven by cloud and virtualization usage:
- shared storage for virtual guests
- flexible, self-service storage
- elastic volume management became a requirement
- automated provisioning of storage with the CLI
(native NFS server? Or 3.2?)
CLI Magic
$ gluster peer probe HOSTNAME
$ gluster volume info
$ gluster volume create VOLNAME [stripe COUNT] \
      [replica COUNT] [transport tcp | rdma] BRICK
$ gluster volume delete VOLNAME
$ gluster volume add-brick VOLNAME NEW-BRICK ...
$ gluster volume rebalance VOLNAME start
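The `add-brick` and `rebalance` commands go together: once a brick is added, some files hash to a different brick than the one they live on, and rebalance migrates exactly those files. A toy model of why (the modulo hash here is a deliberate simplification of DHT range reassignment, and the brick names are invented):

```python
import hashlib

def brick_for(path, bricks):
    # simplified stand-in for DHT placement: hash of the path
    h = int(hashlib.md5(path.encode()).hexdigest(), 16)
    return bricks[h % len(bricks)]

paths = [f"/data/file{i}" for i in range(1000)]
old = ["s1:/b", "s2:/b", "s3:/b"]
new = old + ["s4:/b"]          # after `gluster volume add-brick`

# files whose computed placement changed are the ones a
# `gluster volume rebalance ... start` would have to move
moved = [p for p in paths if brick_for(p, old) != brick_for(p, new)]
print(f"{len(moved)} of {len(paths)} files need to move")
```

A naive modulo hash moves far more files than necessary, which is one reason the real DHT assigns hash ranges to bricks instead.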
GlusterFS 3.2 - 2011
Native NFS server
Marker framework
Geo-replication (asynchronous)
Marker framework:
- the story of why it's necessary
- backing up data in other locales
- no need for a full snapshot
- users wanted continuous, unlimited replication
- without on-demand sysadmin intervention
- queries the filesystem to find which files have changed
- manages a queue, telling rsync exactly which files to sync
Inotify doesn't scale: if the daemon crashes, it stops tracking changes. We would have had to write a journaling feature to maintain the change queue.
Geo-replication can work on high-latency, flaky networks
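The core of the notes above is "query the filesystem to find what changed since the last sync, and hand only those files to rsync." A crude mtime-based sketch of that query (the real marker framework tracks xtime attributes; the function and file names here are illustrative only):

```python
import os
import tempfile
import time

def changed_since(root, checkpoint):
    """Return files under `root` modified after `checkpoint` --
    a stand-in for the marker query that feeds geo-replication."""
    changed = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            if os.path.getmtime(full) > checkpoint:
                changed.append(full)
    return changed

root = tempfile.mkdtemp()
with open(os.path.join(root, "old.txt"), "w") as f:
    f.write("synced already")
time.sleep(1.1)              # keep mtimes clearly on either side
checkpoint = time.time()     # e.g. the time of the last sync
time.sleep(1.1)
with open(os.path.join(root, "new.txt"), "w") as f:
    f.write("needs syncing")

print(changed_since(root, checkpoint))   # expect only new.txt
```

A full-tree walk like this is itself expensive at scale; the marker framework's per-directory change tracking exists precisely so the query can skip unchanged subtrees.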
And now for something completely different: commoditization and the changing economics of storage
Why we're winning
Simple economics: simplicity, scalability, lower cost
Multi-Tenant
Virtualized
Automated
Commoditized
Scale on Demand
In the Cloud
Scale Out
Open Source
Simplicity Bias
FC, FCoE, iSCSI → HTTP, sockets
Modified BSD OS → Linux / user space / C, Python & Java
Appliance-based → Application-based
Scale-out Open Source is the winner
Thank you!
AB Periasamy, Office of the CTO, Red Hat (ab@redhat.com)
John Mark Walker, Gluster Community Guy (johnmark@redhat.com)