Post on 25-Jan-2017
New performance benchmarks over 40 billion rows
Bill Maimone, Head of Engineering, MapD
Mazhar Memon, CTO, Bitfusion
Jerry Gutierrez, Global HPC Sales Leader, IBM Cloud
© 2016 IBM Corporation
IBM Cloud/SoftLayer Key Differentiators for GPU Accelerated Computing
• Virtual/bare metal servers with hourly or monthly billing
• The latest Intel CPUs and NVIDIA GPUs
• On-demand provisioning
• Triple-network architecture
• Private-network-only server deployments, private VLAN
• Unmetered private network
• Flash-based NetApp Performance/Endurance storage
• Enterprise-grade encryption
IBM Cloud - SoftLayer: Global reach with local presence
Data centers near every major metro area, enabling low-latency connectivity to cloud infrastructure.
Hourly GPU Servers Now Available!
MapD
• Analyzing increasingly massive datasets is critical
• Ability to scale past a single node
• Need access to the latest GPUs
• Did not want to own or build infrastructure
• Worked with IBM Cloud and very quickly came up with a compelling solution
The data explosion is just beginning
Source: IDC and EMC Digital Universe Report
Confidential & Proprietary
MapD
MapD Analytic Database
SQL-based column store, written from the ground up for GPUs

MapD Immerse
React.js/d3 charts & dashboards, with GPU rendering where it matters
https://www.mapd.com/blog/2016/06/27/crushing-the-billion-row-taxi-data-benchmark/
The Dataset & Queries
1. Q001: 'select count(*) from flights2'
2. Q002: 'select carrier_name, count(*) from flights2 group by carrier_name'
3. Q003: 'select carrier_name, avg(arrdelay) from flights2 group by carrier_name'
US Flight Data from 1987 to 2008. Total dataset is 128M rows and was replicated 312 times.
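The scale claim checks out with simple arithmetic; a quick sketch using the base row count and replication factor from this slide:

```python
# Replicated US flight dataset: 128M base rows copied 312 times
base_rows = 128_000_000
copies = 312

total_rows = base_rows * copies
print(f"{total_rows:,} rows")  # prints "39,936,000,000 rows", roughly 40 billion
```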
Querying 40 billion rows in milliseconds.
[Diagram: nine separate nodes, each with four GPUs. GPU capacity is limited by the node.]

Existing methods add maintenance and development costs
Bitfusion Boost: GPU Remote Virtualization
[Diagram: four server stacks, each with hardware, VM hypervisor, drivers, operating system, and SDI layers; the user space on one node hosts the application, its libraries, core functions, and open/custom APIs]
• Binary-level API interception
• Distribute work across local and remote machines
[System view: application on a local server, with GPUs on remote servers]
• Data and compute pipelining
• Advanced caching and data directories
• Auto service discovery, metering
• Function redirection for advanced coprocessors
• Supports the latest CUDA features, including unified memory
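Boost performs this interception at the binary level against the CUDA APIs. As a loose analogy only (the names `intercept`, `local_exec`, and `remote_exec` are illustrative, not Bitfusion's API), the redirection idea can be sketched in Python by wrapping each call and routing it to a local or remote executor:

```python
import functools

# Hypothetical executors: a real system would marshal arguments over the
# network to a GPU server; here they just tag where the call ran.
def local_exec(fn, *args):
    return ("local", fn(*args))

def remote_exec(fn, *args):
    return ("remote", fn(*args))

def intercept(route):
    """Wrap a function so every call is redirected by a routing policy."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args):
            executor = remote_exec if route(args) else local_exec
            return executor(fn, *args)
        return wrapper
    return decorator

# Route large workloads to the remote GPU pool, small ones locally.
@intercept(route=lambda args: args[0] > 1000)
def vector_add_size(n):
    return n  # stand-in for launching a kernel over n elements

print(vector_add_size(10))    # ('local', 10)
print(vector_add_size(5000))  # ('remote', 5000)
```

The application code never changes; only the dispatch layer decides where the work runs, which is the property that lets Boost pool remote GPUs transparently.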
[Diagram: a CPU-only node (48 cores, 3 TB memory, 72 TB SSD storage) with logically attached GPUs; Boost virtually attaches racks of physical GPUs to form a massive virtual node]
Creating the largest virtual GPU machines on demand
Unprecedented Speed at Scale
• 40 billion rows on 'select carrier_name, count(*) from flights2 group by carrier_name' in 271 ms
• 147 billion records scanned per second
• 8X the number of records scanned previously
Combining a GPU-accelerated database + GPU virtualization + an optimized cloud for the fastest database query times
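The throughput figure follows directly from the numbers on this slide (dataset size from the earlier slide, elapsed time for Q002 from this one):

```python
rows = 128_000_000 * 312     # replicated flight dataset, ~40 billion rows
elapsed_s = 0.271            # 271 ms for Q002 (group-by on carrier_name)

throughput = rows / elapsed_s
print(f"{throughput / 1e9:.0f} billion rows scanned per second")
# prints "147 billion rows scanned per second"
```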
App-Specific Instance Configurations as Machine Images
Resource Pooling:
• Consolidate use of compute resources
• Increase utilization
• Lower capital costs

Resource Provisioning:
• Enforce CPU, memory, and utilization quotas
• Effect QoS policies and guarantees
• Maximize utilization and reduce costs

High Availability:
• Detect failures at the app level
• Rollback, failover, error detection
• Events for higher-level reporting

Heterogeneous Offload:
• Leverage HPC hardware
• Interpose vendor libraries
• Retarget hot functions to efficient specialized devices

Scale-out:
• Distribute and load-balance work across systems
• Scale performance on demand
• Take advantage of runtime optimizations

Advanced Profiling:
• Understand application demands on the datacenter
• Fine-grained data provides unique insight
• Precise recommendations for capacity planning
Deep Learning: Caffe, Torch, TensorFlow
Media Transcoding
Rendering
Scientific Computing
Boost: Adding a broad set of GPU features to your application
In Summary
• Enable powerful GPU super nodes with Bitfusion Boost
• 60 days of collaboration with IBM and just a week to integrate
• Unprecedented database performance with MapD
Q & A

Jerry Gutierrez, Global HPC Sales Leader
jegutierrez@us.ibm.com
www.softlayer.com/gpu

Bill Maimone, MapD VP Engineering
bill@mapd.com
www.mapd.com

Mazhar Memon, CTO, Bitfusion
mazhar@bitfusion.io
www.bitfusion.io