Deliver Best-in-Class HPC Cloud Solutions Without Losing Your Mind
WEBINAR
April 13, 2016, 11:00 AM ET
Housekeeping
• Audio help
• Attachments
• Questions
• Rating
Today’s Speakers
Rick Friedman, Vice President, Solution Development, Cycle Computing
Scott Jeschonek, Director of Product Management, Cloud, Avere Systems
Agenda
• Discuss the current state of HPC
• Clouds and their impact on your HPC world
• Reasons why you aren’t 100% cloud-based already
• The Hybrid Cloud and HPC
• Possible implementations
• Delivering File Systems using Avere Systems
• Orchestration using Cycle Computing
HPC Today (and Yesterday, and Tomorrow)
What Drives Today’s Needs
• Data
– Who, what, when, how much, where?
• Datacenter limitations
– Can I defy physics?
• User expectations
– Can we even do that?
• Technology shifts
– What is the “best practice”?
Big Compute Workloads: How are they handled?
[Figure: Compute Demand vs. Cluster Size, with regions labeled Missed Opportunity and Wasted Resources]
• Internal infrastructure has huge value and some limitations
• Access, not capacity, is the barrier to continued growth
• Perception limits scale of problem solving
• Public cloud = cost-effective, readily available resources to users with problems & deadlines.
• Financial services, manufacturing and life sciences are leading the way.
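The Compute Demand vs. Cluster Size chart can be turned into numbers with a few lines of Python. This is a minimal sketch, assuming demand is sampled once per hour in cores; the function name `capacity_gap` and the sample figures are illustrative and do not come from the webinar.

```python
from typing import Sequence, Tuple


def capacity_gap(demand: Sequence[float], cluster_size: float) -> Tuple[float, float]:
    """Return (missed, wasted) core-hours for a fixed-size cluster.

    demand: cores requested in each hour (illustrative numbers only).
    cluster_size: cores the fixed cluster can supply in any hour.
    """
    # Demand above the cluster size cannot be served: missed opportunity.
    missed = sum(max(d - cluster_size, 0) for d in demand)
    # Capacity above demand sits idle: wasted resources.
    wasted = sum(max(cluster_size - d, 0) for d in demand)
    return missed, wasted


hourly_demand = [200, 800, 1500, 400, 100]  # hypothetical demand curve
missed, wasted = capacity_gap(hourly_demand, cluster_size=1000)
print(f"missed core-hours: {missed}, wasted core-hours: {wasted}")
```

Running the same function over a range of cluster sizes shows the trade-off directly: a larger fixed cluster lowers missed core-hours and raises wasted ones.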
Basic HPC Environment Requirements
Resource Manager
Jobs Manager / Scheduler
Workload
NAS Storage
Lots of compute resources (“Grid”)
Advantages of Clouds
Significantly reduce infrastructure management costs in both money and time
Maintain operational flexibility during scale-out jobs…let the provider deal with scale challenges
Why the Cloud for Big Compute?
• Scientist / Engineer User perspective
– Zero queue times, capacity in minutes
– Scale compute to problem size, not vice versa
– Try / support new computational approaches and software quickly
• SysArchitect perspective
– Dynamically adjust workloads to “lowest cost/impact” provider
– Focus on computational excellence, not hardware management
– Support a wide range of user types efficiently
• Organizational perspective
– Match spending to actual consumption
– Increase responsiveness to business dynamics
– Grow user base without hardware limitations
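The “lowest cost/impact” provider idea from the SysArchitect perspective can be sketched as a small placement function. This is an illustration under stated assumptions, not any vendor's algorithm: the `Provider` fields, the prices and the provider names are all hypothetical, and cost is the only criterion considered.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Provider:
    name: str
    price_per_core_hour: float  # hypothetical price, any currency
    free_cores: int             # cores available right now


def place(job_cores: int, job_hours: float, providers: List[Provider]) -> Optional[Provider]:
    """Pick the cheapest provider that can fit the whole job, or None."""
    candidates = [p for p in providers if p.free_cores >= job_cores]
    if not candidates:
        return None
    # Estimated job cost = price x cores x hours.
    return min(candidates, key=lambda p: p.price_per_core_hour * job_cores * job_hours)


providers = [
    Provider("internal", 0.02, 64),
    Provider("cloud-a", 0.05, 10_000),
    Provider("cloud-b", 0.04, 5_000),
]
choice = place(job_cores=512, job_hours=2, providers=providers)
print(choice.name if choice else "no capacity")
```

A small job stays on the internal cluster because it is cheapest and has room; a job larger than the internal free capacity goes to the cheapest cloud that can hold it.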
Clouds Have Awesome New Capabilities
• Big Data
– Analytics Tools
– Massively scalable NoSQL
– Data warehousing
• Machine Learning
– Voice/Vision/Speech
– Early days
So…why isn’t everything in the cloud?
• Current infrastructure investment (capex)
• Cloud costs not yet completely in line
• Software infrastructure in place
– Costs to refactor, dependencies to consider
• Data environment in one or more data centers
• Orchestration and management of cloud clusters is hard
• Network bandwidth / latency concerns
• Business Continuity
Other Reasons You’re Not 100% in the Cloud
• Corporate budgets
• Corporate policies
• Corporate politics
• Education / awareness
• Government regulations
• Interest groups
• Vendor relationships
Near Future, Hybrid Cloud
• Adoption of one or more cloud providers
– More than one provider gives a hedge on price and SLA
• Mix of on-prem and cloud resources
– Regulatory, proprietary and/or security characteristics will likely keep data in the DC
[Diagram: analysts in the Tokyo, London, NYC and Hong Kong offices; Primary DC and Secondary DC, each with NAS; Cloud Provider 1 and Cloud Provider 2; arrows labeled Submit Jobs]
HPC in the Cloud
[Diagram: On-Premises Data Center (Analysts, Jobs, Scheduler, NAS Storage) connected via Data and the Cloud Compute API to the Cloud Compute Environment (Scheduler)]
HPC in the Cloud, “Grids on Demand”
[Diagram: On-Premises Data Center (Analysts, Jobs, Scheduler, NAS Storage) connected via Data and the Cloud Compute API to the Cloud Compute Environment (Scheduler1 through Scheduler4)]
Challenges with HPC in the Cloud
• How do you get the data close to your compute nodes?
• How do you orchestrate on-demand clusters/grids of compute nodes?
• How does this all come together?
Data Access Layer
[Diagram: On-Premises Data Center (Analysts, Jobs, Scheduler, NAS Storage) connected via Data and the Cloud Compute API to the Cloud Compute Environment (Scheduler1 through Scheduler4), with a Data Access Layer on each side]
• File System
• Caching Layer
• Only load necessary blocks of files
• Opaque to compute nodes
Advantages of Data Access / Cache Layer
• Keep your data on prem!
– Data in cloud is only there while the compute nodes work the jobs.
– Reduce the security objections, simplify the move to cloud
• Increase cloud compute performance
– Using file system caching, most of the data will be in RAM, close to the nodes
– Avoids ingest latencies and slashes transit latency after first read
• Scale out
– Using a solution that facilitates file system connections from 10s of 1000s of cores
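The caching behavior described for the data access layer (load only the needed blocks, pay the transit latency once) can be illustrated with a read-through block cache. This is a minimal sketch and not Avere's implementation: the `BlockCache` class, the tiny 4-byte block size and the in-memory `nas` dictionary standing in for on-prem storage are all assumptions, and there is no eviction or write handling.

```python
from typing import Callable, Dict, Tuple


class BlockCache:
    """Read-through cache that fetches only the blocks a read touches."""

    def __init__(self, fetch_block: Callable[[str, int], bytes], block_size: int):
        self._fetch = fetch_block          # slow call back to on-prem storage
        self.block_size = block_size
        self._blocks: Dict[Tuple[str, int], bytes] = {}
        self.remote_reads = 0              # counts trips to the backing store

    def read(self, path: str, offset: int, length: int) -> bytes:
        if length <= 0:
            return b""
        first = offset // self.block_size
        last = (offset + length - 1) // self.block_size
        chunks = []
        for index in range(first, last + 1):
            key = (path, index)
            if key not in self._blocks:    # miss: fetch once, then keep in RAM
                self._blocks[key] = self._fetch(path, index)
                self.remote_reads += 1
            chunks.append(self._blocks[key])
        data = b"".join(chunks)
        start = offset - first * self.block_size
        return data[start:start + length]


BLOCK = 4  # unrealistically small, to keep the example readable
nas = {"/data/a.bin": b"0123456789abcdef"}  # hypothetical on-prem file


def fetch_from_nas(path: str, index: int) -> bytes:
    return nas[path][index * BLOCK:(index + 1) * BLOCK]


cache = BlockCache(fetch_from_nas, block_size=BLOCK)
first_read = cache.read("/data/a.bin", offset=5, length=6)
second_read = cache.read("/data/a.bin", offset=5, length=6)
print(first_read, cache.remote_reads)
```

The file has four blocks, but the read touches only two of them, and the repeated read is served entirely from the cache.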
Typical File Access in Hadoop Cluster
Caching files will work for certain types of jobs, where a typical file is accessed by multiple clients
source: http://blog.cloudera.com/blog/2012/09/what-do-real-life-hadoop-workloads-look-like/
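Why shared access matters for caching can be shown with a short calculation. The sketch below assumes a single shared cache with unlimited capacity, so only the first read of each file is a miss; the function name and the access trace are made up for illustration.

```python
from typing import Iterable, Tuple


def shared_cache_hit_ratio(accesses: Iterable[Tuple[str, str]]) -> float:
    """Fraction of reads served from a shared cache.

    accesses: (client, path) pairs in the order the reads happen.
    """
    seen = set()
    hits = 0
    total = 0
    for _client, path in accesses:
        total += 1
        if path in seen:
            hits += 1          # another client already pulled this file in
        else:
            seen.add(path)     # first touch goes back to the origin storage
    return hits / total if total else 0.0


trace = [("c1", "a"), ("c2", "a"), ("c3", "a"), ("c1", "b")]
print(shared_cache_hit_ratio(trace))
```

If every file is read by exactly one client, the ratio is zero and the cache adds nothing, which is why the benefit depends on the job type.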
Hybrid Cloud using Avere FXT and vFXT Edge Filers
[Diagram: Cloud Compute (Virtual Compute Farm), On-Prem Compute, Cloud Storage (Object: Bucket 1, Bucket 2, ... Bucket n), On-Prem Storage (NAS, Object), Virtual FXT, Physical FXT. Use cases shown: File Storage for Private Object, NAS Optimization, Cloud NAS, Cloud Bursting, Cloud Storage Gateway]
The “Edge” = locating your data close to your compute without truly moving it from your NAS environment
Avere Building Blocks
“Avere is uniquely positioned to offer scale across tens of thousands of cloud compute cores while leaving the data where it originates, on premises, with its global file system and caching capabilities.”
- Unnamed CTO
[Diagram: Cloud side: Cloud Compute, Virtual FXT. On-Premises side: NAS, Object, Physical FXT. Labels: File Acceleration; 12-20 ms, Encrypted]
Orchestration and Management Layer
[Diagram: On-Premises Data Center (Analysts, Jobs, Scheduler, NAS Storage) connected via Data and the Cloud Compute API to the Cloud Compute Environment (Scheduler1 through Scheduler4)]
Optimization
• Benchmark instances
• Make Workflow UI
• Human workflow
Provisioning
• Workload placement
• Optimal scale
• Cost optimization
• Data scheduling
Cluster Configuration
• Multi-cloud, without changes
• Pre-set or User-defined “types”
• Abstraction for all cluster data, attributes (roles, OS, etc.)
Monitoring
• Auto-scaling
• Usage tracking
• Error Handling
• Reporting
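The auto-scaling item above can be reduced to a small sizing rule. This is a sketch under stated assumptions, not Cycle Computing's logic: the function `target_nodes`, its parameters and the default limits are hypothetical, and each node is assumed to run a fixed number of jobs at once.

```python
import math


def target_nodes(queued_jobs: int, running_jobs: int, jobs_per_node: int,
                 min_nodes: int = 0, max_nodes: int = 100) -> int:
    """Nodes needed to run all current jobs, clamped to the allowed range."""
    if jobs_per_node <= 0:
        raise ValueError("jobs_per_node must be positive")
    needed = math.ceil((queued_jobs + running_jobs) / jobs_per_node)
    # min_nodes keeps a warm floor; max_nodes caps spend.
    return max(min_nodes, min(max_nodes, needed))


print(target_nodes(queued_jobs=45, running_jobs=5, jobs_per_node=8))
```

An orchestration loop would call this on each monitoring interval and add or remove nodes to reach the target; the `max_nodes` cap is where a budget limit would plug in.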
Complete Multi-Cloud Workflow Control
[Diagram: Admin inputs: File (Declarative Cluster Definition); Packages, Installers, Containers, Data. Stages: Scope, Configure, Run on Cloud, Optimize. User access: Web UI, API, Command Line. Job & Data Workflow: Automated Job Placement, Cost optimization, Auto-scaling, Benchmarking, Compliance, Reporting tools. Targets: Multi-cloud Without Changes, Internal Cluster]
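The idea of a declarative cluster definition that runs on multiple clouds without changes can be sketched as data plus a per-provider mapping. Everything here is hypothetical: the `NodeGroup` and `ClusterDefinition` classes, the provider names and the instance type names are invented for illustration and are not Cycle Computing's file format.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class NodeGroup:
    role: str            # e.g. "scheduler" or "execute"
    os: str
    cores_per_node: int
    count: int


@dataclass(frozen=True)
class ClusterDefinition:
    name: str
    groups: List[NodeGroup]


# Hypothetical catalog: provider -> cores per node -> instance type name.
INSTANCE_TYPES: Dict[str, Dict[int, str]] = {
    "provider-1": {4: "p1.small", 16: "p1.large"},
    "provider-2": {4: "std-4", 16: "std-16"},
}


def render(defn: ClusterDefinition, provider: str) -> List[dict]:
    """Turn the provider-neutral definition into a provider-specific plan."""
    catalog = INSTANCE_TYPES[provider]
    return [
        {
            "cluster": defn.name,
            "role": group.role,
            "os": group.os,
            "instance_type": catalog[group.cores_per_node],
            "count": group.count,
        }
        for group in defn.groups
    ]


grid = ClusterDefinition(
    name="risk-grid",
    groups=[
        NodeGroup(role="scheduler", os="linux", cores_per_node=4, count=1),
        NodeGroup(role="execute", os="linux", cores_per_node=16, count=50),
    ],
)
for provider in INSTANCE_TYPES:
    print(provider, render(grid, provider))
```

The definition itself never mentions a provider; only the catalog lookup changes, which is one way to read “multi-cloud without changes”.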
How Cycle Makes Cloud Productive
• Scientist / Engineer productivity:
– Simple workflows
– Zero queue time
– Auto-scaling
• SysAdmin productivity:
– Instant access to additional resources
– Workflows linking internal and multiple clouds
– Simple reliable tools to enable apps with special requirements
• Organizational productivity:
– Secure, consistent cloud access
– Usage tracking
– Ability to leverage multiple providers
Big Data w/o Disrupting Production
• Challenge
– Estimate the carbon stored in Saharan biomass
– Rapidly establish a baseline for later research using large amounts of high-resolution remote sensing data
– Existing internal compute resources fully committed
– Limited window to complete processing
• Cycle solution
– Full workflow including data management between internal data capture and cloud processing
– Leverage spot pricing to minimize cost while maximizing computation
• Results
– Linearly scalable and predictable, enabling a plan for next steps
– Science being done that could not be done otherwise
– 1 month from start to initial runs
Overall Architecture – Data In-House
[Diagram components: Workload, Scheduler, NAS Storage, Avere FXT Edge Filer, Cloud API, Cloud Compute, Scheduler, Avere FXT, Cloud Storage]
What We Covered…
• The Current State of HPC
• Clouds and Their Impact on Your HPC World
• Reasons Why You Aren’t 100% Cloud-based Already
• The Hybrid Cloud and HPC
• Possible Implementations
• Delivering File Systems Using Avere Systems
• Orchestration Using Cycle Computing
Thank you!
More about Avere Systems:
https://twitter.com/averesystems
https://www.youtube.com/user/AvereSystems
https://www.linkedin.com/company/589037
Cycle Computing Contact Info:
https://twitter.com/cyclecomputing
https://www.youtube.com/user/CycleComputing
https://www.linkedin.com/company/692068