Converging HPC and Big Data to Enable New Areas of Research
Nick Nystrom · Sr. Director of Research & Bridges PI · [email protected]
HPC Workshop: Introduction to Bridges · March 31, 2016
© 2016 Pittsburgh Supercomputing Center
The Shift to Big Data
- Pan-STARRS telescope (http://pan-starrs.ifa.hawaii.edu/public/)
- Genome sequencers (Wikimedia Commons)
- NOAA climate modeling (http://www.ornl.gov/info/ornlreview/v42_3_09/article02.shtml)
- Collections — Horniman Museum (http://www.horniman.ac.uk/get_involved/blog/bioblitz-insects-reviewed)
- Legacy documents (Wikimedia Commons)
- Environmental sensors: water temperature profiles from tagged hooded seals (http://www.arctic.noaa.gov/report11/biodiv_whales_walrus.html)
- Library of Congress stacks (https://www.flickr.com/photos/danlem2001/6922113091/)
- Video (Wikimedia Commons)
- Social networks and the Internet
New Emphases
Challenges and Software are Co-Evolving
(Diagram: analytic challenges spanning structured and unstructured data)
- Scientific Visualization
- Statistics
- Calculations on Data
- Optimization (numerical)
- Structured Data
- Machine Learning
- Optimization (decision-making)
- Natural Language Processing
- Image Analysis
- Video
- Sound
- Unstructured Data
- Graph Analytics
- Information Visualization
The $9.65M Bridges acquisition is made possible by National Science Foundation (NSF) award #ACI-1445606:
Bridges: From Communities and Data to Workflows and Insight
The Pittsburgh Supercomputing Center (PSC) is delivering Bridges.
Disclaimer: The following presentation conveys the plan for Bridges. Certain details are subject to change.
High-throughput genome sequencers

An Important Addition to the National Advanced Cyberinfrastructure Ecosystem

Bridges will be a new resource on XSEDE and will interoperate with other XSEDE resources, Advanced Cyberinfrastructure (ACI) projects, campuses, and instruments nationwide.
Examples:
- Data Infrastructure Building Blocks (DIBBs)
  ‒ Data Exacell (DXC)
  ‒ Integrating Geospatial Capabilities into HUBzero
  ‒ Building a Scalable Infrastructure for Data-Driven Discovery & Innovation in Education
  ‒ Other DIBBs projects
- Other ACI projects
- Reconstructing brain circuits from high-resolution electron microscopy
- Temple University’s new Science, Education, and Research Center
- Carnegie Mellon University’s Gates Center for Computer Science
- Social networks and the Internet
Motivating Use Cases (examples)
- Data-intensive applications & workflows
- Gateways – the power of HPC without the programming
- Shared data collections & related analysis tools
- Cross-domain analytics
- Graph analytics, machine learning, genome sequence assembly, and other large-memory applications
- Scaling research questions beyond the laptop
- Scaling research from individuals to teams and collaborations
- Very large in-memory databases
- Optimization & parameter sweeps
- Distributed & service-oriented architectures
- Data assimilation from large instruments and Internet data
- Leveraging an extensive collection of interoperating software
- Research areas that haven’t used HPC
- Nontraditional HPC approaches to fields such as the physical sciences
- Coupling applications in novel ways
- Leveraging large memory and high island bandwidth
Objectives and Approach
• Bring HPC to nontraditional users and research communities.
• Allow high-performance computing to be applied effectively to big data.
• Bridge to campuses to streamline access and provide cloud-like burst capability.
• Leveraging PSC’s expertise with shared memory, Bridges will feature 3 tiers of large, coherent shared-memory nodes: 12TB, 3TB, and 128GB.
• Bridges implements a uniquely flexible environment featuring interactivity, gateways, databases, distributed (web) services, high-productivity programming languages and frameworks, virtualization, and campus bridging.
EMBO Mol Med (2013) DOI: 10.1002/emmm.201202388: Proliferation of cancer-causing mutations throughout life
Alex Hauptmann et al.: Efficient large-scale content-based multimedia event detection
Interactivity
• Interactivity is the feature most frequently requested by nontraditional HPC communities.
• Interactivity provides immediate feedback for doing exploratory data analytics and testing hypotheses.
• Bridges offers interactivity through a combination of virtualization for lighter-weight applications and dedicated nodes for more demanding ones.
Gateways and Tools for Building Them
Gateways provide easy-to-use access to Bridges’ HPC and data resources, allowing users to launch jobs, orchestrate complex workflows, and manage data from their browsers.
- Extensive leveraging of databases and polystore systems
- Great attention to HCI is needed to get these right

Download sites for MEGA-6 (Molecular Evolutionary Genetics Analysis), from www.megasoftware.net

Interactive pipeline creation in GenePattern (Broad Institute)

Col*Fusion portal for the systematic accumulation, integration, and utilization of historical data, from http://colfusion.exp.sis.pitt.edu/colfusion/
Virtualization and Containers
• Virtual Machines (VMs) enable flexibility, security, customization, reproducibility, ease of use, and interoperability with other services.
• Users are asking for custom database and web server installations to develop data-intensive, distributed applications, and for containers to support custom software stacks and portability.
• Bridges leverages OpenStack to provision resources among interactive, batch, Hadoop, and VM uses.
High-Productivity Programming
Supporting languages that communities already use is vital for them to apply HPC to their research questions.
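As a small illustration of what “high-productivity” means here, the sketch below is the style of concise, exploratory Python a researcher might bring straight from a laptop; the data and variable names are invented for illustration, not taken from the talk.

```python
# A minimal sketch of high-productivity exploratory analysis in Python.
# Hypothetical sensor readings (e.g., water temperatures from tagged seals).
from statistics import mean, pstdev

readings = [4.1, 3.8, 4.4, 5.0, 3.9, 4.2]

# Summarize the sample in a few idiomatic lines -- no boilerplate required.
summary = {
    "n": len(readings),
    "mean": round(mean(readings), 2),
    "stdev": round(pstdev(readings), 2),
}
print(summary)
```

The same script runs unchanged on a laptop or a Bridges node, which is exactly the on-ramp the slide describes.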
Spark, Hadoop & Related Approaches
• Bridges’ large memory is great for Spark!
• Bridges enables workflows that integrate Spark/Hadoop, HPC, and/or shared-memory components.
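To show the programming model in question, here is a pure-Python stand-in for Spark’s map/reduceByKey word count; it is illustrative only (on Bridges one would use the actual Spark/PySpark stack, which parallelizes these stages across nodes and keeps the working set in memory).

```python
# Pure-Python sketch of the map -> shuffle -> reduce structure Spark uses.
from collections import Counter
from itertools import chain

lines = [
    "converging hpc and big data",
    "big data meets hpc",
]

# "map": turn each line into (word, 1) pairs
pairs = chain.from_iterable(
    ((word, 1) for word in line.split()) for line in lines
)

# "reduceByKey": sum the counts per word
counts = Counter()
for word, n in pairs:
    counts[word] += n

print(counts["hpc"], counts["big"], counts["data"])
```

Large shared memory helps precisely because the intermediate pair stream and the reduced table can stay resident instead of spilling to disk.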
Campus Bridging
• Through a pilot project with Temple University, the Bridges project will explore new ways to transition data and computing seamlessly between campus and XSEDE resources.
• Federated identity management will allow users to use their local credentials for single sign-on to remote resources, facilitating data transfers between Bridges and Temple’s local storage systems.
• Burst offload will enable cloud-like offloading of jobs from Temple to Bridges and vice versa during periods of unusually heavy load.
http://www.temple.edu/medicine/research/RESEARCH_TUSM/
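The burst-offload idea above can be sketched as a simple routing decision. Everything below (the function name, the threshold, the queue-wait model) is a hypothetical illustration, not the Bridges/Temple pilot’s actual implementation.

```python
# Hypothetical burst-offload policy: send a job to the remote resource when
# the local queue's expected wait exceeds a threshold and the remote queue
# is actually faster. All names and values here are illustrative.

def choose_resource(local_wait_hours: float, remote_wait_hours: float,
                    offload_threshold_hours: float = 4.0) -> str:
    """Return 'local' or 'remote' for a new job submission."""
    if (local_wait_hours > offload_threshold_hours
            and remote_wait_hours < local_wait_hours):
        return "remote"
    return "local"

print(choose_resource(6.0, 1.0))   # heavy local load
print(choose_resource(0.5, 0.1))   # light local load
```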
Purpose-built Intel® Omni-Path topology for data-intensive HPC

- 800 HPE Apollo 2000 (128GB) compute nodes
- 16 RSM nodes with NVIDIA K80 GPUs
- 32 RSM nodes with NVIDIA next-generation GPUs
- 42 HPE ProLiant DL580 (3TB) compute nodes
- 4 HPE Integrity Superdome X (12TB) compute nodes
- 12 HPE ProLiant DL380 database nodes
- 6 HPE ProLiant DL360 web server nodes
- 4 MDS nodes, 2 front-end nodes, 2 boot nodes, 8 management nodes
- 20 Storage Building Blocks, implementing the parallel Pylon filesystem (~10PB) using PSC’s SLASH2 filesystem
- 20 “leaf” Intel® OPA edge switches
- 6 “core” Intel® OPA edge switches: fully interconnected, 2 links per switch
- Intel® OPA cables

http://psc.edu/bridges
http://staff.psc.edu/nystrom/bvt
Node Types*

| Type | RAM (a) | Phase | n | CPU / GPU / other | Server |
|------|---------|-------|---|-------------------|--------|
| ESM | 12TB | 1 | 2 | 16 × Intel Xeon E7-8880 v3 (18c, 2.3/3.1 GHz, 45MB LLC) | HPE Integrity Superdome X |
| ESM | 12TB | 2 | 2 | 16 × TBA | HPE Integrity Superdome X |
| LSM | 3TB | 1 | 8 | 4 × Intel Xeon E7-8860 v3 (16c, 2.2/3.2 GHz, 40MB LLC) | HPE ProLiant DL580 |
| LSM | 3TB | 2 | 34 | 4 × TBA | HPE ProLiant DL580 |
| RSM | 128GB | 1 | 752 | 2 × Intel Xeon E5-2695 v3 (14c, 2.3/3.3 GHz, 35MB LLC) | HPE Apollo 2000 |
| RSM-GPU | 128GB | 1 | 16 | 2 × Intel Xeon E5-2695 v3 + 2 × NVIDIA K80 | HPE Apollo 2000 |
| RSM-GPU | 128GB | 2 | 32 | 2 × Intel Xeon E5-2695 v3 + 2 × NVIDIA next-generation GPU | HPE Apollo 2000 |
| DB-s | 128GB | 1 | 6 | 2 × Intel Xeon E5-2695 v3 + SSD | HPE ProLiant DL360 |
| DB-h | 128GB | 1 | 6 | 2 × Intel Xeon E5-2695 v3 + HDDs | HPE ProLiant DL380 |
| Web | 128GB | 1 | 6 | 2 × Intel Xeon E5-2695 v3 | HPE ProLiant DL360 |
| Other (b) | 128GB | 1 | 14 | 2 × Intel Xeon E5-2695 v3 | HPE ProLiant DL360, DL380 |
| Total | | | 878 | | |

a. All RAM in these nodes is DDR4-2133.
b. Other nodes = front end (2) + management/log (8) + boot (2) + MDS (4).

* not including Pylon storage servers
Database and Web Server Nodes
• Dedicated database nodes will power persistent relational and NoSQL databases
  – Support data management and data-driven workflows
  – SSDs for high IOPS; RAIDed HDDs for high capacity
• Dedicated web server nodes
  – Enable distributed, service-oriented architectures
  – High-bandwidth connections to XSEDE and the Internet
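The data-driven-workflow pattern these nodes support can be sketched in miniature. SQLite stands in here for the persistent relational databases a dedicated DB node would actually host, and the table and values are invented for illustration.

```python
# Sketch of a persistent relational store feeding a workflow. SQLite is a
# stand-in for a production database server; the schema is hypothetical.
import sqlite3

conn = sqlite3.connect(":memory:")  # a real workflow would use a persistent DB
conn.execute(
    "CREATE TABLE runs (id INTEGER PRIMARY KEY, app TEXT, core_hours REAL)"
)
conn.executemany(
    "INSERT INTO runs (app, core_hours) VALUES (?, ?)",
    [("genome-assembly", 1200.0), ("graph-analytics", 350.5)],
)

# A downstream step can then query accumulated results rather than re-parse files.
total = conn.execute("SELECT SUM(core_hours) FROM runs").fetchone()[0]
print(total)
```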
Data Management
• Pylon: a large, central, high-performance filesystem
  – Visible to all nodes
  – Large datasets, community repositories (10 PB usable)
• Distributed (node-local) storage
  – Enhances application portability
  – Improves overall system performance
  – Improves consistency of performance to the shared filesystem
• Acceleration for Hadoop-based applications
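The staging pattern that node-local storage enables can be sketched as follows: copy input data from the shared filesystem to fast local disk before computing on it. The function, paths, and directory layout below are hypothetical, not Bridges-specific settings; the demo uses temporary directories standing in for shared and local storage.

```python
# Sketch of staging data from a shared filesystem to node-local scratch.
import os
import shutil
import tempfile

def stage_in(shared_path: str, local_dir: str) -> str:
    """Copy a dataset from shared storage to node-local scratch; return local path."""
    os.makedirs(local_dir, exist_ok=True)
    return shutil.copy(shared_path, local_dir)

# Demo: temporary directories stand in for the shared filesystem and scratch.
with tempfile.TemporaryDirectory() as shared, tempfile.TemporaryDirectory() as scratch:
    src = os.path.join(shared, "dataset.csv")
    with open(src, "w") as f:
        f.write("a,b\n1,2\n")
    local = stage_in(src, os.path.join(scratch, "job123"))
    print(os.path.exists(local))
```

Reading the staged copy keeps per-node I/O off the shared filesystem, which is how local storage improves both overall performance and consistency.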
Intel® Omni-Path Architecture (OPA)
• Bridges is the first production deployment of Omni-Path
• Omni-Path connects all nodes and the shared filesystem, providing Bridges and its users with:
  – 100 Gbps line speed per port; 25 GB/s bidirectional bandwidth per port
  – Observed <0.93 μs latency, 12.36 GB/s/dir
  – 160M MPI messages per second
  – 48-port edge switches that reduce interconnect complexity and cost
  – HPC performance, reliability, and QoS
  – OFA-compliant applications supported without modification
  – Early access to this new, important, forward-looking technology
• Bridges deploys OPA in a two-tier island topology developed by PSC for cost-effective, data-intensive HPC
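The bandwidth figures above are internally consistent, which a few lines of arithmetic confirm: 100 Gbps per direction is 12.5 GB/s per direction (25 GB/s bidirectional), so the observed 12.36 GB/s/dir is about 99% of line rate.

```python
# Sanity-checking the slide's numbers: Gbps -> GB/s and observed efficiency.
line_rate_gbps = 100                 # gigabits per second, per direction
per_dir_gbs = line_rate_gbps / 8     # gigabytes per second, per direction
bidir_gbs = 2 * per_dir_gbs          # both directions combined
efficiency = 12.36 / per_dir_gbs     # observed vs. theoretical

print(per_dir_gbs, bidir_gbs, round(efficiency, 3))  # 12.5 25.0 0.989
```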
Example: Causal Discovery Portal
Center for Causal Discovery, an NIH Big Data to Knowledge Center of Excellence

(Architecture shown on the slide:)
- A browser-based UI, reached over the Internet, lets users prepare and upload data, run causal discovery algorithms, and visualize results.
- Web node: a VM running Apache Tomcat handles authentication, data, and provenance, communicating by messaging with the database node.
- Database node: a VM hosting the portal’s databases, alongside other DBs.
- LSM (3TB) and ESM (12TB) nodes, connected via Omni-Path, execute the causal discovery algorithms: analytics such as FGS and other algorithms building on TETRAD, run against memory-resident datasets.
- The Pylon filesystem holds source datasets (TCGA, fMRI, …).
Gaining Access to Bridges
Bridges is allocated through XSEDE: https://www.xsede.org/allocations
– Starter Allocation
  • Can request anytime
  • Up to 50,000 core-hours on RSM and GPU (128GB) nodes and/or 10,000 TB-hours on LSM (3TB) and ESM (12TB) nodes
  • Can request XSEDE ECSS (Extended Collaborative Support Service)
– Research Allocation (XRAC)
  • Appropriate for larger requests; can request ECSS
  • Can be up to millions to tens of millions of SUs
  • Quarterly submission windows: March 15–April 15, June 15–July 15, etc.
– Coursework Allocations
  • To support use of Bridges for relevant courses
– Community Allocations
  • Primarily to support gateways

Up to 10% of Bridges’ SUs are available on a discretionary basis to industrial affiliates, Pennsylvania-based researchers, and others to foster discovery and innovation and broaden participation in data-intensive computing.

XSEDE calls these units “Service Units” (SUs).
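Checking a request against the Starter Allocation caps stated above is simple arithmetic, sketched below. The function is illustrative only (XSEDE reviewers, not code, decide allocations); the caps come straight from the limits listed for the Starter Allocation.

```python
# Sketch: is a request within the Starter Allocation limits?
STARTER_CORE_HOURS_CAP = 50_000   # RSM / GPU (128GB) nodes
STARTER_TB_HOURS_CAP = 10_000     # LSM (3TB) and ESM (12TB) nodes

def fits_starter_allocation(core_hours: float, tb_hours: float) -> bool:
    """True if a request is within the Starter Allocation limits."""
    return (core_hours <= STARTER_CORE_HOURS_CAP
            and tb_hours <= STARTER_TB_HOURS_CAP)

# e.g., 100 hours on a full 12TB ESM node consumes 100 * 12 = 1,200 TB-hours
print(fits_starter_allocation(core_hours=20_000, tb_hours=1_200))
```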
Bridges Target Schedule
• Acquisition
  – Construction planned to begin October 2015
  – Early User Period starting in late 2015
  – Base-Level System: December 2015 – January 2016
  – Phase 1 installed and in testing
  – Phase 2 Technical Update: summer 2016
• XRAC-awarded projects and Startups
  – Bridges is currently in its Early User Period
  – All allocated projects have access
For Additional Information
Project website: www.psc.edu/bridges
Questions: [email protected]
Bridges PI: Nick Nystrom, [email protected]
Co-PIs: Michael J. Levine, Ralph Roskies, J Ray Scott
Project Manager: Robin Scibek