Educational Applications of Supercomputing and Cyberinfrastructure
KAUST Economic Development International Symposium at ISC'11, 21 June 2011, Hamburg
Supercomputing in Science and Engineering: Economic and Technological Opportunities and Challenges
Dr. Craig A. Stewart
Associate Dean, Research Technologies
Executive Director, Pervasive Technology Institute
Indiana University
[email protected]
2
Outline
• Too many people, too few people
• Inspiring young people
• Examples of interesting educational activities (roughly scaling up by participant count)
• New opportunities for cyberinfrastructure at the campus, national, and international levels (campus bridging)
• Conclusions: Education, technology, economic development
NB: License terms for slides at end
3
Some definitions
• Supercomputer – large, monolithic, tightly integrated computer
• High Performance Computer – a more general term than supercomputer, including a wider variety of cluster types
• High Throughput Computing – systems of computers that work on nicely parallel problems with (very) low bandwidth connections
• Cyberinfrastructure consists of computing systems, data storage systems, advanced instruments and data repositories, visualization environments, and people, all linked together by software and high performance networks to improve research productivity and enable breakthroughs not otherwise possible.
• eScience – large scale science increasingly carried out by global collaborations enabled by the Internet.
4
Technology assertions…
• “…results of this discovery upon society will be greater than the imagination of the most sanguine can now distinctly conceive.”
  – The telegraph (The American Biblical Repository, 1838)
• “… will tremendously influence our national elections, will promote world understanding of social, racial, and economic problems, will influence our daily lives to a degree yet undreamed of.”
  – The television (Franklin Dunham, 1956)
• “… is becoming the town square for the global village of tomorrow.”
  – The Internet (Bill Gates, 1996)
• “The world is poised on the cusp of an economic and cultural shift as dramatic as that of the Industrial Revolution.”
  – The WWW (Steven Levy, 1997)
• “We have technology, finally, that for the first time in human history allows people to really maintain rich connections with much larger numbers of people.”
  – The Internet (Pierre Omidyar, 2005)
5
World population growth (history & predicted)
[Figure: world population in billions (axis 1 to 12) plotted across the Old Stone Age (1+ million years), New Stone Age, Bronze Age, Iron Age, Middle Ages, and Modern Age, on a time axis running from 7000 B.C. through A.D. 2100 and into the future, with the Black Death (the Plague) marked.]
Source: © Population Reference Bureau; and United Nations, World Population Projections to 2100 (1998).
6
© Schnabel, R. 2011. ACM’s engagement in education policy. CRA Leadership Meeting. 28 Feb 2011.
7
Analytics market: $76B market by 2015
[Figure: market segments chart]
• Information & Analytics Market: $60B in 2011; 6.4% CGR '10-'15
• Data Mgmt & IDM: $18.9B '11; 4.2% CGR '10-'15
• Content Management: $6.9B '11; 6.7% CGR '10-'15
• Info Integration & MDM: $4.9B '10; 8.3% CGR '10-'15
• Analytic Applications: $7.3B '11; 7.0% CGR '10-'15
• DW DBMS: $6.9B '11; 7.1% CGR '10-'15
• BI Platform & PM: $15.0B '11; 7.7% CGR '10-'15
• SPSS
Source: GMV 2H10 (incl. analytic applications)
© IBM, Inc.
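The headline figure can be cross-checked against the chart's own numbers. This is a minimal sketch, assuming that "CGR" means a compound annual growth rate applied over the four years from 2011 to 2015 (the slide does not define the term); the function name is illustrative only.

```python
def project(value_billion, annual_rate, years):
    """Compound a starting value forward by a fixed annual growth rate."""
    return value_billion * (1.0 + annual_rate) ** years


# Information & Analytics Market: $60B in 2011 at 6.4% CGR (figures from the slide).
# Treating CGR as compound annual growth is an assumption.
market_2015 = project(60.0, 0.064, 2015 - 2011)
print(f"Projected 2015 market: ${market_2015:.1f}B")
```

Under that assumption the projection comes out a little under $77B, which is in line with the $76B headline.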
8
The conundrum
• Technology will not solve our problems by itself
• We do not have enough knowledge workers
• People in many parts of the globe do not have access to education that will enable them to fill the jobs of today and tomorrow
• Colleges and universities are not recruiting and retaining enough students to fulfill demand for students with CSTEM skills in general and advanced computing skills in particular
9
We need people comfortable with critical thinking and computational thinking
• Critical thinking skills
• Computational thinking skills
  – Conceptualizing, not programming
  – Fundamental, not rote skill
  – A way that humans, not computers, think
  – Complements and combines mathematical and engineering thinking
  – Ideas, not artifacts
  – For everyone, everywhere
From: Wing, J.M. Computational thinking. 2006. Communications of the ACM. 49(3): 33-35
10
Inspiration matters!
© Estes-COX Inc. www.estesrockets.com
Two young model and model rocket builders, shortly after claiming the world record for continuous model building at 37 hours and 40 minutes (quickly surpassed by others), April 1973
11
Ready, Set Robots! Camp @ PTI 2010
Mike Boyles, AVL, Research Technologies, UITS / PTI giving demo
From 3D movie What is Cancer? by Albert William, IUPUI, SOIC, AVL, Research Technologies, UITS / PTI
© Matthew King, student in IU professor Margaret Dolinsky's Digital Art class
12
Games are not reality
PolarGrid
Photos courtesy of Keith Lehigh and Matt Link, Indiana University
Geoffrey Fox, PI, PolarGrid
13
Je’aime Powell, Elizabeth City State University graduate researcher on Greenland expedition, 2009.
14
SC ‘08 Cluster Challenge
IU / Dresden team – organized within IU side by Dr. Andrew Lumsdaine, Director, Open Systems Lab and Center for Scalable Computing, PTI, and Professor, School of Informatics and Computing; and Matt Link and D. Scott McCaulay, Directors, Research Technologies, UITS / PTI
15
Guitar workshop
Photos courtesy of Rebecca Lowe, Open Systems Lab, SOIC and PTI, Indiana University. Guitar workshop sponsored by Dr. Andrew Lumsdaine, Director, Open Systems Lab and Center for Scalable Computing, PTI; and Professor, School of Informatics and Computing
16
Minority Engineering Advancement Program @ IUPUI
Use the Bootable Cluster CD with the “Game of Life” to demonstrate speedup
LittleFe - small integrated cluster
Matt Link, Director, Research Technologies, UITS; and Associate Director, Center for Scalable Computing, PTI
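The speedup demonstration rests on dividing the Game of Life grid among workers. The following is a minimal sketch in Python, not the Bootable Cluster CD code: it updates a grid once serially and once in independent row bands, then confirms that both give the same next generation. Each band needs only its own rows plus one neighboring row on each side from the previous generation, so the bands could be handed to separate cores or nodes. The function names, the dead border, and the glider test pattern are assumptions made for illustration.

```python
def live_neighbors(grid, r, c):
    """Count live cells around (r, c); cells outside the grid count as dead."""
    rows, cols = len(grid), len(grid[0])
    total = 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if (dr, dc) != (0, 0):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    total += grid[rr][cc]
    return total


def step_rows(grid, start, stop):
    """Compute the next generation for rows start..stop-1 only."""
    new_rows = []
    for r in range(start, stop):
        row = []
        for c in range(len(grid[0])):
            n = live_neighbors(grid, r, c)
            alive = grid[r][c] == 1
            # Standard rules: birth on 3 neighbors, survival on 2 or 3.
            row.append(1 if n == 3 or (alive and n == 2) else 0)
        new_rows.append(row)
    return new_rows


def step_serial(grid):
    """One generation computed as a single unit of work."""
    return step_rows(grid, 0, len(grid))


def step_banded(grid, workers):
    """One generation computed band by band; bands are independent of each other."""
    rows = len(grid)
    bounds = [rows * i // workers for i in range(workers + 1)]
    result = []
    for i in range(workers):
        result.extend(step_rows(grid, bounds[i], bounds[i + 1]))
    return result


glider = [
    [0, 1, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
same = step_banded(glider, 3) == step_serial(glider)
print("banded update matches serial update:", same)
```

In this sketch the bands run one after another in a single process; on a cluster each band would run on its own node and exchange only its edge rows between generations, which is where the measured speedup would come from.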
17
LittleFe
“LittleFe is a complete 6 node Beowulf style portable computational cluster. The entire package weighs less than 50 pounds; easily travels; and sets-up in 5 minutes. Current generation LittleFe hardware includes multi-core processors and GPGPU support enabling support for shared memory parallelism, distributed memory parallelism, and hybrid models. By leveraging the Bootable Cluster CD project, and the Computational Science Education Reference Desk LittleFe is a powerful, ready-to-run, computational science and parallel programming educational platform for the price of a high-end laptop.”
http://LittleFe.net
Photo courtesy Charlie Peck, Earlham College. © Earlham College.
18
Images © Beth Plale, Professor, School of Informatics & Computing; Director, Data to Insight Center, PTI
LEAD (Linked Environments for Atmospheric Discovery) & LEAD II – an example Science Gateway
Meteorology researchers used data and images generated by LEAD II while chasing tornadoes.
19
WxChallenge & LEAD II
www.wxchallenge.com. Screen image © University of Oklahoma.
In support of the 2010 Vortex2 campaign, LEAD II successfully executed 214 workflows, used 109,568 CPU hours, generated 215 GB of data and over 9,100 2D products.
http://pti.iu.edu/d2i/leadii-vortex2 Image © Trustees of Indiana University
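To put the Vortex2 figures in per-workflow terms, a short calculation follows. It uses only the numbers quoted above; the variable names are illustrative, and the product count is a lower bound because the slide says "over 9,100".

```python
# Figures quoted on the slide for the 2010 Vortex2 campaign.
workflows = 214
cpu_hours = 109_568
data_gb = 215
products = 9_100  # "over 9,100" on the slide, so this is a lower bound

cpu_hours_per_workflow = cpu_hours / workflows
gb_per_workflow = data_gb / workflows
products_per_workflow = products / workflows

print(f"CPU hours per workflow:   {cpu_hours_per_workflow:.0f}")
print(f"Data per workflow:        {gb_per_workflow:.2f} GB")
print(f"2D products per workflow: at least {products_per_workflow:.1f}")
```

The averages work out to 512 CPU hours and roughly 1 GB of data per workflow.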
20
nanoHUB
Screen Image © Network for Computational Nanotechnology (nanohub.org/groups/ncn).
21
nanoHUB usage
nanoHUB usage, September 2010. Red dots: tutorial and seminar use. Yellow dots: online simulation use. Size of dot indicates number of users from location. Annually nanoHUB serves over 170,000 users in 172 countries.
© Gerhard Klimeck, Network for Computational Nanotechnology (nanohub.org/groups/ncn). Used by permission. May not be reused without permission.
22
@home projects (based on BOINC)
docking.cis.udel.edu Image courtesy of Michela Taufer, GCLab, U. Delaware. © U. Delaware
http://escatter11.fullerton.edu/nfs/. Image courtesy Dr. Greg Childers, and © California State University, Fullerton.
23
You don’t need access to a supercomputer to teach parallel computing… or data-intensive computing
• Multicore & GPUs
• LittleFe
• Cloud providers
• Citizen Science – access to and participation in authentic science
Photograph © Chris Eller, Advanced Visualization Lab, Research Technologies, UITS; and PTI
physicsworld.com/cws/article/news/2738 © Institute of Physics. Reused under licensing terms @ physicsworld.com/cws/copyright
24
Campus bridging
• Campus bridging is the seamlessly integrated use of cyberinfrastructure operated by a scientist or engineer with other cyberinfrastructure on the scientist’s campus, at other campuses, and at the regional, national, and international levels as if they were proximate to the scientist, and when working within the context of a Virtual Organization (VO) make the ‘virtual’ aspect of the organization irrelevant (or helpful) to the work of the VO.
• Campus bridging material: http://pti.iu.edu/campusbridging/
• ACCI Taskforce final reports: http://www.nsf.gov/od/oci/taskforces/
[Figure: bar chart of Estimated Computing Capacity (TFLOPS), axis 0 to 12,000, with one bar each for: Commercial cloud (IaaS and PaaS); Volunteer computing; Workstations at Carnegie research universities; Campus HPC / Tier 3 systems; Track 2 and other major facilities; NSF Track 1.]
Data at http://hdl.handle.net/2022/13136
25
26
Single lab biological instruments

Type of instrument                     | Model                                                | Raw image data | Data products
---------------------------------------+------------------------------------------------------+----------------+--------------
Light Microscopy                       | BD Pathway 855 Bioimager                             | N/A            | 7 GB/day
Genome sequencing                      | Roche 454 Life Sciences genome analyzer system       | 39 GB/day      | 9 GB/day
Genome sequencing                      | Illumina-Solexa genome analyzer system               | 367 GB/day     | 100 GB/day
Genome sequencing                      | ABI SOLID 3                                          | 238 GB/day     | 150 GB/day
Microarray Gene Expression Chip Reader | Molecular Devices GenePix Professional 4200A Scanner | N/A            | 8 MB/day
Microarray Gene Expression Chip Reader | NimbleGen Hybridization System 4 (110V)              | N/A            | 300 MB/day
Several Task Force recommendations to the NSF re Hardware and networking: Much more attention to data and networking challenges!
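A rough calculation shows why single-lab instruments create storage and networking pressure. This is a sketch under stated assumptions: the instrument runs every day of the year, units are decimal (1 TB = 1000 GB), and the day's output is moved at a constant rate over 24 hours. None of these assumptions comes from the slide, and the function names are illustrative.

```python
GB_PER_TB = 1000          # decimal units; an assumption, the slide does not say
SECONDS_PER_DAY = 86_400

# Raw image data rates for the genome sequencers, in GB/day (from the table).
raw_gb_per_day = {
    "Roche 454": 39,
    "Illumina-Solexa": 367,
    "ABI SOLID 3": 238,
}


def tb_per_year(gb_per_day, days=365):
    """Yearly volume if the instrument runs every day (an assumption)."""
    return gb_per_day * days / GB_PER_TB


def sustained_mbit_per_s(gb_per_day):
    """Average network rate needed to move one day's output within a day."""
    return gb_per_day * 8 * 1000 / SECONDS_PER_DAY


for name, rate in raw_gb_per_day.items():
    print(f"{name}: {tb_per_year(rate):.1f} TB/year raw, "
          f"{sustained_mbit_per_s(rate):.1f} Mbit/s sustained")
```

On these assumptions a single Illumina-Solexa system would produce on the order of 134 TB of raw image data per year and need about 34 Mbit/s of sustained bandwidth just to keep up.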
27
Cyberinfrastructure is infrastructure
Strategic Recommendation to the NSF: NSF must lead the community in establishing a blueprint for a National CI
CI software must be made more robust
National Science Foundation. Investing in America’s Future: Strategic Plan FY 2006-2011. September 2006. Available from: http://www.nsf.gov/pubs/2006/nsf0648/nsf0648.jsp
28
Examples of mature and maturing systems & software
DEISA
UK eScience Grid
NSF CIF 21 (Cyberinfrastructure Framework for 21st Century Science and Engineering)
ROCKS (www.rocksclusters.org)
Condor (www.condor.org)
© DEISA. http://www.deisa.eu/usersupport/user-documentation/unicore-5-in-deisa/job-submission-through-unicore-5/DEISA-UNICORE-Figure01.png/image_preview
29
Critical challenge: curriculum materials
http://ocw.mit.edu/index.htm Used under Creative Commons License – Attribution-NonCommercial-ShareAlike 3.0 United States (CC BY-NC-SA 3.0) http://creativecommons.org/licenses/by-nc-sa/3.0/us/
30
Existing curriculum resources
• MIT Computer Science & Engineering curriculum – web.mit.edu/catalog/degre.engin.ch6.html
• ACM – www.acm.org/education/curricula-recommendations
• TCPP (Technical Committee on Parallel Processing) – tcpp.cs.gsu.edu/
  – CORE COURSES:
    • CS1: Introduction to Computer Programming (First Courses)
    • CS2: Second Programming Course in the Introductory Sequence
    • Systems: Intro Systems/Architecture Core Course
    • DS/A: Data Structures and Algorithms
    • DM: Discrete Structures/Math
  – ADVANCED/ELECTIVE COURSES:
    • Arch 2: Advanced Elective Course on Architecture
    • Algo 2: Elective/Advanced Algorithm Design and Analysis
    • Lang: Programming Language/Principles (after introductory sequence)
    • SwEngg: Software Engineering
    • ParAlgo: Parallel Algorithms
    • ParProg: Parallel Programming
    • Compilers: Compiler Design
  – IMHO: The TCPP curriculum demonstrates the need for more attention to computational thinking in K-12 education
[Figure: map of participating sites: University of Arkansas; Indiana University; University of California at Los Angeles; Penn State; Iowa; Univ. Illinois at Chicago; University of Minnesota; Michigan State; Notre Dame; University of Texas at El Paso; IBM Almaden Research Center; Washington University; San Diego Supercomputer Center; University of Florida; Johns Hopkins]
July 26-30, 2010 NCSA Summer School Workshop
http://salsahpc.indiana.edu/tutorial
300+ students learning about Twister & Hadoop MapReduce technologies, supported by FutureGrid.
Slide © Judy Qiu, SOIC and SALSA Lab, PTI
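For readers new to the MapReduce model taught at the workshop, here is a minimal single-process sketch of its map, shuffle, and reduce phases in plain Python. It does not use the Hadoop or Twister APIs; the function names and the sample input are made up for illustration.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single count."""
    return key, sum(values)


# Hypothetical input; in a real framework each line could sit on a different node.
lines = ["big data big compute", "data intensive science"]
pairs = [pair for line in lines for pair in map_phase(line)]
counts = dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())
print(counts)
```

The map and reduce calls are independent per line and per key, which is what lets frameworks of this kind spread them across many machines.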
32
Economies of scale in training
Image from TeraGrid EOT: Education, Outreach, and Training 2010. https://www.teragrid.org/web/news/news#2010scihigh
Photo courtesy Robert Quick, Research Technologies & PTI. OSG Grid School in Sao Paulo, Brazil, January 2011
33
Great challenges, great opportunities
• Challenges
  – Matters such as human impact on the global environment will be most successfully addressed with fact-based consensus approaches.
  – More countries must have the skill and access to technology to do their own modeling
• Cyberinfrastructure and education opportunities
  – If we can treat cyberinfrastructure more like infrastructure … we can focus on the challenging / important / fun work
  – Robust cyberinfrastructure => reusable educational materials
  – Data-intensive science creates tremendous need and opportunity in education and application
  – While we are busy improving the pipeline of talent, involving undergrads in research may greatly improve the % of the existing pipeline that pursues an advanced technology career
34
New economic growth opportunities
• VOs and opportunities they provide for research
• Digital manufacturing (new opportunities in a different approach to globalization)
• Sustainable societies
• With better education in supercomputing, and all forms of high performance computing, people may enable us to achieve some of the technology nirvana described at the beginning of the talk
35
This talk is dedicated to the memory of Truman O. Stewart
36
Additional information
• Droegemeier, K., B. Plale, M. Ramamurthy, and C. Mattocks, "A New Approach for Using Web Services, Grids, and Virtual Organizations in Mesoscale Meteorological Research" 25th Conference on Interactive Information Processing Systems for Meteorology, Oceanography, and Hydrology (IIPS), 01/2009.
• Stewart, C.A., S. Simms, B. Plale, M. Link, D. Hancock and G. Fox. 2010. What is Cyberinfrastructure? In: Proceedings of SIGUCCS 2010 (Norfolk, VA, 24-27 Oct, 2010). http://portal.acm.org/citation.cfm?doid=1878335.1878347
• http://www.computinginthecore.org/ – “a non-partisan advocacy coalition … to elevate computer science education to a core academic subject in K-12 education …”
• http://hubzero.org/resources/408/ Exploring the Impact of nanoHUB.org on Research and Education
• Cohen, D. 2006. Globalization and Its Enemies. MIT Press
37
Acknowledgments
• Thanks to King Abdullah University of Science and Technology for the opportunity to present today (Through Inspiration, Discovery indeed!)
• Malinda Lingwall for editing, graphical work, and fact-finding/checking
• Ready, Set, Robots! Camp: Daphne Siefert-Herron, Kurt Seiffert, Kristy Kallback-Rose, Danko Antolovic, Jenett Tillotson, Therese Miller
• MEAP: David Hancock, Andrew Arenson, Rich Knepper, Kurt Seiffert, Matt Link (Research Technologies, UITS, Research Technologies, PTI); Patrick Gee, Mark Russell
• Thanks to all of the IU Research Technologies staff and Pervasive Technology Institute students, staff, and faculty who have led or been involved in the IU projects described here
38
Acknowledgments
• Many of the scientific workflow examples here use the IU Data Capacitor – project led by Steve Simms, Research Technologies, UITS, & PTI. http://pti.iu.edu/dc/ NSF CNS 05-21433
• LEAD: Beth Plale, IU (SOIC-PTI) funded by NSF 0331480
• PolarGrid: NSF 0723054 (G. Fox, PI)
• FutureGrid: NSF 0910812 (G. Fox, PI)
• nanoHUB: nanoHUB.org is operated by Network for Computational Nanotechnology (NCN). NCN was funded by the National Science Foundation (NSF) under various grants.
• Development and support of nanoHUB is also supported in part by the HUBzero consortium, of which IU is a member.
• Campus Bridging: NSF 040777, 1059812, 0948142, 1002526, 0829462
• LittleFe: Support from TeraGrid, SC Conference, Intel Corporation and Earlham College.
• Lilly Endowment for its support of IU through INGEN, METACyt, and the Pervasive Technology Institute
• Tevfik Kosar, who as Chair of DIDC '10 invited me to present the Keynote presentation at the Third International Workshop on Data Intensive Distributed Computing (DIDC'10). “It’s not a data deluge – it’s worse than that.” Several slides from that talk are reused here. That original talk is available at http://hdl.handle.net/2022/13195
• Thanks to those individuals who gave permission to use images presented in this talk
• Any opinions presented here are those of the presenter and do not necessarily represent the opinions of the National Science Foundation, the Lilly Endowment, the NSF ACCI, NSF ACCI Task Force on Campus Bridging, or any other funding agencies or organizations
39
License terms
• Items indicated with a © are under copyright and used here with permission. Such items may not be reused without permission from the holder of copyright except where license terms noted on a slide permit reuse.
• Please cite this presentation as: Stewart, C.A. “Educational Applications of Supercomputing and Cyberinfrastructure.” Presentation at KAUST Economic Development International Symposium at ISC'11, 21 June 2011. Available from: http://hdl.handle.net/2022/13365
• Except where otherwise noted, contents of this presentation are copyright 2011 by the Trustees of Indiana University.
• This document is released under the Creative Commons Attribution 3.0 Unported license (http://creativecommons.org/licenses/by/3.0/). This license includes the following terms: You are free to share – to copy, distribute and transmit the work and to remix – to adapt the work under the following conditions: attribution – you must attribute the work in the manner specified by the author or licensor (but not in any way that suggests that they endorse you or your use of the work). For any reuse or distribution, you must make clear to others the license terms of this work.
40
Questions?
And thank you for your kind attention….