Surviving in the Wild : Teaching and Training for the Parallel Future
csinparallel.org
Surviving in the Wild: Teaching and Training for the Parallel Future
Dick Brown
St. Olaf College
SPLASH Educators’ and Trainers’ Symposium
October 24, 2011
Overview
• Review of the need for parallelism
• Strategies we need
– What to teach
– How to teach it
– How to get it taught
• Surviving in the wild of Parallelism
With interludes, to be announced…
The need for parallelism
• Question: Why do we need parallelism at the programming level?
– Hint: It’s not “because it’s there”
(… although desktop applications do tend to find ways to use ever greater power per dollar)
Note: I will use parallel as a generic for concurrent, parallel, distributed, cloud, accelerator (e.g., GPGPU), etc.
The need for parallelism
• Answer: Scale
• Cloud applications
+
= 30,000,000,000,000,000
• Scientific applications: Particle-level simulation of turbulence is exascale
• Can’t achieve exascale performance without many cores (Berkeley “walls”), accelerators
Challenges for industry
• Technology:
– Heterogeneous computing (CPU + accelerators)
– Sophisticated “on the fly” runtime systems
– “Wall” of memory hierarchy vs. on-chip access
• Examples
– AMD Fusion System Architecture: CPU+GPU
– Intel MIC (Many Integrated Core): 50+ CPUs on a chip, as a cluster-like accelerator
• Programming models:
– Higher level; more “human-centric”
– Scalable
– Versatile
Challenges for Education/Training
We want to prepare our students for what they’ll need, before the demand explodes, but
• What are the enduring principles?
• Technologies, (hence) tools change rapidly!
• (Educators:) Change the curriculum???
A wild ecosystem
• Industry/Academia
• Learning curve/Rapid change
• Principles/Practices
• Teaching/Research
– New research discoveries in technology and programming models need to get into the curriculum yesterday
We need strategy!
And it’s coming fast!
– Took OOPSLA 20 years to become SPLASH…
We can’t wait that long
Strategies to be found
• What to teach
• How to teach it
• How to get it taught
ITiCSE 2010 working group, Strategies for Preparing Computer Science Students for the Multicore World
What to teach
• Parallel computing has a head start: ACM/IEEE Curriculum ’91
– 3 required hours on parallel algorithms
– 3 required hours on distributed and parallel programming language constructs, with hands-on practice
• Ada, Concurrent Pascal, Occam, or Parlog
(Was not universally embraced…)
But, ten years later…
• ACM/IEEE Curriculum ’01
– 0 required hours of parallel algorithms
– No mention of programming language constructs
– Replaced by: “net-centric computing,” etc.
NSF/TCPP Curriculum Standards Initiative in Parallel and Distributed Computing – Core Topics for Undergraduates
Sushil K. Prasad, IEEE TCPP Chair, Georgia State University
Richard LeBlanc, Seattle University, ACM Education Council
Charles Weems, University of Massachusetts, Amherst
Alan Sussman, University of Maryland
Arnold Rosenberg, Northeastern and Colorado State University
Andrew Lumsdaine, Indiana University
Curriculum Initiative Website: linked through tcpp.computer.org
http://www.cs.gsu.edu/~tcpp/curriculum/index.php
Who are we?
• Chtchelkanova, Almadena - NSF
• Dehne, Frank - Carleton University, Canada
• Gouda, Mohamed - University of Texas, Austin, NSF
• Gupta, Anshul - IBM T.J. Watson Research Center
• JaJa, Joseph - University of Maryland
• Kant, Krishna - NSF, Intel
• La Salle, Anita - NSF
• LeBlanc, Richard - Seattle University
• Lumsdaine, Andrew - Indiana University
• Padua, David - University of Illinois at Urbana-Champaign
• Parashar, Manish - Rutgers, NSF
• Prasad, Sushil - Georgia State University
• Prasanna, Viktor - University of Southern California
• Robert, Yves - INRIA, France
• Rosenberg, Arnold - Colorado State University
• Sahni, Sartaj - University of Florida
• Shirazi, Behrooz - Washington State University
• Sussman, Alan - University of Maryland
• Weems, Chip - University of Massachusetts
• Wu, Jie - Temple University
Specifying Curriculum Recommendations – NSF/TCPP Approach
• Identify topics in four existing areas: architecture, algorithms, programming, and cross-cutting topics
• For each topic, recommend
– Bloom level
– “Hours” of coverage
– Suggested learning outcome
– Possible core course for coverage
• Focus: First two years
Bloom Levels
Use first three levels for recommended core topics
• K = Know the term/recall definition (basic literacy)
• C = Comprehend so as to paraphrase/illustrate
• A = Apply it in some way (requires operational command)
• N = Not in core (but may be useful in elective or advanced courses)
Example
• Parallel and Distributed Models and Complexity
– Costs of computation
Algorithms Topics | Bloom# | Course | Learning Outcome
Algorithmic problems | | | The important thing here is to emphasize the parallel/distributed aspects of the topic
Communication | | |
broadcast | C/A | Data Struc/Algo | represents method of exchanging information - one-to-all broadcast (by recursive doubling)
multicast | K/C | Data Struc/Algo | Illustrate macro-communications on rings, 2D-grids and trees
scatter/gather | C/A | Data Structures/Algorithms |
gossip | N | Not in core |
Asynchrony | K | CS2 | asynchrony as exhibited on a distributed platform, existence of race conditions
Synchronization | K | CS2, Data Struc/Algo | aware of methods of controlling race condition
Sorting | C | CS2, Data Struc/Algo | parallel merge sort
Selection | K | CS2, Data Struc/Algo | min/max, know that selection can be accomplished by sorting
K: know term; C: paraphrase/illustrate; A: apply
Programming
• Assume some conventional (sequential) programming experience
• Key is to introduce parallel programming early to students
• Four overall areas
– Paradigms – By target machine model and by control statements
– Notations – language/library constructs
– Correctness – concurrency control
– Performance – for different machine classes
Parallel Programming Paradigms (Selections)
• By target machine model
– Shared memory (Bloom classification A)
– Distributed memory (C)
– Client/server (C)
– Hybrid (K) – e.g., CUDA for CPU/GPU
• By control statements
– Task/thread spawning (A)
– Parallel Loop (C)
How to Read the Proposal
• Oh no! Not another class to squeeze into our curriculum!
• Oh yes! Not another class to squeeze into your curriculum!
• Draft curriculum released Dec 2010 (tcpp.computer.org)
Enduring skills?
• Since the tool set is subject to change at any time, how much investment in those skills?
– Many parallel languages and features have come and gone
• Need hands-on experience for effective learning.
• Anything may suddenly emerge as important
– Python as a prototyping language for HPC
A candidate for addition
Patterns of parallel programming
Patterns, a candidate for addition
• Background
– “Gang of Four” book, 1994
– Doug Lea, Concurrent Programming in Java: Design Principles and Patterns, 1999
– Tim Mattson, et al., Patterns for Parallel Programming, 2005
– Kurt Keutzer and Berkeley Parlab, the Dwarves (Motifs)
– Keutzer and Mattson, OPL (parlab.eecs.berkeley.edu/wiki/patterns)
Patterns, a candidate for addition
Why patterns?
• They capture reusable units of expert problem-solving strategy
• Thus, they provide novices with a way to acquire expertise
• Many are supported by tools
– Loop parallel, Message passing, Map-reduce, …
How to teach it
• Agree with NSF/TCPP Initiative that parallelism should be taught early and often
– Scratch team kept concurrent scripts, because users “not surprised that a sprite can do several things at once”
– Lessons of Vishkin’s “Peanut Butter Sandwich” exercise
CSinParallel project
• Add parallelism early and often at all levels
• Incremental, flexible approach via modules
• Sharing within our community
CSinParallel project
• Modular Approach
– Short units (1-3 days)
– Identified learning objectives
– Self-contained
– Flexible for use in various courses and curricula
• Make software/libraries more accessible
– Parallel Platform Packages, Resources
• Share, discuss and help as a community
– http://csinparallel.org
CSinParallel project
Some selected module topics
• Introductory:
– Map-Reduce computing for CS1 using WebMapReduce
– Concurrent access to data structures in Java or C++
– Multicore programming with Intel’s Manycore Testing Lab
• Intermediate:
– Introduction to parallel computing concepts
– Concurrency strategies in programming languages
– Parallel sorting algorithms
Module: WebMapReduce
Module: MTL with OpenMP
• Intel’s Manycore Testing Lab
• Module– #pragma omp parallel for num_threads(threadct) \
shared (a, n, h, integral) private(i)
– reduction(+: integral)
CSinParallel
• We seek collaborators and contributors
Patterns Methodology
• Keutzer and Mattson OPL not only provides a catalog of patterns, but also a software problem-solving methodology
• Purposes:
– Education
– Communication
– Design
How to get it taught
• Pressures on the professor
– “Oh no! Not another course to squeeze…”
• So, take an incremental spiral approach (agreeing with NSF/TCPP)
– Small changes in curriculum in many places
– Revisit challenging issues
– Students come to think of parallelism as natural part of computation
– Spiral approach is pedagogically effective
How to get it taught
Incentives
• Microgrants: small (e.g., $1500) amounts for contributing first steps in teaching parallelism
– Intel Academic Community (intel.com/AcademicCommunity)
– Educational Alliance for a Parallel Future (eapf.org)
NSF/TCPP Initiative Early Adopter Program
How to obtain Early Adopter Status?
• 16 Early adopters chosen for Spring term 2011
• 17 Early adopters chosen for Fall term 2011
• Next round of competition: Fall 2012; Deadline November 5, 2011
– NSF/Intel funded Stipend/Honorarium
– Which course(s), topics, evaluation plan?
How to obtain Early Adopter Status?
• Instructors for
– core CS/CE courses such as CS1/2, Systems, Data Structures and Algorithms – department-wide adoption preferred
– elective courses such as Algorithms, Architecture, Programming Languages, Software Engineering, etc.
– introductory/advanced PDC course
– dept chairs, dept curriculum committee members responsible
How to get it taught
Other supports needed
• Platform availability
• Support community
• Educational elements
– Learning objectives, assessment tools, etc.
Surviving in the wild ecosystem
– Industry/Academia
– Learning curve/Rapid change
– Principles/Practices
– Teaching/Research
• Mine from the heritage of the past
• Incremental approach
• Spiral exposition
• Pattern-based methods
An example
(go-lang.org)
• Mine from the heritage of the past
– Hoare’s CSP; CCS, Pi Calculus [Teaching/Research]
• Incremental approach
– Not far from C [Industry/Academia]
• Spiral exposition
– Midway steps towards explicit threads, message passing [Learning curve/Rapid change]
• Pattern-based methods
– Message passing, Fork-join, channel as Parallel Queue [Principles/Practices]
Questions?