Parallel Computing
Introduction
What is Parallel and Distributed computing?
• Solving a single problem faster using multiple CPUs
  • E.g., matrix multiplication C = A × B (sketched below)
• Parallel = shared memory among all CPUs
• Distributed = local memory per CPU
• Common issues: partitioning, synchronization, dependencies, load balancing
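The matrix-multiplication example can be made concrete with a short sketch. The C/OpenMP code below is illustrative only (the size N and the row-wise partitioning are assumptions, not from the slides): the rows of C are partitioned across threads, and because each element of C is written by exactly one thread, no extra synchronization is needed.

/* Illustrative sketch: C = A x B with the rows of C partitioned across OpenMP threads. */
#include <stdio.h>
#include <omp.h>

#define N 512                      /* illustrative matrix size */

static double A[N][N], B[N][N], C[N][N];

int main(void) {
    /* Fill A and B with arbitrary values. */
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++) {
            A[i][j] = i + j;
            B[i][j] = i - j;
        }

    /* Each thread computes a disjoint block of rows of C (the partitioning);
     * no element of C is written by two threads, so no locking is needed. */
    #pragma omp parallel for
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++) {
            double sum = 0.0;
            for (int k = 0; k < N; k++)
                sum += A[i][k] * B[k][j];
            C[i][j] = sum;
        }

    printf("C[0][0] = %f\n", C[0][0]);
    return 0;
}

Compiled with an OpenMP-enabled compiler (e.g., gcc -fopenmp), the load balancing comes from OpenMP's default division of the rows among the threads.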
Why Parallel and Distributed Computing?
• Grand challenge problems
  • Weather forecasting; climate change
  • Drug discovery
• Physical limitations of circuits
  • Heat dissipation
  • Speed of light limits how fast signals can propagate
• Cost
  • Multiple average CPUs cost less than a single high-end one
Microprocessor Revolution
(Figure: speed (log scale) versus time for supercomputers, mainframes, minis, and micros)
Shared Memory Programming
• Easier conceptual environment
• Programmers are typically familiar with concurrent threads and processes sharing an address space
• CPUs within multi-core chips share memory
• OpenMP: an application programming interface (API) for shared-memory systems (see the sketch below)
  • Supports higher-performance parallel programming of symmetric multiprocessors
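As a minimal illustration of the shared-memory model with OpenMP (a sketch, not taken from the slides): all threads read the same shared array, and the reduction clause handles the synchronization of the partial sums.

/* Illustrative OpenMP sketch: shared array, reduction handles synchronization. */
#include <stdio.h>
#include <omp.h>

int main(void) {
    enum { N = 1000000 };
    static double a[N];
    double sum = 0.0;

    for (int i = 0; i < N; i++)
        a[i] = 1.0;

    /* a[] is shared by all threads; each thread keeps a private partial
     * sum that OpenMP combines when the loop ends. */
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < N; i++)
        sum += a[i];

    printf("sum = %.0f (max threads: %d)\n", sum, omp_get_max_threads());
    return 0;
}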
MPI / PVM
• MPI = “Message Passing Interface”
• PVM = “Parallel Virtual Machine”
• Standard specification for message-passing libraries
• Libraries available on virtually all parallel computers
• Free libraries also available for networks of workstations and commodity clusters, on Linux, Unix, and Windows platforms
• Can program in C, C++, and Fortran (a minimal example follows)
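A minimal MPI sketch in C (illustrative only, not part of the slides): every process runs the same program, learns its rank, and cooperates purely through explicit messages rather than shared memory.

/* Illustrative MPI sketch: ranks, sizes, and one point-to-point message.
 * Compile with mpicc and launch with mpirun/mpiexec. */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[]) {
    int rank, size;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    printf("Hello from process %d of %d\n", rank, size);

    /* Process 0 sends an integer to process 1; there is no shared memory,
     * so all cooperation happens through messages like this one. */
    if (size >= 2) {
        int value = 42;
        if (rank == 0) {
            MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        } else if (rank == 1) {
            MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            printf("Process 1 received %d from process 0\n", value);
        }
    }

    MPI_Finalize();
    return 0;
}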
Classical versus Modern Science
(Diagram: classical science moves from nature through observation to theory and physical experimentation; modern science adds numerical simulation)
60 Years of Speed Increases
• ENIAC (1946): 350 flops
• Roadrunner, Los Alamos National Laboratory (2009): 1 petaflops, 116,640 cores, 12K IBM Cell processors
Seeking Concurrency
• Data dependence graphs
• Data parallelism
• Functional parallelism
• Pipelining
Data Dependence Graph
• Directed graph
• Vertices = tasks
• Edges = dependencies
(Example: a data dependence graph for the tasks involved in taking care of a garden in a residence)
Data Parallelism
• Independent tasks apply the same operation to different elements of a data set
• Okay to perform the operations concurrently
• Speedup: potentially p-fold, where p = number of processors (an OpenMP version of the loop follows)

for i ← 0 to 99 do
  a[i] ← b[i] + c[i]
endfor
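The pseudocode maps directly onto C with OpenMP; the sketch below is illustrative (the array contents are arbitrary). Every iteration writes a different a[i], so all 100 additions are independent and may run concurrently.

/* Illustrative data-parallel sketch: same operation on different elements. */
#include <stdio.h>
#include <omp.h>

int main(void) {
    double a[100], b[100], c[100];

    for (int i = 0; i < 100; i++) {
        b[i] = i;
        c[i] = 2.0 * i;
    }

    /* Each iteration is independent: the same operation (an addition)
     * is applied to a different element of the data set. */
    #pragma omp parallel for
    for (int i = 0; i < 100; i++)
        a[i] = b[i] + c[i];

    printf("a[99] = %f\n", a[99]);
    return 0;
}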
Functional Parallelism
• Independent tasks apply different operations to different data elements
  • In the example below, the first and second statements may execute concurrently
  • The third and fourth statements may also execute concurrently
• Speedup: limited by the number of concurrent sub-tasks (a C sketch follows the pseudocode)

a ← 2
b ← 3
m ← (a + b) / 2
s ← (a² + b²) / 2
v ← s − m²
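A sketch of the same five statements in C with OpenMP sections (illustrative, not from the slides): m and s depend only on a and b, so they can be computed concurrently, while v must wait for both.

/* Illustrative functional-parallelism sketch: different operations run concurrently. */
#include <stdio.h>
#include <omp.h>

int main(void) {
    double a = 2.0, b = 3.0;            /* first and second statements */
    double m, s, v;

    #pragma omp parallel sections
    {
        #pragma omp section
        m = (a + b) / 2.0;              /* third statement: mean */

        #pragma omp section
        s = (a * a + b * b) / 2.0;      /* fourth statement: mean of squares */
    }

    v = s - m * m;                      /* fifth statement depends on m and s */
    printf("v = %f\n", v);
    return 0;
}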
Pipelining
• Divide a process into stages
• Produce several items simultaneously
• Speedup: limited by the number of concurrent sub-tasks = the number of stages in the pipeline (see the sketch below)
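A sequential sketch of a three-stage pipeline (illustrative; the stage functions are made up): once the pipeline is full, one finished item is produced per step, with up to three different items in flight at once. In a real pipeline each stage would run on its own processor.

/* Illustrative 3-stage pipeline sketch. */
#include <stdio.h>

static int stage1(int x) { return x + 1; }   /* e.g. read/prepare an item */
static int stage2(int x) { return x * 2; }   /* e.g. the main computation */
static int stage3(int x) { return x - 3; }   /* e.g. write out the result */

int main(void) {
    enum { N = 8 };
    int in[N], s1[N], s2[N], out[N];

    for (int i = 0; i < N; i++)
        in[i] = i;

    /* At step t, stage 1 handles item t, stage 2 handles item t-1, and
     * stage 3 handles item t-2: three items progress at the same time,
     * so the potential speedup equals the number of stages. */
    for (int t = 0; t < N + 2; t++) {
        if (t < N)               s1[t]      = stage1(in[t]);
        if (t >= 1 && t - 1 < N) s2[t - 1]  = stage2(s1[t - 1]);
        if (t >= 2 && t - 2 < N) out[t - 2] = stage3(s2[t - 2]);
    }

    for (int i = 0; i < N; i++)
        printf("out[%d] = %d\n", i, out[i]);
    return 0;
}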
Programming Parallel Computers
• Extend compilers: translate sequential programs into parallel programs
• Extend languages: add parallel operations
• Add a parallel language layer on top of a sequential language
• Define a totally new parallel language and compiler system
Current Status
• The low-level approach is most popular
  • Augment an existing language with low-level parallel constructs
  • MPI, PVM, threads/process-based concurrency, and OpenMP are examples
• Advantages of the low-level approach
  • Efficiency
  • Portability
• Disadvantage: more difficult to program and debug