
Parallel Computing

Benson Muite

benson.muite@ut.ee
http://math.ut.ee/~benson

https://courses.cs.ut.ee/2014/paralleel/fall/Main/HomePage

22 September 2014

Document Preparation: LaTeX and Lyx

• https://en.wikibooks.org/wiki/LaTeX

• http://texblog.org/about/

• http://www.latex-project.org/

• http://www.lyx.org/

Computer Architecture

Parallel Computer Architecture

• Chip Architecture Review
• Accelerators
• Graphics Cards
• Intel Xeon Phi
• Parallel Computer Networking
• CM-5
• The Earth Simulator
• IBM Blue Gene
• K computer
• Titan
• Tianhe II

Chip Architecture Review

• Typical chip today has multiple cores
• Data may need to be obtained from a hard disk, RAM or cache before being processed
• For many applications, getting data can be more of a constraint than computing with it (see the sketch below)
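A minimal sketch of that constraint (an illustration, not code from the lecture): the STREAM-style triad below performs only two floating point operations for every 24 bytes moved, so on a typical multicore chip its speed is set by how fast data reaches the cores, not by the arithmetic units.

  #include <stddef.h>

  /* STREAM-style triad: a[i] = b[i] + s * c[i].
     Roughly 2 flops per 24 bytes of memory traffic (read b and c, write a),
     so the loop is normally limited by memory bandwidth, not by compute. */
  void triad(double *a, const double *b, const double *c, double s, size_t n)
  {
      for (size_t i = 0; i < n; i++)
          a[i] = b[i] + s * c[i];
  }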

Example HPC Chip Architectures

• Intel Haswell
• AMD Opteron
• SPARC64 XIfx
• NEC SX-ACE
• IBM Power 8
• IBM PowerPC A2
• Hotchips (http://www.hotchips.org/), Coolchips (http://www.coolchips.org/2015/)

Accelerators

• External specialized device for floating point operations
• Typically good at doing many simplified instructions in parallel
• High latency is compensated by high bandwidth (see the worked example below)
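A rough way to see the latency/bandwidth trade-off (illustrative numbers, not measurements from the slides): model the time to move n bytes to the accelerator as

  T(n) = alpha + n / beta

where alpha is the link latency and beta its bandwidth. With, say, alpha = 10 microseconds and beta = 10 GB/s, moving 1 KB takes about 10.1 microseconds (latency dominated), while moving 1 GB takes about 0.1 seconds (bandwidth dominated). Accelerators therefore pay off mainly when enough data and work are shipped per transfer.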

Graphics Cards and General Purpose Computing on Graphics Cards

• Nvidia – many simple cores; CUDA, CUDA Fortran, OpenACC, OpenCL and OpenGL application programming interfaces; strong support of the academic community (see the kernel sketch below)

• AMD – many simple cores, OpenCL and OpenGL. Has launched the APU (Accelerated Processing Unit), which combines a CPU and a GPU

• Embedded graphics cards in AMD APUs and in cell phone chips, such as the Qualcomm Snapdragon
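A minimal sketch of the CUDA programming interface mentioned above (an illustration, not course code): a kernel is launched as many lightweight threads, one per array element, spread over the GPU's many simple cores.

  #include <cuda_runtime.h>

  /* Each GPU thread handles one array element. */
  __global__ void vecAdd(const float *a, const float *b, float *c, int n)
  {
      int i = blockIdx.x * blockDim.x + threadIdx.x;
      if (i < n)
          c[i] = a[i] + b[i];
  }

  int main(void)
  {
      int n = 1 << 20;
      float *a, *b, *c;
      /* Managed (unified) memory keeps the sketch short; explicit
         cudaMemcpy between host and device buffers is the more general pattern. */
      cudaMallocManaged(&a, n * sizeof(float));
      cudaMallocManaged(&b, n * sizeof(float));
      cudaMallocManaged(&c, n * sizeof(float));
      for (int i = 0; i < n; i++) { a[i] = 1.0f; b[i] = 2.0f; }
      vecAdd<<<(n + 255) / 256, 256>>>(a, b, c, n);   /* 256 threads per block */
      cudaDeviceSynchronize();
      cudaFree(a); cudaFree(b); cudaFree(c);
      return 0;
  }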

Intel Xeon Phi

• 1 Tflop of performance
• Mini-supercomputer in a compute card
• Simplified x86 cores
• Typically easy to get code to run, more difficult to get code to run efficiently (see the sketch below)
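One reason code is typically easy to get running: the cores are x86, so ordinary threaded code compiles for the card unchanged. A plain OpenMP loop like the sketch below (an illustration, assuming native compilation for the Phi) runs as-is; running it efficiently means keeping all of the roughly 60 cores, their hardware threads and their wide vector units busy.

  #include <stdio.h>
  #include <omp.h>

  /* Standard OpenMP code: nothing Phi-specific, which is the point. */
  int main(void)
  {
      const int n = 1 << 24;
      double sum = 0.0;
  #pragma omp parallel for reduction(+:sum)
      for (int i = 0; i < n; i++)
          sum += 1.0 / (i + 1.0);
      printf("threads available: %d, sum = %f\n", omp_get_max_threads(), sum);
      return 0;
  }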

Parallel Computer Networks

• Bus – simple, cheap, poor communication performance
• Ring – simple, cheap, poor communication performance
• Mesh – simple, more expensive than a ring, better communication performance than a ring
• Hypercube – good communication performance, expensive at a large scale
• Torus 2D, 3D, 4D, 6D – good communication performance
• Fat tree – commonly used, not quite as good performance as a torus, but cheaper
• Which topology is cost effective for a Monte Carlo simulation? (See the comparison below.)
• What is the topology of Rocket?
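A rough, textbook-style comparison of the topologies above (not from the slides): the diameter – the worst-case number of hops between two of the p nodes – gives a feel for communication performance versus cost.

  Bus:        1 hop, but all p nodes share a single link
  Ring:       floor(p / 2)
  2D mesh:    2 (sqrt(p) - 1)
  Hypercube:  log2(p)
  Fat tree:   about 2 log(p), up to the root and back down

A Monte Carlo simulation that mostly gathers independent samples communicates very little, so a cheap topology such as a bus or ring is usually cost effective for it.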

Parallel Computer Networks

• http://htor.inf.ethz.ch/research/topologies/

CM-5

• http://people.csail.mit.edu/bradley/cm5/
• https://en.wikipedia.org/wiki/Connection_Machine

Figure: NAS Thinking Machines CM-5, photographer: Tom Trower, 1993 (This is probably a 256-processor machine.)

• 131 Gflops on 1024 processors
• World’s most powerful known computer in June 1993
• Fat tree topology network
• Thinking Machines grew out of Danny Hillis’s doctoral research, but is no longer producing supercomputers

The Earth Simulator

• https://en.wikipedia.org/wiki/Earth_Simulator
• http://www.jamstec.go.jp/ceist/avcrg/index.en.html

Figure: Old Earth Simulator
Figure: Earth Simulator 2

• 35.86 Tflops on 5120 processors
• World’s most powerful known computer between March 2002 and November 2004
• Vector processors
• Five times faster than the previous number one computer on the Top500 list

IBM Blue Gene L

• https://en.wikipedia.org/wiki/Blue_Gene#Blue_Gene.2FL
• https://asc.llnl.gov/computing_resources/bluegenel/photogallery.html

Figure: Adam Bertsch next to a Blue Gene L system at Lawrence Livermore National Laboratory

• 596 Tflops on 106,496 dual-core processors
• World’s most powerful known computer between November 2004 and November 2007
• 3D torus and many not-so-fast cores
• More at https://asc.llnl.gov/computing_resources/bluegenel/configuration.html

K computer

• https://en.wikipedia.org/wiki/K_computer
• http://www.aics.riken.jp/en/outreach/photo-gallery/

Figure: K computer at RIKEN, picture courtesy of RIKEN.

• Currently 10.5 Pflops on 88,128 SPARC64 VIIIfx processors with 8 cores per processor

• World’s most powerful known computer between June 2011 and June 2012

• 6D mesh/torus network and many fast and smart cores
• More at http://www.aics.riken.jp/en/k-computer/system

Titan

• https://en.wikipedia.org/wiki/Titan_%28supercomputer%29
• https://www.olcf.ornl.gov/

Figure: Titan Supercomputer at Oak Ridge National Laboratory

• 27 Pflops on 18,688 AMD Opteron 6274 16-core CPUs and 18,688 Nvidia Tesla K20X GPUs

• World’s most powerful known computer between November 2012 and June 2013

• More at https://www.olcf.ornl.gov/computing-resources/titan-cray-xk7/

Tianhe II

• https://en.wikipedia.org/wiki/Tianhe-2

• https://duckduckgo.com/?q=tianhe+II+pictures

• 33.86 Pflops on 32,000 Intel Xeon E5-2692 chips with 48,000 Xeon Phi 31S1P coprocessors

• Fat tree topology; American chips, but the interconnect is made in China

• World’s most powerful known computer
• More at www.netlib.org/utk/people/JackDongarra/PAPERS/tianhe-2-dongarra-report.pdf

Summary

• Supercomputer architectures are still evolving
• Where possible, choose the computer architecture and algorithm best suited to the problem you are solving
• In many cases, you have no choice in the computer architecture of a supercomputer, but you do have some choice in the algorithm
• Sometimes you are lucky and can choose both, but you may need to write a lot of code

New Key Concepts and References

• Parallel Computer Architecture; RR 2.1-2.3
• Rahman, R. Intel Xeon Phi Coprocessor Architecture and Tools: The Guide for Application Developers, Apress Open (2013), $0.35 on Amazon
• T. Hoefler, “Networking and Computer Architecture”, http://htor.inf.ethz.ch/teaching/CS498/
• A. Grama, A. Gupta, G. Karypis, V. Kumar, Introduction to Parallel Computing, 2nd Ed., Addison Wesley (2003)
• Wang, E., Zhang, Q., Shen, B., Zhang, G., Lu, X., Wu, Q., Wang, Y. High-Performance Computing on the Intel® Xeon Phi™, Springer (2014), http://www.springer.com/computer/communication+networks/book/978-3-319-06485-7?otherVersion=978-3-319-06486-4