Open MPI on the XT3
Brian Barrett, Ron Brightwell, Jeff Squyres, and Andrew Lumsdaine
May 11, 2006
Overview
• What is Open MPI?
• Why is it running on the XT3?
• How well does it run?
• Lessons learned / Porting Issues
• Future work
What is Open MPI
• Complete MPI-2 implementation
• Designed for large-scale clusters
• Highly optimized datatype engine
• Optimized collective routines
• Run-time loadable component architecture
  - Well defined abstraction points
  - Simplifies customizing to a platform
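The idea of a run-time component architecture with well defined abstraction points can be illustrated with a small sketch. This is not Open MPI's actual component API: the names `Component`, `Framework`, `build_transport_framework`, the priorities, and the "tcp"/"portals" entries are all hypothetical, chosen only to show how a platform-specific implementation can be selected at run time behind a fixed interface.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Component:
    """One pluggable implementation of a framework's interface (hypothetical)."""
    name: str
    priority: int                      # higher wins when several are usable
    is_available: Callable[[], bool]   # can this component run on this platform?
    send: Callable[[bytes], str]       # the framework's single abstraction point


class Framework:
    """A named abstraction point holding interchangeable components."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._components: Dict[str, Component] = {}

    def register(self, component: Component) -> None:
        self._components[component.name] = component

    def select(self) -> Component:
        """Pick the highest-priority component that reports itself usable."""
        usable = [c for c in self._components.values() if c.is_available()]
        if not usable:
            raise RuntimeError(f"no usable component in framework {self.name!r}")
        return max(usable, key=lambda c: c.priority)


def build_transport_framework(on_xt3: bool) -> Framework:
    """Register a generic and a platform-specific component (illustrative only)."""
    fw = Framework("transport")
    fw.register(Component("tcp", 10, lambda: True, lambda m: f"tcp:{len(m)}"))
    fw.register(Component("portals", 50, lambda: on_xt3, lambda m: f"portals:{len(m)}"))
    return fw
```

Calling `build_transport_framework(on_xt3=True).select()` returns the "portals" component, while the same code with `on_xt3=False` falls back to "tcp". Customizing to a platform then means adding one component rather than changing the callers.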
Cluster Features
• Multiple NIC support with message striping
• Message error detection and recovery
• Process fault tolerance
• MPI_THREAD_MULTIPLE support
• Thread-based asynchronous progress
• Rich run-time support for clusters
Open MPI Collaborators
Why port to the XT3?
• Application developers only wanted to complain about one MPI implementation
• Interesting “big iron” machine
  - Sane network
  - Developer experience on similar architecture
• Test framework abstractions
How well does it run?
• (Almost) Complete MPI-level support:
  - MPI-2 Dynamics don’t work…
• Performance:
  - Very good for first attempt
  - Still needs to go faster
    • We’ve identified the issues
    • Deciding how best to fix them
• Some performance numbers…
Performance (Latency)
Implementation    1 Byte Latency
Native Portals    5.30 us
MPICH-2           7.14 us
Open MPI          8.50 us
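To put the three latencies in proportion, the short calculation below derives each MPI implementation's overhead relative to native Portals. It uses only the figures from the table; the function name `overhead_vs_native` and the dictionary layout are illustrative choices, not part of the original slides.

```python
# 1-byte latencies in microseconds, copied from the table above.
LATENCY_US = {"Native Portals": 5.30, "MPICH-2": 7.14, "Open MPI": 8.50}


def overhead_vs_native(latencies, baseline="Native Portals"):
    """Return {name: (extra microseconds, extra percent)} relative to the baseline."""
    base = latencies[baseline]
    return {
        name: (round(value - base, 2), round(100.0 * (value - base) / base, 1))
        for name, value in latencies.items()
        if name != baseline
    }


if __name__ == "__main__":
    for name, (extra_us, extra_pct) in overhead_vs_native(LATENCY_US).items():
        print(f"{name}: +{extra_us:.2f} us ({extra_pct:.1f}% over native Portals)")
```

On these numbers MPICH-2 adds 1.84 us (about 35%) over native Portals and Open MPI adds 3.20 us (about 60%), so the gap between the two MPI implementations is 1.36 us.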
Performance
Lessons Learned
• Cross-compilation challenging
• Cluster run-time overkill
  - Component framework allows run-time to “get out of the way”
  - Surprising amount of work
• Performance expectations
  - Designed around InfiniBand and Myrinet/GM
  - Hardware matching growing pains
Portals Point-to-Point
• Choice of abstraction layer for implementation
• BTL design chosen:
  - Performance impact thought to be low
  - Quicker time to completion
  - One-sided support uses BTL layer
Future Work
• Point-to-point performance
  - Early PML work provides performance comparable to MPICH-2
  - Myrinet/MX presents identical challenge
  - Hope to find middle ground
• MPI topology functions
• Collectives performance
  - Need tuning parameters for platform
  - Topology awareness
More Information
• BTL implementation detailed in paper
• Publicly available in Open MPI SVN
  - http://www.open-mpi.org/