

High Performance Computing on an IBM Cell Processor

Team May08-24: Kyle Byerly, Matt Rohlf, Bryan Venteicher, Shannon McCormick

Faculty Adviser: Zhao Zhang

Team Website: http://seniord.ece.iastate.edu/may0824

Introduction

Problem Statement

Biological researchers face ever-increasing computation times because the volume of data to be processed is growing exponentially. Commodity computing hardware currently cannot provide adequate performance.

User Interface

Biologists and bioinformaticists will use the ported application from the command line, the same way they use the original.

Assumptions
• User has access to a PlayStation 3 running Linux
• User knows how to use original application

Operating Environment
• Dry
• Temperature controlled (less than 70° F)

Deliverables
• Application ported to Cell/B.E.
• Benchmarks to document performance improvement

Project and Design Requirements

Design Objective

To parallelize and port a BioPerf application to the PlayStation 3 so that it takes full advantage of the performance of the Cell/B.E.

Functional Requirements

Nonfunctional Requirements
• Algorithm must be parallelizable
• Data must be able to be stored in the limited memory of the PlayStation 3
• Must run faster than the original

Engineering Specification

Input/Output:
• Text of DNA sequence / Parsimonious tree

Hardware:
• PlayStation 3, Cell/B.E.

Software:
• Fedora Linux, DNAPenny 3.6

User Interface:
• Command line

Design Method & Results

Design Method

Two possible ways to parallelize DNAPenny will be explored:
• Parallelize the entire algorithm
• Parallelize the performance-critical section of the algorithm

Test Plan

Created a script to ensure the ported application produces the same output as the original application across a wide variety of input files.
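The comparison described in the test plan can be sketched roughly as follows. This is only an illustration, not the team's script: the helper names `run_and_capture` and `find_mismatches` are hypothetical, and it assumes each program reads its input on standard input and writes its result to standard output, which may not match how DNAPenny is actually invoked.

```python
import subprocess


def run_and_capture(command, input_path):
    """Run `command` with the input file on stdin and return its raw stdout.

    Assumption: the program under test reads stdin and writes stdout.
    """
    with open(input_path, "rb") as handle:
        result = subprocess.run(
            command, stdin=handle, capture_output=True, check=True
        )
    return result.stdout


def find_mismatches(original_cmd, ported_cmd, input_paths):
    """Return the input files for which the two programs produce different output."""
    return [
        path
        for path in input_paths
        if run_and_capture(original_cmd, path) != run_and_capture(ported_cmd, path)
    ]
```

An empty list from `find_mismatches` means the ported program reproduced the original output byte for byte on every input file that was tried.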

Resources & Work Breakdown

Work Breakdown Structure

Financial Resources

Item                             w/ labor   w/o labor
PlayStation 3 (donated)          $0         $0
Estimated Labor (@ $10.00/hr)    $5645      $0
Totals                           $5645      $0

Other resources
• Open source software packages (gcc, gdb, gprof, vim, gnuplot, ssh, bash, lxr, svn, viewvc, diff, cscope)
• BioPerf suite (CLUSTALW, DNAPenny, and many others)
• Sample input data from NCBI GenBank

Closing Summary

The team has successfully ported DNAPenny to the Cell/B.E. The ported version of DNAPenny produces the same output for the same input faster than the original application running on a typical desktop PC. With the ported application, bioinformaticists will have a cheap and efficient way to analyze DNA sequences.

• Ported application shall run on the Cell/B.E.
• Ported application shall return the same results as the original application.
• The running time of the ported application shall be recorded for comparison to the original application.

The team believes that the Cell Broadband Engine (Cell/B.E.) found in the PlayStation 3 (PS3) will offer superior performance to commodity computing hardware at an affordable price. The team will port an application from the BioPerf suite to the Cell/B.E. running on Linux. BioPerf is a benchmark suite of representative bioinformatics applications for use with high-performance computing.

Proposed Concept Sketch / System Description

System Block Diagram

The system block diagram gives an overview of the project. The same input data is fed to two versions of the application, the original code and the ported version. Both produce identical output data, and the ported version produces it faster.

[Figure: work breakdown pie chart by team member (Bryan, Kyle, Shannon, Matt), with shares of 29%, 23%, 23%, and 25%. Total Hours = 564.5]

Benchmarking Methods

Created a script to time the execution of significant revisions of the ported application and of the original application. An additional script calculates the average run time and automatically generates graphs of the results.
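A timing-and-averaging step of this kind might look like the sketch below. It is an assumption-laden illustration rather than the team's scripts: `time_command`, `summarize`, and `speedup` are hypothetical names, wall-clock time is used as the measure, and graph generation (the resources list mentions gnuplot) is left out.

```python
import statistics
import subprocess
import time


def time_command(command, runs=5):
    """Return the wall-clock duration in seconds of each of `runs` executions."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        # Discard program output so terminal I/O does not distort the timing.
        subprocess.run(command, check=True, stdout=subprocess.DEVNULL)
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Reduce a list of run times to its mean, minimum, and maximum."""
    return {
        "mean": statistics.mean(durations),
        "min": min(durations),
        "max": max(durations),
    }


def speedup(original_mean, ported_mean):
    """Ratio of original to ported run time; values above 1 mean the port is faster."""
    return original_mean / ported_mean
```

Averaging over several runs smooths out run-to-run noise, and reporting the minimum and maximum alongside the mean shows how much that noise matters for a given input.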

A few examples of the generated graphs are shown below.

Literature Survey
• V. Sachdeva, M. Kistler, E. Speight and T.-H. K. Tzeng, Exploring the Viability of the Cell Broadband Engine for Bioinformatics Applications, March 2007.
• R. Desaraju, A Parallel Implementation Of A Parsimony-Based Method For Phylogenetic Inference, May 2005.

Risks
• Proposed implementation may not be faster
• Other teams may complete the same work before the team does

Prototype

DNAPenny was ported to the Cell/B.E., taking advantage of the parallel nature of the hardware. Another parallelized version of DNAPenny was created that runs on a standard desktop PC.
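The poster does not say which part of DNAPenny was parallelized, so the following is only a hedged illustration of one decomposition that parsimony methods allow, not the team's implementation. Under parsimony each alignment site contributes to the tree length independently, so sites can be scored by separate workers. The sketch uses a Fitch-style step count on a tree written as nested tuples; `fitch_site` and `parsimony_length` are hypothetical names, and Python threads stand in for the Cell/B.E. processing elements purely to show how the work divides (they do not deliver a real speedup in CPython).

```python
from concurrent.futures import ThreadPoolExecutor


def fitch_site(tree, bases):
    """Return (state_set, steps) for one site.

    `tree` is either a leaf name (str) or a (left, right) tuple;
    `bases` maps each leaf name to its base at this site.
    """
    if isinstance(tree, str):
        return {bases[tree]}, 0
    left_set, left_steps = fitch_site(tree[0], bases)
    right_set, right_steps = fitch_site(tree[1], bases)
    common = left_set & right_set
    if common:
        # The children agree on at least one state: no extra change needed.
        return common, left_steps + right_steps
    # The children disagree: take the union and count one change.
    return left_set | right_set, left_steps + right_steps + 1


def parsimony_length(tree, sequences, workers=4):
    """Sum the per-site step counts, scoring the sites in parallel."""
    n_sites = len(next(iter(sequences.values())))

    def score(i):
        column = {name: seq[i] for name, seq in sequences.items()}
        return fitch_site(tree, column)[1]

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(score, range(n_sites)))
```

For example, with the tree `(("A", "B"), ("C", "D"))` and sequences `AAG`, `AAG`, `ACG`, `ATG`, only the middle site varies and it needs two changes, so `parsimony_length` returns 2.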

Parts / Vendor List
• PlayStation 3 was provided by the department
• All software used is open source

Test Procedure / Results
• Execute the script and use the diff utility to verify that the ported application's output matches the original output.
• The output did not change.