Heterogeneous Parallelization for RNA Structure Comparison

Eric Snow, Eric Aubanel, and Patricia Evans

University of New Brunswick Faculty of Computer Science

Finding Common RNA Pseudoknot Structures

Purpose of Sequential Algorithm: Given two RNA chains and their chemical bonds, find the maximum common ordered substructure (MCOS) formed by the bonds, such that the order of the bonds is maintained.

Example of finding the MCOS of a pair of RNA chains (matched arcs share the same colour, unmatched are black).

While the general form of the maximum common ordered substructure problem is known to be NP-complete, by restricting the allowable bonds to only those consistent with actual RNA (including pseudoknots and pseudoknot-like structures), a working algorithm was created that had worst-case O(n^10) time and O(n^8) space requirements [1].

Implementation

•Recursive dynamic programming using memoization.
•One 4- and one 8-dimensional dynamic programming table.
•Selective allocation/calculation performed, so that only values consistent with the input structures are calculated.
•On average, only 10-14% of the worst-case space is used.
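The recursive-memoized style named above can be illustrated with a deliberately simplified stand-in: longest common subsequence of two chains, computed top-down with memoization. The actual MCOS recurrence over 4- and 8-dimensional tables is far more involved; this sketch only shows the mechanism (recursive calls plus a cache of partial results).

```python
from functools import lru_cache

def lcs_length(a, b):
    """Toy stand-in for the MCOS recurrence: longest common (ordered)
    subsequence, computed by recursive dynamic programming with
    memoization, the same mechanism as the sequential algorithm."""
    @lru_cache(maxsize=None)
    def rec(i, j):
        if i == len(a) or j == len(b):
            return 0
        if a[i] == b[j]:
            return 1 + rec(i + 1, j + 1)
        return max(rec(i + 1, j), rec(i, j + 1))
    return rec(0, 0)
```

Memoization here plays the same role as the DP tables in the real algorithm: each partial result is computed once and reused by every caller that needs it.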

Abstract

Parallel and grid computing have had a significant impact on the feasibility of running many time- and space-intensive applications. However, the additional computing power supplied is not without additional costs; development of algorithms and applications that run efficiently in parallel is generally more complex than traditional sequential development. In particular, consideration of the underlying heterogeneity of the computing platform can be very difficult when using multicomputers or grid computing, but can lead to significantly more efficient parallel applications.

In this work, we analyze a sequential dynamic programming algorithm for the comparison of RNA structures, including those with pseudoknots, and develop a corresponding parallel design. Particular attention is paid to the fact that the platform on which the algorithm will be tested is heterogeneous.

Sequential Algorithm Analysis

Dynamic programming dependency analysis showed that the calculation of any partial result in the dynamic programming tables has the following characteristics:

•Polyadic - Each calculation must call on at least 2 other partial results.
•Non-serial - Each calculation can require calls to partial results 1 or more levels away in the recursive graph of the problem.
•Irregular Dependency - The number of calls required in each calculation is not static; calculations in the 4-dimensional table can require between 2 and 4 calls, while calculations in the 8-dimensional table require between 4 and 13 calls.

Example of a recursive graph of a point in a non-serial, polyadic dynamic programming problem with irregular dependencies (only the final two levels’ dependencies are completely shown).

The unpredictability of the recursion, stemming from the three characteristics above, makes this problem among the most difficult types to parallelize, since communication patterns cannot be predicted in advance.
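A toy evaluator makes the three properties concrete. The dependency graph below is hypothetical (invented for illustration, not taken from the MCOS recurrence): each cell depends on at least two earlier cells (polyadic), some dependencies reach back more than one level (non-serial), and the fan-in varies from cell to cell (irregular).

```python
def eval_cell(i, deps, table):
    """Recursively evaluate cell i of a DP problem whose dependency
    structure is polyadic, non-serial, and irregular. `table` caches
    partial results (memoization); `deps` maps a cell to the earlier
    cells its calculation must call on."""
    if table[i] is None:
        if not deps[i]:
            table[i] = 0                    # base case: no dependencies
        else:
            table[i] = 1 + max(eval_cell(j, deps, table) for j in deps[i])
    return table[i]

# Hypothetical dependency graph: cell 4 calls on 2 partial results while
# cell 5 calls on 3 (irregular fan-in), and cell 5 reaches back past its
# immediate predecessors to cell 0 (non-serial).
deps = {0: [], 1: [], 2: [0, 1], 3: [1, 2], 4: [2, 3], 5: [0, 3, 4]}
```

Because which cells a calculation touches is only discovered at evaluation time, a parallel version cannot precompute its communication schedule; this is exactly the difficulty the design below addresses.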

Data Structures

The major data structures are the 4- and 8-dimensional tables. Due to the selective allocation used, the “shape” of the tables can be highly irregular, with only two dimensions’ sizes predictable ahead of time (in the case of the 8-dimensional table, this leaves 6 dimensions whose sizes must be calculated each time a new hyperplane is allocated). The image below illustrates the irregular shape that can arise in the 2nd and 4th dimensions of the 4-dimensional table.
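One common way to realize such selective allocation, sketched here as an assumption rather than the poster's actual implementation, is a sparse table keyed by index tuples: only cells consistent with the input are ever materialized, so the table's "shape" is whatever the input induces.

```python
class SparseTable:
    """Sketch of selective allocation for a high-dimensional DP table:
    entries exist only for index tuples that are consistent with the
    input, so each hyperplane can have a different extent."""
    def __init__(self):
        self.cells = {}                 # index tuple -> partial result

    def __setitem__(self, idx, value):
        self.cells[idx] = value

    def __getitem__(self, idx):
        return self.cells[idx]

    def is_allocated(self, idx):
        return idx in self.cells

    def size(self):
        """Number of materialized cells (vs. the full n^4 or n^8)."""
        return len(self.cells)
```

A dict trades constant-factor speed for the ability to leave inconsistent regions of the table entirely unallocated, which is what keeps the average space usage far below the worst case.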

Parallel Algorithm Design

Overall Design

•Modified master-slave paradigm.
•Master responsible for load balancing, slaves responsible for task creation and evaluation.
•New tasks stored in each slave’s task list prior to evaluation.

Data Structures

•4-d and 8-d tables distributed among slaves.
•Due to their unpredictable “shape”, a best-guess initial division of the tables takes place, which is then improved during load balancing.
•Division takes place along the first dimension, since its limit is always n (the length of the first RNA chain), and a single array can be used to hold data-structure mapping information on the master.

Example of initial data structure distribution for 4 processors with n = 11.
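The initial division can be sketched as a simple block partition of the first dimension, reproducing the figure's mapping (n = 11 over 3 slaves, with the extra rows assigned to the later slaves; that tie-breaking choice is an assumption here, and the division is only a best guess to be refined by load balancing).

```python
def initial_distribution(n, num_slaves):
    """Best-guess block division of the tables' first dimension (size n)
    among the slaves. Returns the master's mapping array: one
    (slave id, local index) pair per global index."""
    base, extra = divmod(n, num_slaves)
    mapping = []
    for s in range(1, num_slaves + 1):
        # Later slaves absorb the remainder rows (assumed tie-break).
        block = base + (1 if s > num_slaves - extra else 0)
        mapping.extend((s, local) for local in range(block))
    return mapping
```

Because the mapping is a flat array indexed by global coordinate, the master can translate any global first-dimension index to an owning slave and local index in O(1), and can rewrite entries when load balancing migrates table slices.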

Load Balancing

•Dynamic load balancing is used, based on excess data structure size or computational demand.
•Master responsible for load-balancing decisions based on periodic updates from slaves.
•Support for heterogeneous platforms is encapsulated in the load-balancing procedures; the master node must have information about the underlying platform, including network characteristics and each processor’s available memory.
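One plausible shape for the master's heterogeneity-aware decision, offered as a sketch (the function and its inputs are hypothetical, not the poster's implementation): normalize each slave's reported load by its known capacity, then move work from the most to the least relatively loaded slave.

```python
def pick_transfer(loads, capacities):
    """Hypothetical sketch of the master's load-balancing decision.
    `loads` holds each slave's periodically reported demand;
    `capacities` holds per-slave speed/memory weights from the
    platform description. Work moves from the slave with the highest
    load-per-capacity to the one with the lowest."""
    rel = {s: loads[s] / capacities[s] for s in loads}
    src = max(rel, key=rel.get)
    dst = min(rel, key=rel.get)
    return (src, dst) if rel[src] > rel[dst] else None
```

Dividing by capacity is what encapsulates heterogeneity: a fast or memory-rich node is allowed to carry more absolute work before it is considered overloaded.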

Communication

•Master-slave messages are used only to exchange load-balancing information (generally very small in size), and must be passed synchronously so that slaves never act on out-of-date load-balancing information.
•Slave-slave communication takes place on two fronts:

–Task evaluation requests and results: short, asynchronous messages.
–Load balancing communications: large, synchronous messages.
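The design's two-tier communication rule can be captured as a small classifier (message-type names invented here for illustration): balancing traffic blocks until delivered, task traffic does not.

```python
# Hypothetical message-type names; the design only fixes the categories.
SYNCHRONOUS = {"perf_info", "load_balancing_update", "table_task_data"}
ASYNCHRONOUS = {"task_request", "task_result"}

def is_synchronous(msg_type):
    """Load-balancing traffic is synchronous so that no process acts on
    stale balancing state; task evaluation traffic is short and
    asynchronous so slaves are never blocked waiting on each other."""
    if msg_type in SYNCHRONOUS:
        return True
    if msg_type in ASYNCHRONOUS:
        return False
    raise ValueError(f"unknown message type: {msg_type}")
```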

Example of communication patterns for 3 processors, where green arrows indicate asynchronous communication and red arrows indicate synchronous communication.

[Figure: communication among Master, Slave 1, Slave 2, and Slave 3. Labelled message types: Performance/Data Structure Information, Load Balancing Updates, Task Evaluation Requests/Results, and Table/Task Data (Load Balancing).]

Data Structure-to-Processor Mapping Array:

Global Index: 0 1 2 3 4 5 6 7 8 9 10
Processor:    1 1 1 2 2 2 2 3 3 3 3
Local Index:  0 1 2 0 1 2 3 0 1 2 3

[Figure: RNA Chain 1 and RNA Chain 2 (n = 3, m = 2) with the resulting Maximum Common Ordered Substructure.]

Conclusions and Remaining Work

In this work we have described the design for a parallel version of the MCOS algorithm for RNA chains with pseudoknots and pseudoknot-like structures.

Remaining work on this project includes the completion of the parallel implementation, and testing to ensure both correctness and efficiency. In addition, testing on the heterogeneous platforms available through ACEnet (www.ace-net.ca) will take place to evaluate the effectiveness of the heterogeneous platform-specific features.

References

[1] P. Evans. Finding Common RNA Pseudoknot Structures in Polynomial Time. In Combinatorial Pattern Matching, Lecture Notes in Computer Science, 2006, pp. 223-232.