libHPC: Software sustainability and reuse through metadata preservation
![Page 1: libHPC: Software sustainability and reuse through metadata preservation](https://reader031.fdocuments.in/reader031/viewer/2022020217/549844f3b47959654d8b53ac/html5/thumbnails/1.jpg)
libHPC: Software sustainability and reuse through metadata preservation
Jeremy Cohen, John Darlington, Brian Fuchs
London e-Science Centre / Department of Computing, Imperial College London
David Moxey, Chris Cantwell, Pavel Burovskiy, Spencer Sherwin
Department of Aeronautics, Imperial College London
Neil Chue Hong
Software Sustainability Institute, University of Edinburgh
First Workshop on Maintainable Software Practices in e-Science, Chicago
Tuesday 9th October 2012
![Page 2: libHPC: Software sustainability and reuse through metadata preservation](https://reader031.fdocuments.in/reader031/viewer/2022020217/549844f3b47959654d8b53ac/html5/thumbnails/2.jpg)
Introduction
• Decision making – building scientific software can be hard
• Abstraction – hide the complexity
• Efficiency – achieve the performance
• Aim for a universal technology that spans all application domains, machines, metrics
• Coordination forms – a different approach to task specification
• Components – encapsulated building blocks
[Figure: the three dimensions the technology aims to span]
Machines: Cluster, Cloud, Multi-core, GPU, FPGA
Applications: Num. Intensive, Data Intensive, Bioinformatics, CFD
Metrics: Time, Cost, Energy
![Page 3: libHPC: Software sustainability and reuse through metadata preservation](https://reader031.fdocuments.in/reader031/viewer/2022020217/549844f3b47959654d8b53ac/html5/thumbnails/3.jpg)
Information and decisions
Why are software development and re-use hard?
• A particular piece of code is the result of many development decisions
• Developers invest significant knowledge about the task to be solved
…however…
• Decisions made by developers cannot be reconstructed from the code
• Loss of original information and structure invested by developer(s)
![Page 4: libHPC: Software sustainability and reuse through metadata preservation](https://reader031.fdocuments.in/reader031/viewer/2022020217/549844f3b47959654d8b53ac/html5/thumbnails/4.jpg)
Information and decisions
Understanding the code structure, the options available, and the decisions made during development is important for:
• Portability; optimisation on different architectures
• Long-term sustainability
Need an explicit representation of decisions and alternatives:
• Decision tree used to represent this (structure)
• Metadata used to annotate decision tree (information)
• Modifications can be made to the decision tree (based on metadata analysis), which can then be mapped to modified code
![Page 5: libHPC: Software sustainability and reuse through metadata preservation](https://reader031.fdocuments.in/reader031/viewer/2022020217/549844f3b47959654d8b53ac/html5/thumbnails/5.jpg)
Information and decisions
e.g. code that uses a solver:
• Many options to select suitable solver – abstract components
• Choice dependent on problem being addressed, parameters, etc.
• Represent solver choice on a tree of component alternatives: leaf nodes are concrete implementations, higher-level nodes are abstract
[Figure: tree of solver alternatives, with nodes annotated by Matrix and Vector data types]
• Linear Solver (abstract)
  • LU (abstract): Sequential LU, Parallel LU (OpenMP), Parallel LU (MPI)
  • Jacobi (abstract): Sequential Jacobi, Parallel Jacobi (UPC)
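The tree of alternatives above could be represented in code roughly as follows. This is a minimal sketch, not the libHPC implementation: the `Component` class, the `select` helper and the `parallel` metadata key are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Component:
    """A node in the tree of component alternatives (hypothetical class)."""
    name: str
    metadata: dict = field(default_factory=dict)      # annotations on this node
    alternatives: list = field(default_factory=list)  # empty list => concrete leaf

    @property
    def is_abstract(self) -> bool:
        # Higher-level nodes are abstract; leaves are concrete implementations.
        return bool(self.alternatives)

    def leaves(self):
        """Yield every concrete implementation below this node."""
        if not self.alternatives:
            yield self
        else:
            for alt in self.alternatives:
                yield from alt.leaves()


def select(node, **required):
    """Return the first concrete leaf whose metadata matches every requirement."""
    for leaf in node.leaves():
        if all(leaf.metadata.get(key) == value for key, value in required.items()):
            return leaf
    return None


# The solver tree from the slide; the metadata values are assumed for illustration.
linear_solver = Component("Linear Solver", alternatives=[
    Component("LU", alternatives=[
        Component("Sequential LU", {"parallel": None}),
        Component("Parallel LU (OpenMP)", {"parallel": "OpenMP"}),
        Component("Parallel LU (MPI)", {"parallel": "MPI"}),
    ]),
    Component("Jacobi", alternatives=[
        Component("Sequential Jacobi", {"parallel": None}),
        Component("Parallel Jacobi (UPC)", {"parallel": "UPC"}),
    ]),
])

print(select(linear_solver, parallel="MPI").name)
```

Keeping the alternatives and their metadata in an explicit structure means a different leaf can be chosen for a different machine without touching the code that asks for a "Linear Solver".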
![Page 6: libHPC: Software sustainability and reuse through metadata preservation](https://reader031.fdocuments.in/reader031/viewer/2022020217/549844f3b47959654d8b53ac/html5/thumbnails/6.jpg)
Abstractions
→ Encapsulation
Encapsulate functions as components (reuse)
Allow alternatives
→ Functional properties
Referentially transparent → Encapsulation
Church-Rosser → Alternative behaviours
![Page 7: libHPC: Software sustainability and reuse through metadata preservation](https://reader031.fdocuments.in/reader031/viewer/2022020217/549844f3b47959654d8b53ac/html5/thumbnails/7.jpg)
Abstractions – alternative behaviours
i.e. Church-Rosser
[Figure: two reduction orders reaching the same result]
(4 + 3) – (2 + 1)
7 – (2 + 1) or (4 + 3) – 3
7 – 3
4
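The property illustrated above can be checked with a small expression reducer. This is a sketch under assumptions: expressions are encoded as nested `(op, left, right)` tuples, and `step` and `normalise` are hypothetical helper names, not part of libHPC.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub}


def is_value(expr):
    """A fully reduced expression is a plain integer."""
    return isinstance(expr, int)


def step(expr, leftmost=True):
    """Perform exactly one reduction step on an (op, left, right) tuple."""
    op, left, right = expr
    if is_value(left) and is_value(right):
        return OPS[op](left, right)
    # Decide which reducible operand to rewrite in this step.
    reduce_left = (not is_value(left)) and (leftmost or is_value(right))
    if reduce_left:
        return (op, step(left, leftmost), right)
    return (op, left, step(right, leftmost))


def normalise(expr, leftmost=True):
    """Reduce to a value, recording every intermediate expression."""
    trace = [expr]
    while not is_value(expr):
        expr = step(expr, leftmost)
        trace.append(expr)
    return trace


expr = ("-", ("+", 4, 3), ("+", 2, 1))   # (4 + 3) - (2 + 1)
left_first = normalise(expr, leftmost=True)
right_first = normalise(expr, leftmost=False)

for trace in (left_first, right_first):
    print(trace)
```

The two traces differ in their second entry but end in the same value, which is what allows a framework to pick the evaluation order (or run both operands at once) without changing the result.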
![Page 8: libHPC: Software sustainability and reuse through metadata preservation](https://reader031.fdocuments.in/reader031/viewer/2022020217/549844f3b47959654d8b53ac/html5/thumbnails/8.jpg)
Application flow and specification
We represent application elements using two techniques:
• Data processing – core code that forms application building blocks
→ Components (first-order functions)
• Control flow, orchestration
→ Higher-order functions
→ Coordination Forms
e.g. Pipe, Parallel, Map / Reduce, …
![Page 9: libHPC: Software sustainability and reuse through metadata preservation](https://reader031.fdocuments.in/reader031/viewer/2022020217/549844f3b47959654d8b53ac/html5/thumbnails/9.jpg)
Coordination Forms
• A functional/mathematical approach to job specification
• Based on work by Darlington et al.
• Applied to components – define application flow
• May be:
  • General – applicable to most applications – e.g. PIPE, PAR
  • Iterative patterns – e.g. FARM, ITERATE
  • Domain-specific higher-level forms – e.g. Monte Carlo
• Extensible – new patterns can be introduced
J. Darlington, Y. Guo, H. W. To and J. Yang. Functional skeletons for parallel coordination. In Proceedings of EURO-PAR '95 Parallel Processing, LNCS 966, pp. 55-66. Springer, Berlin/Heidelberg, 1995.
![Page 10: libHPC: Software sustainability and reuse through metadata preservation](https://reader031.fdocuments.in/reader031/viewer/2022020217/549844f3b47959654d8b53ac/html5/thumbnails/10.jpg)
Coordination Forms
• A given form may have multiple underlying implementations
• E.g. PAR may provide sequential, multi-threaded and MPI parallel implementations
• Forms aim to be as lightweight as possible
• They result in code that can be run
• They intelligently glue together component building blocks
• PIPE as an example – functions f1 to fn with initial input a:
$\mathrm{PIPE}\,[f_1, f_2, \ldots, f_n]\,a = (f_1 \circ f_2 \circ \cdots \circ f_n)\,a = f_1(f_2(\cdots(f_n(a))))$
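The idea that one form can sit on top of several interchangeable implementations might look like this in Python. It is an illustrative sketch only: the `implementation` keyword and the two backend functions are assumptions, and an MPI backend is omitted because it would need an external library.

```python
from concurrent.futures import ThreadPoolExecutor


def par_sequential(components, inputs):
    """Run each component on its own input tuple, one after another."""
    return [func(*args) for func, args in zip(components, inputs)]


def par_threaded(components, inputs):
    """Run each component on its own input tuple in a thread pool."""
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(func, *args) for func, args in zip(components, inputs)]
        # Collect results in submission order so both backends agree.
        return [future.result() for future in futures]


# Hypothetical registry of implementations behind the single PAR form.
PAR_IMPLEMENTATIONS = {"sequential": par_sequential, "threads": par_threaded}


def PAR(components, inputs, implementation="sequential"):
    """Apply components[i] to inputs[i] independently, using the chosen backend."""
    return PAR_IMPLEMENTATIONS[implementation](components, inputs)


def add(x, y):
    return x + y


print(PAR([add, add], [(1, 2), (3, 4)]))
print(PAR([add, add], [(1, 2), (3, 4)], implementation="threads"))
```

Because the components are independent, the caller's job specification stays the same whichever backend is selected; only the execution strategy changes.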
![Page 11: libHPC: Software sustainability and reuse through metadata preservation](https://reader031.fdocuments.in/reader031/viewer/2022020217/549844f3b47959654d8b53ac/html5/thumbnails/11.jpg)
Coordination Forms – Implementation
• Prototype implementation in Python
• Class wrappers for component and parameter metadata – concrete implementation code selectable
PIPE ([component list], initial input)
PIPE – Compose a series of components in the order specified
PAR ([component list], [(input1), (input2), …, (inputn)])
PAR – Run a series of components independently (perhaps in parallel)
Additional parameters can be added in the component list
E.g. for components add, multiply, divide:
2 * ( (245+34) / (6+8) )
PIPE([(multiply, 2), divide, PAR([add,add],[(245,34),(6,8)])])
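The arithmetic example can be reproduced with a small stand-in for the two forms. This is a sketch, not the prototype's code: it assumes that PIPE applies the last listed component first (matching f1(f2(…fn(a)))), that a `(function, extra)` entry prepends the extra parameters, and that a tuple result is unpacked into the next component's arguments. It also passes the PAR result as the separate initial input, following the `PIPE ([component list], initial input)` signature.

```python
def PAR(components, inputs):
    """Apply components[i] to inputs[i]; sequential stand-in for a parallel form."""
    return tuple(func(*args) for func, args in zip(components, inputs))


def PIPE(components, initial_input):
    """Compose components so that the last one in the list runs first."""
    value = initial_input
    for entry in reversed(components):
        # An entry is either a bare function or (function, extra parameters...).
        if isinstance(entry, tuple):
            func, extra = entry[0], entry[1:]
        else:
            func, extra = entry, ()
        # Assumed convention: a tuple result is spread over the next call.
        args = value if isinstance(value, tuple) else (value,)
        value = func(*extra, *args)
    return value


def add(x, y):
    return x + y


def multiply(x, y):
    return x * y


def divide(x, y):
    return x / y


sums = PAR([add, add], [(245, 34), (6, 8)])   # the two independent additions
result = PIPE([(multiply, 2), divide], sums)  # 2 * (first sum / second sum)
print(sums, result)
```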
![Page 12: libHPC: Software sustainability and reuse through metadata preservation](https://reader031.fdocuments.in/reader031/viewer/2022020217/549844f3b47959654d8b53ac/html5/thumbnails/12.jpg)
Bioinformatics: Genome Read Pre-Processing/Mapping
Input files:
• Reference Genome – FASTA file
• Reads from sequencing machine (Short Read Set, paired) – single FASTQ file
[Figure: workflow. FASTQ split → SR_1, SR_2; bwa index → FASTA file + index file; bwa aln (on SR_1 and SR_2); bwa sampe (generate alignment, paired ended) → SAM file; samtools import → BAM file; samtools sort → sorted BAM file; samtools index → OUTPUT]
((sr1,sr2), u) = PAR([fastq_split, bwa_index], [(short_read_file, None, None),(ref_genome_file,)])
(v, w) = PAR([bwa_aln, bwa_aln], [(ref_genome_file, sr1, None), (ref_genome_file, sr2, None)])
result = PIPE([samtools_index, samtools_sort, (samtools_import, ref_genome_file), bwa_sampe],
[ref_genome_file, [v,w], [sr1, sr2], None])
![Page 13: libHPC: Software sustainability and reuse through metadata preservation](https://reader031.fdocuments.in/reader031/viewer/2022020217/549844f3b47959654d8b53ac/html5/thumbnails/13.jpg)
LibHPC Project
• LibHPC
• Two-year project under the EPSRC HPC Software Programme
• Imperial College London (Computing (LeSC), Aeronautics, ICT)
• SSI, Edinburgh
• Implementing/demonstrating framework with main supporting application (Nektar++) + other exemplars
![Page 14: libHPC: Software sustainability and reuse through metadata preservation](https://reader031.fdocuments.in/reader031/viewer/2022020217/549844f3b47959654d8b53ac/html5/thumbnails/14.jpg)
Example
Optimising FEM Codes
[Figure: framework architecture. Labels: High-level Application Description / Job Specification (Co-ordination Forms, DSLs, etc.); Job Specification Analysis/Processing; Hardware Resources; Software Component Library & Metadata; Resource Discovery & Metadata; Domain-specific Application Support Libraries]
![Page 15: libHPC: Software sustainability and reuse through metadata preservation](https://reader031.fdocuments.in/reader031/viewer/2022020217/549844f3b47959654d8b53ac/html5/thumbnails/15.jpg)
Nektar++ - Hybrid Assembly
• Nektar++ operates on matrices based on input mesh
• Each element of input mesh is mapped to an (elemental) matrix
• There are two matrix assembly strategies:
• Local
• Global
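As a generic finite element illustration of the two strategies (not Nektar++ code), the sketch below applies an operator to a vector either element by element or through one assembled global matrix. The two-element 1D mesh, the elemental matrix values and all function names are assumptions chosen to keep the example small.

```python
# Two linear 1D elements sharing the middle node: global degrees of freedom 0-1-2.
local_to_global = [[0, 1], [1, 2]]
# One elemental matrix per mesh element (illustrative values, same for both).
elemental = [[[1.0, -1.0], [-1.0, 1.0]] for _ in local_to_global]


def apply_local(x):
    """Local strategy: multiply by each elemental matrix, then scatter-add."""
    y = [0.0] * len(x)
    for K_e, dofs in zip(elemental, local_to_global):
        x_e = [x[g] for g in dofs]  # gather the element's entries
        y_e = [sum(K_e[i][j] * x_e[j] for j in range(len(dofs)))
               for i in range(len(dofs))]
        for i, g in enumerate(dofs):  # scatter-add into the global vector
            y[g] += y_e[i]
    return y


def assemble_global(n):
    """Global strategy, step 1: sum elemental matrices into one n x n matrix."""
    K = [[0.0] * n for _ in range(n)]
    for K_e, dofs in zip(elemental, local_to_global):
        for i, gi in enumerate(dofs):
            for j, gj in enumerate(dofs):
                K[gi][gj] += K_e[i][j]
    return K


def apply_global(K, x):
    """Global strategy, step 2: a single matrix-vector product."""
    return [sum(row[j] * x[j] for j in range(len(x))) for row in K]


x = [1.0, 2.0, 4.0]
K = assemble_global(len(x))
print(apply_local(x))
print(apply_global(K, x))
```

Both routes give the same vector; they differ in memory use and in how the arithmetic is grouped, which is the kind of decision the framework aims to record and revisit per machine.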
![Page 16: libHPC: Software sustainability and reuse through metadata preservation](https://reader031.fdocuments.in/reader031/viewer/2022020217/549844f3b47959654d8b53ac/html5/thumbnails/16.jpg)
Nektar++ - Hybrid Assembly
[Figure: Local Assembly and Global Assembly]
![Page 17: libHPC: Software sustainability and reuse through metadata preservation](https://reader031.fdocuments.in/reader031/viewer/2022020217/549844f3b47959654d8b53ac/html5/thumbnails/17.jpg)
Nektar++ - Hybrid Assembly
[Figure: Hybrid Assembly]
![Page 18: libHPC: Software sustainability and reuse through metadata preservation](https://reader031.fdocuments.in/reader031/viewer/2022020217/549844f3b47959654d8b53ac/html5/thumbnails/18.jpg)
Thank You