TERASCALE LINUX CLUSTERS:
SUPERCOMPUTING SOLUTIONS FOR LIFE SCIENCES
MARCH 27, 2003
Dr. Erwin Frise, Berkeley National Laboratory
Dr. Padmanabhan Iyer, Linux Networx
“It took millions of times more computing power to map the human genome than it did to land a man on the moon, and that’s only a fraction of what’s needed right now. The biggest challenge in biology is going to be computing.”
Dr. J. Craig Venter, Chairman of the Board,
The Institute for Genomic Research, Dec. 2001
The Need For Computing
Clusters: What and Why?
What? A collection of computers networked together to perform a particular application in parallel.
Why?
- Clusters can be built to offer performance matching the needs of virtually any application (scalability).
- Clusters can be built out of commodity components (cost effectiveness).
Role of Your Cluster Vendor
- Expertise with cluster management: lets you concentrate on the complexities of your models, not your computer system.
- Expertise with turnkey solutions: open-source distribution, proven compatibility with life sciences applications.
- One-stop support.
GFLOPS to TFLOPS … in 8 Years!
[Timeline graphic, recovered:]
- 1994: Wiglaf cluster (NASA Goddard Space Flight Center), a Beowulf-class cluster built from COTS components; 1 GigaFLOPS (1 billion floating-point operations per second).
- 1996–1997: Hrothgar cluster; 10 GigaFLOPS by joining 2 clusters.
- 1998–1999: Alta Cluster, among the 1st commercial Linux clusters (Linux NetworX, Brookhaven NL, Los Alamos NL); 250 GigaFLOPS.
- 2000–2001: Lawrence Livermore NL PCR Cluster; 1.5 TeraFLOPS (trillions of floating-point operations per second).
- 2002: LLNL MCR Cluster (LNXI Evolocity™ clusters); 11 TeraFLOPS.
Numbers overwhelm size: the trend is to clustered supercomputers.
Clusters are ideally suited for many life science applications:
- High-throughput screening runs involving multiple, repetitive sequence comparisons on vast amounts of data.
- Data or queries can be partitioned and dispatched to the n nodes of a cluster for a direct n-fold speedup and gain in throughput.
- By employing standard parallelization tools such as MPI, even the most compute-intensive fine-grain applications (e.g., molecular chemistry) can be deployed on a cluster equipped with a high-performance interconnect fabric.
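The n-way partitioning described above can be sketched in a few lines. This is a minimal illustration, not code from the talk; the function and variable names are invented for the example.

```python
# Hypothetical sketch: split a batch of query sequences into n chunks,
# one per cluster node, for an n-fold gain in throughput.

def make_chunks(items, n):
    """Partition `items` into n nearly equal contiguous chunks."""
    size, rem = divmod(len(items), n)
    chunks, start = [], 0
    for i in range(n):
        # The first `rem` chunks absorb one extra item each.
        end = start + size + (1 if i < rem else 0)
        chunks.append(items[start:end])
        start = end
    return chunks

# Toy batch of 10 query sequences dispatched to 4 nodes.
queries = [f"seq{i}" for i in range(10)]
for node, chunk in enumerate(make_chunks(queries, 4), start=1):
    print(f"node {node}: {chunk}")
```

Each chunk would then be submitted as an independent job; because the comparisons are independent, no communication between nodes is needed.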
Clusters in the Life Sciences
Tularik, Incorporated San Francisco, California
Case Study
Because the mouse genome is very similar to the human genome, the effect of a specific drug agent can be tested in the mouse; pre- and post-treatment genome comparisons then help gauge the effectiveness of therapeutic agents in regulating specific expression.
- A compute-intensive task, but one that is easily parallelizable and highly suited for the cluster computing model
Situation: the existing infrastructure would have taken 38 years to perform the 22 million genomic sequence comparisons.
Solution: a Linux NetworX cluster of 150 Pentium III nodes with ICEBox and ClusterWorX management.
Results:
- Study completed in 34 days, a roughly 450x acceleration
- Enabled new business opportunities
- Gave a new competitive advantage
Tularik, Incorporated San Francisco, California
Case Study—Business Case
“Cluster management tools from Linux NetworX are setting the standard …. Without cluster management tools, we would be spending five times as long managing the cluster…. ICE tools allow us to concentrate on finding new genes that cause disease and not worry about cluster management”
- Gene Cutler Tularik, Inc.
Tularik, Incorporated San Francisco, California
Case Study—Business Success
Case Study—BDGP
Berkeley Drosophila Genome Project (BDGP)
Building and using a cluster in a major genome center

BDGP Cluster
- Annotation of the Drosophila genome
- Research: transposable elements, aligning several Drosophila species, RNA folding and predictions, the hunt for microRNAs
- Public BLAST server
BDGP Cluster: Annotation
Chris Mungall, ShengQiang Shu, Suzanna Lewis and many curators
[Diagram: sequence finishing feeds the Pipeline and the GadFly database; the pipeline submits jobs through PBS to the compute farm; BOP processes the results, which are curated with Apollo and published to an FTP directory.]
BDGP Cluster: Annotation
[Diagram: pipeline analysis programs feeding BOP and GadFly: RepeatMasker, BLASTN, BLASTX, TBLASTX, Sim4 (via Sim4wrap), Genscan, Genie, tRNAscan-SE.]
Pipeline inputs (category: source):

Drosophila melanogaster specific sequences:
- Release 2 transcripts: Celera
- ESTs: BDGP plus dbEST
- Complete cDNAs: BDGP
- Annotated reference genome sequences (ARGS): GenBank
- Insertion flanking: BDGP
- Non-coding RNA: FlyBase
- Transposable element consensus sequences: BDGP

Peptide sequences from other species:
- Primates, rodents, other vertebrates, other invertebrates: SWISS-PROT/SPTR
- C. elegans, S. cerevisiae, plants: SPTR

Other species EST sequences:
- M. musculus: UniGene
- Insects: dbEST
BDGP Cluster: Transposons
Josh Kaminker
- Finding repetitive regions
- Divide the genome into overlapping pieces
- BLAST everything vs. everything (~1 million blastn jobs)
- Check for “transposon” features
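Dividing the genome into overlapping pieces, as described above, keeps a repeat that straddles a cut point from being missed. A minimal sketch (illustrative only, not the BDGP code; window sizes are arbitrary toy values):

```python
# Cut a sequence into fixed-size windows that overlap by a given number
# of bases; each window can then be dispatched to a node as one blastn job.

def overlapping_windows(seq, size, overlap):
    """Yield (start, piece) windows of `size` bp overlapping by `overlap` bp."""
    step = size - overlap
    for start in range(0, max(len(seq) - overlap, 1), step):
        yield start, seq[start:start + size]

genome = "ACGT" * 25  # 100 bp toy sequence
pieces = list(overlapping_windows(genome, size=40, overlap=10))
for start, piece in pieces:
    print(start, len(piece))
```

A feature found near a window edge is guaranteed to fall entirely inside a neighboring window as long as it is shorter than the overlap.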
BDGP Cluster: GOst
Brad Marshall & the GO consortium
- GO = Gene Ontology
- GO BLAST: a public service for finding BLAST matches to GO-annotated sequences
- BLAST jobs are submitted to the BDGP cluster
- http://godatabase.org
BDGP Cluster
- 2000: 20 nodes (dual 700 MHz PIII)
- 2001: 32 nodes (added 12 dual 1 GHz PIII)
- 2002: 52 nodes (added 20 dual Athlon 1800XP)
- Approximately 8–10 million jobs since 2000
- GadFly Release 3 entirely based on it
- Drosophila transposons
- GOst & future public BLAST/pipeline server
- Lots of future research
Talk Topics
“Beowulf” concept: combine commodity hardware to create a supercomputer. The focus is on long-running large jobs that are split evenly among the nodes, and on total system performance.
Emphasis on modifications of the traditional Beowulf cluster for the requirements of genome science/bioinformatics.
Bioinformatics Applications
- Large jobs the computer can split into equal segments, e.g. data clustering, MEME, BLAST? = the traditional Beowulf model
- Large jobs with prior knowledge of where to split (“pipelines”): e.g. BLAST, tRNAscan, InterPro, ...
- Genome-wide/large-scale application of programs not written for clusters: most programs
- User-submitted “small” jobs, e.g. individual (batch) BLAST, gene finding
Bioinformatics Applications
- Traditional “Beowulf” jobs: Mandelbrot set, sequence alignment (illustrated as painting eggs)
- Large number of small jobs (pipeline), illustrated as painting eggs in different colors: e.g. BLAST/Genscan, repetitive elements
Bioinformatics Application Requirements
A compute farm for:
- Small jobs
- Large jobs
- Single/multithreaded jobs (BLAST)
- MPI/PVM jobs (clustering, MEME)
Usually lots of files/databases.
Cluster Building Problems
- Hardware building (Linux Networx)
- Integration into the infrastructure
- Running the programs on the cluster
- Teaching users/programmers
- Adjusting the software, e.g. creating a cluster-aware pipeline, scripts submitting jobs asynchronously
BDGP Cluster Infrastructure
- Head node (bam, “bag of marbles”): gateway to the cluster; runs maintenance and monitoring software; job distribution and control
- Nodes (52): separate network (security) with NIS replication from the BDGP network; dual CPU, local hard disk with a large swap partition
- Network interconnect: 100BT Ethernet with a Gigabit Ethernet uplink to NAS storage
- Storage: dedicated NAS storage (NetApp and OpenNAS)
BDGP Cluster Structure
[Diagram: the head node (bam) bridges the public network and the private cluster network; nodes are connected by 100BT Ethernet, with Gigabit links to several NAS filers.]
Running programs on the cluster
“Making the cluster available to the users”
On a cluster, programs (jobs) don’t run efficiently in an interactive manner. Use a queuing system to:
- Run jobs in an organized manner
- Handle the requirements of the users and their bioinformatics tasks (large numbers of varying jobs)
- Keep the nodes optimally loaded
Queuing Systems
[Diagram: the server (head node) holds queued jobs from the research and pipeline queues (e.g. tblastx db_A a1–a6, tblastx db_B b1–b3) and dispatches them to nodes N1–N8.]
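The dispatch pattern in the diagram above can be sketched as a toy FIFO scheduler: the head node keeps a queue of jobs and hands the next one to whichever node frees up first. This is a simulation for illustration, not PBS itself; all names are invented.

```python
# Toy FIFO dispatcher: jobs wait in a queue on the head node and are
# assigned to idle nodes; a finishing node immediately gets the next job.
from collections import deque

jobs = deque(f"tblastx db_A a{i}" for i in range(1, 7))  # 6 queued jobs
free_nodes = deque(f"N{i}" for i in range(1, 5))         # 4 idle nodes
running = {}

# Initial dispatch: fill every idle node from the front of the queue.
while jobs and free_nodes:
    running[free_nodes.popleft()] = jobs.popleft()

def node_finished(node):
    """Node reports completion and receives the next queued job, if any."""
    done = running.pop(node)
    if jobs:
        running[node] = jobs.popleft()
    return done

print(running)          # 4 jobs running, 2 still queued
node_finished("N2")     # N2 finishes a2 and picks up the next queued job
print(running["N2"])
```

Real queuing systems add priorities, per-queue policies, and node-failure handling on top of this basic loop, which is exactly where the OpenPBS issues discussed below come in.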
Queuing Systems
- LSF
- Sun Grid Engine
- PBSPro (Veridian) and OpenPBS, used at BDGP:
  - Originally, and still now, the most frequently used system
  - Very stable for a production system
  - Cross-platform
  - Not too robust to node failures (OpenPBS)
  - Open source, but no central repository (OpenPBS)
  - PBSPro is free to academia only
  - Not easy to configure/get running
  - Job submission is hard for non-techies
  - Does not scale very well to many jobs in the queue
OpenPBS Implementation at BDGP
- Several FIFO queues with queue policies (priority): MPI, Pipeline, Research
- Overlapping node allocations for each queue
- Queues for:
  - 1 CPU: 2 jobs per node
  - 2 CPU: 1 multithreaded job per node
  - MPI jobs: the job distributes itself over several nodes
OpenPBS Problems
- Sensitive to node failure because of TCP timeouts
- Does not scale well beyond 10,000 jobs
- Job submission
OpenPBS Node Failure Patches
- Modifications by E.F. to increase scheduler robustness
- Set node “offline” when down
- CPLANT patches: http://www.cs.sandia.gov/cplant/doc/pbs/pbs.html
OpenPBS Scalability Patch
- Written by E.F. for the BDGP pipeline
- Patched PBS scheduler: everything is cached
- Very fast in running lots of small jobs, even when the queues are full
- Scales to >100,000 jobs in the queue on a single 650 MHz PIII with 768 MB RAM; lots more with a more current machine
- Open source
OpenPBS Job Submission
- PBS tools qsub/qstat: awkward for the end user; require a shell script; most options; possibility of timeouts
- pbsrsh/pbsquery: based on a C++ object framework encapsulating the PBS libraries; an rsh-like tool for easy job submission on the command line: pbsrsh -a “blastn mydb myseq”
- Future: pipe jobs into pbsrsh to avoid timeouts
OpenPBS MPI
- Designated MPI queue
- LAM 6.5.4 (http://www.lam-mpi.org)
- mpipbs, a Bourne shell script wrapping mpirun (mpipbs mympi_program):
  - Reads node information from PBS
  - Creates the appropriate LAM configuration
  - Starts LAM on the PBS-allocated nodes
  - Starts the program with the appropriate processor number
  - Stops LAM
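The mpipbs steps above can be illustrated with a dry-run sketch: given the per-CPU node list that PBS exposes through $PBS_NODEFILE, derive the LAM boot schema and processor count, and print the commands that would be run rather than executing them. This is an illustration of the logic, not the real mpipbs (which is a Bourne shell script); the node names are invented.

```python
# Dry-run sketch of the mpipbs logic. PBS writes one line per allocated
# CPU to $PBS_NODEFILE; LAM wants one boot-schema entry per host, and
# mpirun wants the total process count.

nodefile = ["node01", "node01", "node02", "node02", "node03"]  # one line per CPU

hosts = sorted(set(nodefile))   # LAM boot schema: one entry per host
nprocs = len(nodefile)          # MPI processes = allocated CPUs

lam_schema = "\n".join(hosts)
commands = [
    "lamboot hostfile",                    # start LAM on the allocated nodes
    f"mpirun -np {nprocs} mympi_program",  # run with the right process count
    "lamhalt",                             # stop LAM when the program exits
]
print(lam_schema)
for cmd in commands:
    print(cmd)
```

In the real script these commands run inside the PBS job, so LAM is booted and halted on exactly the nodes the scheduler granted.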
BDGP Cluster References
- BDGP: http://www.fruitfly.org
- Presentation & BDGP cluster administration software: http://www.fruitfly.org/~efrise/cluster.html
- FlyBase: http://flybase.org
- GadFly3: http://www.fruitfly.org/cgi-bin/annot/query
- Pipeline, annotation and transposon work: Genome Biology 2002, 3(12), http://genomebiology.com/drosophila/
Acknowledgements
- Gerry Rubin (Director, BDGP, and VP, HHMI)
- Suzanna Lewis (Director, informatics)
- Sue Celniker (Director, genomics)
- Eric Smith (Co-administrator)
- Chris Mungall (Pipeline and GadFly)
- Josh Kaminker (Transposon research)
- Brad Marshall (GOst)
Contact Information
Erwin Frise
Berkeley Drosophila Genome Project
Lawrence Berkeley National Labs
One Cyclotron Road, MS64-121
Berkeley, CA 94720
Tel. (510) 486-7251
http://www.fruitfly.org/~efrise/cluster.html