Having a Blast! on DiaGrid
Carol Song
Rosen Center for Advanced Computing
December 9, 2011
What is DiaGrid
• A large, high-throughput, distributed computing system
• Operated by Rosen Center (RCAC)
• Using Condor to manage jobs and resources
• Good for running serial computations on a large number of processors
• Utilizing idle cycles
• Including all Purdue clusters, lab computers, department computers, and desktops, totaling 50,000+ cores
• Purdue leading a partnership of 10 campuses and institutions
[Figure: Growth of DiaGrid 2004-2011 (number of cores). 2004: 1,500; 2005: 4,000; 2006: 6,100; 2007: 7,700; 2008: 22,000; 2009: 30,000; 2010: 42,000; 2011: 50,600]
[Figure: DiaGrid User Count, 2005-2011. Series: unique users, unique PIs, unique PI departments, fields of science]
[Figure: Computation on DiaGrid 2004-2011. Series: jobs and hours, in millions]
How to Compute on DiaGrid
• Direct access to Condor front end and use Linux command line
– Software install, configuration, copy input data…
– Documentation
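To make the command-line route concrete, here is a minimal sketch that builds a Condor submit description file for a batch of independent serial jobs. The executable name `run_task.sh`, the output file names, and the job count are hypothetical and chosen only for illustration; the slides do not show an actual submit file.

```python
def make_submit_description(executable, n_jobs, arguments=""):
    """Return the text of a Condor submit description file that
    queues n_jobs independent serial jobs of the same executable."""
    if n_jobs < 1:
        raise ValueError("n_jobs must be at least 1")
    lines = [
        "universe   = vanilla",
        f"executable = {executable}",
    ]
    if arguments:
        lines.append(f"arguments  = {arguments}")
    lines += [
        # $(Process) expands to 0, 1, ..., n_jobs - 1, one value per job,
        # so every job writes to its own output and error file.
        "output     = job.$(Process).out",
        "error      = job.$(Process).err",
        "log        = jobs.log",
        "should_transfer_files   = YES",
        "when_to_transfer_output = ON_EXIT",
        f"queue {n_jobs}",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical example: 100 jobs, each passed its own index as an argument.
submit_text = make_submit_description("run_task.sh", 100, "$(Process)")
print(submit_text)
```

The resulting text would be saved to a file and handed to `condor_submit` on the front end. Installing the software and staging input data remain separate manual steps on this route.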
Going forward … …
• DiaGrid + HUBzero
• EASY!
– Free account sign-up
– Instant access
– Hosted applications
– How-to videos, examples, questions & answers to help you get started
– Social networking for research (group, wiki, discussion board, wish list, ticket system)
• BLAST!
Plan for BLAST
Prototype (Dec.)
• User interface for running BLAST in a browser
• Input: type in or upload a file (FASTA)
• Support BLAST flavors (blastn, blastx, blastp)
• Choice of publicly available BLAST databases (NCBI), updated regularly
• Input size supported: at least 10s of MBs
• User can see search status and % completed
• User can download output files
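The slides do not say how the tool will spread a search across DiaGrid. One common approach on a high-throughput system is to split the FASTA input into chunks, run each chunk as a separate serial job, and report progress as the share of finished chunks. The sketch below illustrates that idea under those assumptions; the function names and the chunk size are hypothetical.

```python
def split_fasta(text, records_per_chunk):
    """Split FASTA text into chunks holding at most
    records_per_chunk sequence records each."""
    if records_per_chunk < 1:
        raise ValueError("records_per_chunk must be at least 1")
    records, current = [], []
    for line in text.splitlines():
        if line.startswith(">"):
            # A header line starts a new record; store the previous one.
            if current:
                records.append("\n".join(current))
            current = [line]
        elif line.strip() and current:
            # Sequence line of the current record. Blank lines and any
            # lines before the first header are ignored.
            current.append(line.strip())
    if current:
        records.append("\n".join(current))
    return [
        "\n".join(records[i:i + records_per_chunk]) + "\n"
        for i in range(0, len(records), records_per_chunk)
    ]


def percent_complete(chunks_done, chunks_total):
    """Progress as a percentage of finished chunks."""
    return 100.0 * chunks_done / chunks_total if chunks_total else 0.0


example = ">seq1\nACGT\nACGT\n>seq2\nGGCC\n>seq3\nTTAA\n"
chunks = split_fasta(example, 2)
print(len(chunks), "chunks;", percent_complete(1, len(chunks)), "% after one job")
```

Each chunk could then be searched by its own job, with the per-chunk outputs concatenated for download.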
Release 1.0 (Feb-Mar)
• Ability for user to upload own databases
• Can select private databases for search
• Support larger input: 100s of MBs
• Support more BLAST flavors
• Support output formats (?)
Release 2.0
• Support more databases (SwissProt? Ensembl?) in addition to NCBI
• A tool for managing and sharing research sequence databases. For example, a user can specify whether to make a database:
– Public
– Shared with a group
– Private
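The three sharing levels can be pictured as a small access check. This is only an illustrative sketch: the names `Visibility` and `can_search`, and the rule that an owner can always search their own database, are assumptions and not taken from the planned tool.

```python
from enum import Enum


class Visibility(Enum):
    PUBLIC = "public"
    GROUP = "group"
    PRIVATE = "private"


def can_search(visibility, owner, user, group_members=frozenset()):
    """Decide whether user may search a database owned by owner."""
    if visibility is Visibility.PUBLIC:
        return True
    if user == owner:
        # Assumption: the owner always keeps access to their own database.
        return True
    if visibility is Visibility.GROUP:
        return user in group_members
    return False  # PRIVATE, and the user is not the owner


print(can_search(Visibility.GROUP, "alice", "bob", frozenset({"bob"})))
```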
We need your help
• Early adopters to help us improve the tools (sign-up sheet or email Carol)
• Use it and tell us your stories
• Partnership in developing tools and the communities around the applications