Introduction to the BOINC software
David P. Anderson
Space Sciences Laboratory
University of California, Berkeley
Outline
• Abstractions
• The BOINC server software
• The BOINC client software and runtime system
Design goals
• A few applications, lots of jobs
• High performance
– millions of jobs per day
• Scalability
• Fault tolerance
Abstractions
• Platform
• App version
– a collection of files, one of which is an executable main program
– associated with a platform
• App
– a set of app versions that all perform roughly the same computation
– may have versions for different platforms
– may have different versions for one platform (GPU, non-GPU)
Abstractions
• Workunit (job)
– a collection of input files
– associated with an app (not an app version!)
– attributes
    • resource estimates and bounds
    • latency bound
• Result (job instance)
– a collection of output files
– associated with a workunit
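The relationships between these abstractions can be sketched as plain data classes. This is only an illustrative model, not BOINC's database schema: every class and field name below is an assumption chosen to mirror the bullets above.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Platform:
    name: str  # e.g. "linux_x86_64" (illustrative value)


@dataclass
class AppVersion:
    platform: Platform     # each app version is associated with one platform
    files: list[str]       # a collection of files...
    main_program: str      # ...one of which is the executable main program
    uses_gpu: bool = False


@dataclass
class App:
    name: str
    versions: list[AppVersion] = field(default_factory=list)

    def versions_for(self, platform: Platform) -> list[AppVersion]:
        """All versions usable on a platform (possibly GPU and non-GPU)."""
        return [v for v in self.versions if v.platform == platform]


@dataclass
class Workunit:
    app: App                # tied to the app, not to an app version
    input_files: list[str]
    fpops_estimate: float   # resource estimate
    fpops_bound: float      # resource bound
    latency_bound_s: float  # latency bound, in seconds


@dataclass
class Result:
    workunit: Workunit      # a job instance belongs to one workunit
    output_files: list[str] = field(default_factory=list)
```

Because a workunit points at an app rather than an app version, the choice of version can be deferred until a particular host asks for work.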
Anatomy of a BOINC project
[Diagram components: clients; servers, daemons and periodic tasks; MySQL database; project root directory tree]
project root/
    bin/
    cgi-bin/
    download/
        00/ .. 3ff/
    html/
    log_*/
    templates/
    upload/
        00/ .. 3ff/
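The 00/ .. 3ff/ entries under download/ and upload/ indicate a fan-out of 0x400 (1024) subdirectories, which keeps any single directory from growing too large. A minimal sketch of that idea follows. The hash shown (MD5 of the file name, reduced modulo the fan-out) and the unpadded hex directory names are assumptions for illustration and may differ from what the BOINC server actually does.

```python
import hashlib
from pathlib import PurePosixPath

FANOUT = 0x400  # 1024 subdirectories, numbered 0 .. 3ff in hex


def hashed_subdir(filename: str, fanout: int = FANOUT) -> str:
    """Pick a subdirectory name from a hash of the file name (illustrative hash)."""
    digest = hashlib.md5(filename.encode("utf-8")).hexdigest()
    return format(int(digest[:8], 16) % fanout, "x")


def download_path(project_root: str, filename: str) -> PurePosixPath:
    """Where a given input file would live under download/ in this sketch."""
    return PurePosixPath(project_root) / "download" / hashed_subdir(filename) / filename
```

Since the subdirectory is a pure function of the file name, any server component can locate a file without a database lookup.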
Work generator
[Diagram: the work generator, the MySQL database, and the project root directory tree]
• Creates input files
• Creates workunits
• One per app
• Flow control
– disk space
– DB size
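Flow control means the generator stops creating jobs when the database already holds enough unsent work or when disk space runs low. The sketch below shows one pass of such a loop. The callback names, thresholds, and the `max_new` cap are all hypothetical; a real work generator would call the project's own job-creation routine and typically sleep between passes.

```python
from typing import Callable


def generate_work(
    unsent_jobs: Callable[[], int],
    free_disk_bytes: Callable[[], int],
    make_job: Callable[[int], None],
    *,
    target_unsent: int = 1000,
    min_free_bytes: int = 10 * 2**30,
    max_new: int = 100,
) -> int:
    """Create jobs until a flow-control limit is hit; return how many were made."""
    created = 0
    while created < max_new:
        if unsent_jobs() >= target_unsent:      # DB size limit
            break
        if free_disk_bytes() < min_free_bytes:  # disk space limit
            break
        make_job(created)  # write the input files, then create the workunit
        created += 1
    return created
```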
Specifying a job
• Workunit template
– XML document describing
    • input files (logical, physical names)
    • job attributes
• Result template
– XML document describing output files
• create_work()
– specifies templates, app, input files
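To make the workunit template concrete, here is a sketch that builds a small one with Python's standard XML library. The element names (`workunit`, `file_ref`, `file_number`, `open_name`, `rsc_fpops_est`, `rsc_fpops_bound`, `delay_bound`) follow my recollection of BOINC's template format and should be treated as assumptions to check against the project documentation; the helper function itself is hypothetical.

```python
import xml.etree.ElementTree as ET


def workunit_template(logical_names, fpops_est, fpops_bound, delay_bound_s):
    """Build a <workunit> element; tag names are assumptions (see lead-in)."""
    wu = ET.Element("workunit")
    for number, logical in enumerate(logical_names):
        ref = ET.SubElement(wu, "file_ref")
        ET.SubElement(ref, "file_number").text = str(number)
        ET.SubElement(ref, "open_name").text = logical  # logical name seen by the app
    # Job attributes: resource estimate, resource bound, latency bound.
    ET.SubElement(wu, "rsc_fpops_est").text = repr(float(fpops_est))
    ET.SubElement(wu, "rsc_fpops_bound").text = repr(float(fpops_bound))
    ET.SubElement(wu, "delay_bound").text = str(int(delay_bound_s))
    return wu


template = workunit_template(["in"], 1e12, 1e13, 86400)
xml_text = ET.tostring(template, encoding="unicode")
```

The physical file names are not in the template; they are supplied per job, which is why one template can serve many workunits.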
Validator
• Check result validity
• Compare replicas
• May be app-specific
[Diagram: the validator, the MySQL database, and the project root directory tree]
Validation
• Clients may
– return bad results
– exaggerate claimed credit
• Strategies
– app-specific consistency checking
– replication
    • fuzzy comparison
    • homogeneous redundancy
– adaptive replication
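Replication with fuzzy comparison can be sketched as follows: outputs from several replicas are compared within a numerical tolerance, and a result is accepted once enough replicas agree. The function names, the tolerance, and the quorum of 2 are assumptions for illustration; a real validator is app-specific and would read the output files of each result.

```python
import math
from typing import Optional, Sequence


def equivalent(a: Sequence[float], b: Sequence[float], rel_tol: float = 1e-6) -> bool:
    """Fuzzy comparison: same length and element-wise close."""
    return len(a) == len(b) and all(
        math.isclose(x, y, rel_tol=rel_tol) for x, y in zip(a, b)
    )


def canonical_result(
    replicas: Sequence[Sequence[float]], quorum: int = 2
) -> Optional[int]:
    """Return the index of a replica that at least `quorum` replicas agree with."""
    for i, candidate in enumerate(replicas):
        agreeing = sum(equivalent(candidate, other) for other in replicas)
        if agreeing >= quorum:
            return i
    return None  # no consensus yet; more replicas would be needed
```

Fuzzy comparison is needed because different hosts can produce slightly different floating-point output for the same job; homogeneous redundancy sidesteps this by sending replicas only to numerically equivalent hosts.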
Assimilator
• Processes completed results
• App-specific
[Diagram: the assimilator, the MySQL database, and the project root directory tree]
Summary
• Create app, app versions for different platforms
• Develop work generator
• Develop validator
• Develop assimilator
Isn’t there a simpler way?
Single-job submission
• Assemble your input files and executable, then run:
    boinc_submit --input foo --output blah program
• How this works:
– uses “wrapper” app
– executable is part of workunit
– templates are created automatically
• What it doesn’t do:
– multi-platform
– validation
Job dispatch
[Diagram components: MySQL database; project root directory tree; transitioner; feeder; shared-memory job cache; scheduler (CGI or FastCGI); clients]
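The role of the job cache can be sketched like this: the feeder keeps a fixed number of slots filled from the database, and the scheduler serves client requests from those slots so that a request does not need a database query to find jobs. This is a simplified model under stated assumptions: a Python list stands in for the shared-memory segment, a deque stands in for the database query, and the class and method names are hypothetical.

```python
from collections import deque
from typing import Deque, List, Optional


class JobCache:
    """Fixed number of slots, standing in for the shared-memory segment."""

    def __init__(self, nslots: int) -> None:
        self.slots: List[Optional[str]] = [None] * nslots

    def fill(self, unsent: Deque[str]) -> int:
        """Feeder side: refill empty slots from the queue of unsent jobs."""
        filled = 0
        for i, slot in enumerate(self.slots):
            if slot is None and unsent:
                self.slots[i] = unsent.popleft()
                filled += 1
        return filled

    def take(self, max_jobs: int) -> List[str]:
        """Scheduler side: hand out up to max_jobs from the cache."""
        taken: List[str] = []
        for i, slot in enumerate(self.slots):
            if slot is not None and len(taken) < max_jobs:
                taken.append(slot)
                self.slots[i] = None  # slot is free for the feeder again
        return taken
```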
File transfer
[Diagram components: project root directory tree; Apache; file upload handler; clients]
Janitorial daemons
• File deleter
• DB purger
[Diagram: the file deleter, the DB purger, the MySQL database, and the project root directory tree]
Ways to deploy a BOINC server
• Linux server
• Server VM for VMware
• Server VM for Amazon EC2
The BOINC runtime system
Directory structure:
BOINC/
    projects/
        lhcathome/
            physical_name0
            physical_name1
        setiathome/
    slots/
        0/
            logical_name0 (link file)
            logical_name1
        1/
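[Diagram: the BOINC client and the application (through the BOINC runtime) communicate by shared-memory message passing]
• Application to client: fraction done, CPU time
• Client to application: suspend, resume, quit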
Basic API
• boinc_init()
– creates a thread that handles messages
• boinc_finish()
– creates a “finish file”
• boinc_resolve_filename()
– maps logical to physical file names
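The logical-to-physical mapping can be illustrated with a small Python analogue of boinc_resolve_filename(). It assumes the link file in the slot directory holds a `<soft_link>...</soft_link>` element containing a relative path to the physical file in the project directory; that format is stated from memory and should be treated as an assumption, and `resolve_filename` is a hypothetical stand-in, not the BOINC API.

```python
import re
from pathlib import Path

_SOFT_LINK = re.compile(r"\s*<soft_link>(.*?)</soft_link>", re.DOTALL)


def resolve_filename(logical: str, slot_dir: Path) -> Path:
    """Return the physical path behind a logical name in the slot directory."""
    candidate = slot_dir / logical
    if candidate.is_file():
        with candidate.open("rb") as f:
            head = f.read(4096).decode("utf-8", errors="replace")
        match = _SOFT_LINK.match(head)
        if match:
            # Link file: its content names the physical file.
            return (slot_dir / match.group(1).strip()).resolve()
    # Not a link file (or not present yet): use the logical name as-is.
    return candidate
```

An application would resolve each logical name once at startup and then open the returned path, so the same executable works regardless of how the project names its physical files.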
Checkpointing
• boinc_time_to_checkpoint()
– call at points where you can checkpoint
• boinc_checkpoint_done()
– call when you’re finished checkpointing
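The checkpointing pattern is: at each point where state is consistent, ask whether a checkpoint is due; if so, write the state and then report that the checkpoint is done. The sketch below imitates that pattern in plain Python. A local timer stands in for boinc_time_to_checkpoint() and resetting it stands in for boinc_checkpoint_done(); the state file layout and the function name `run` are assumptions for illustration.

```python
import json
import os
import time
from pathlib import Path


def run(total_steps: int, state_file: Path, checkpoint_period_s: float = 60.0) -> int:
    """Sum 0..total_steps-1, resuming from state_file if it exists."""
    step, acc = 0, 0
    if state_file.exists():
        saved = json.loads(state_file.read_text())
        step, acc = saved["step"], saved["acc"]

    last_checkpoint = time.monotonic()
    while step < total_steps:
        acc += step
        step += 1
        # Stand-in for boinc_time_to_checkpoint(): is a checkpoint due?
        if time.monotonic() - last_checkpoint >= checkpoint_period_s:
            tmp = state_file.with_suffix(".tmp")
            tmp.write_text(json.dumps({"step": step, "acc": acc}))
            os.replace(tmp, state_file)  # atomic swap, never a half-written file
            # Stand-in for boinc_checkpoint_done(): restart the timer.
            last_checkpoint = time.monotonic()
    return acc
```

Writing to a temporary file and renaming it means that a crash during checkpointing leaves the previous checkpoint intact.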
Compound applications
• Examples:
– coordinator program runs several worker programs in sequence
– “switcher” program probes CPU architecture, selects which executable to run
• Variants of boinc_init() let you specify which app is the main program, and how messages are handled
• Each message type must be handled by exactly one process
Long-running applications
• Trickle-up messages
• Trickle-down messages
• Intermediate file transfers
Legacy applications
• The BOINC wrapper
– takes XML “job file”
– handles all messages
GPU and multithread apps
• Server
– you supply a function that takes an app version and a host, and returns resource usage and estimated FLOPS
– the BOINC scheduler chooses the best version
• Client
– senses and reports coprocessors (e.g. NVIDIA GPUs)
– coprocessor-aware scheduling and work fetch
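The server-side idea can be sketched as follows: each app version has a planning function that, given a host, either reports resource usage and estimated FLOPS or says the version cannot run there, and the scheduler picks among the runnable versions. Interpreting "best" as "highest estimated FLOPS" is an assumption here, as are all type and function names; this is not BOINC's actual interface.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class Host:
    ncpus: int
    cpu_flops: float        # per-core speed
    gpu_flops: float = 0.0  # 0 means no usable GPU was reported


@dataclass(frozen=True)
class Usage:
    ncpus: float
    ngpus: float
    est_flops: float


# A planning function returns None when the version cannot run on the host.
Planner = Callable[[Host], Optional[Usage]]


def cpu_plan(host: Host) -> Optional[Usage]:
    return Usage(ncpus=1, ngpus=0, est_flops=host.cpu_flops)


def gpu_plan(host: Host) -> Optional[Usage]:
    if host.gpu_flops <= 0:
        return None
    # A GPU app still uses a fraction of a CPU to drive the device.
    return Usage(ncpus=0.1, ngpus=1, est_flops=host.gpu_flops)


def best_version(versions: Sequence[tuple[str, Planner]], host: Host) -> Optional[str]:
    """Pick the runnable version with the highest estimated FLOPS."""
    best_name, best_flops = None, 0.0
    for name, plan in versions:
        usage = plan(host)
        if usage is not None and usage.est_flops > best_flops:
            best_name, best_flops = name, usage.est_flops
    return best_name
```

For example, with `versions = [("cpu", cpu_plan), ("gpu", gpu_plan)]`, a host without a GPU gets the CPU version, while a host reporting a fast GPU gets the GPU version.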