BOINC: An Open Platform for Public-Resource Computing
David P. Anderson
Space Sciences Laboratory
U.C. Berkeley
Public-resource computing
[Diagram labels: home PCs, business, academic, your computers]
Advantages:
• scale
• free
• growth
• public education
• no policy issues
Challenges:
• low bandwidth at client
• costly bandwidth at server
• firewall/NAT issues
• sporadic connection
• untrustworthy, insecure clients
• server security
• heterogeneity
• need PR, glitzy GUI
Why share an infrastructure?
[Diagram labels: Research lab X, University Y, Public project Z; projects, applications, resource pool]
• Participants install one program, select projects, specify constraints; all else is automatic
• Projects are autonomous
• Advantages of a shared platform:
  – Better long-term resource utilization
  – Better instantaneous resource utilization
  – Faster/cheaper for projects, software is better
  – Easier for projects to get participants
  – Participants learn more
Goals of BOINC (Berkeley Open Infrastructure for Network Computing)
• Public-resource computing/storage
• Multi-project, multi-application
  – Participants can apportion resources
• Handle fairly diverse applications
• Work with legacy apps
• Support many participant platforms
• Small, simple
General structure of BOINC
• Project:
  – Scheduling server (C++)
  – BOINC DB (MySQL)
  – Work generation
  – Data servers (HTTP)
  – Web interfaces (PHP)
  – Project back end: retry generation, result validation, result processing, garbage collection
• Participant:
  – Core agent (C++)
  – App agents
Data model
• File attributes:
  – Name
  – URL list
  – Persistent flag
  – Upload-when-present flag
• Files may originate in client or in project work manager
• Projects can use participant disks for long-term data archival
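The file attributes above can be pictured as a small record. The following is only an illustrative sketch, not BOINC's actual schema: the class name `FileInfo`, the field names, and the two helper functions are hypothetical, and the reading of the flags (persistent means "keep on the participant's disk", upload-when-present means "send back once the file exists on the client") is an assumption drawn from the attribute names.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FileInfo:
    """Hypothetical record holding the four attributes named on the slide."""
    name: str
    urls: List[str] = field(default_factory=list)  # URL list
    persistent: bool = False            # assumed: keep on the participant's disk
    upload_when_present: bool = False   # assumed: upload once the file exists


def files_to_keep(files):
    """Names of files that should stay on disk after a computation ends."""
    return [f.name for f in files if f.persistent]


def files_to_upload(files, present_names):
    """Names of files flagged for upload that currently exist on the client."""
    return [f.name for f in files
            if f.upload_when_present and f.name in present_names]
```

Under this reading, long-term archival on participant disks amounts to marking a file persistent so that it is retained after the computation that used it.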
Computing model
• Applications, platforms, app versions
• Workunits
  – Inputs to a computation
  – Estimates of resource requirements
• Results
  – Outputs of a computation
Hosts and scheduling
• Host measurements
  – CPU performance (integer/FP/memory)
  – RAM, cache, disk free/total
  – On/connected statistics
  – Network bandwidth statistics
• Workunit properties
  – RAM/disk/computation requirements
• Scheduling policy
  – Feasibility
  – High/low water mark
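One way to read the scheduling policy above is sketched below. It is a guess at the intent, not the scheduler's real logic: the assumption is that "feasibility" means a workunit's RAM and disk requirements fit the host, and that the water marks work as a refill rule (when queued work drops below the low mark, request enough to reach the high mark). All class, field, and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Host:
    ram_bytes: float
    disk_free_bytes: float
    flops: float  # measured speed, floating-point operations per second


@dataclass
class Workunit:
    ram_bytes: float
    disk_bytes: float
    est_flops: float  # estimated computation, in floating-point operations


def is_feasible(host, wu):
    """A workunit is feasible if its RAM and disk needs fit the host."""
    return (wu.ram_bytes <= host.ram_bytes
            and wu.disk_bytes <= host.disk_free_bytes)


def estimated_seconds(host, wu):
    """Estimated run time of a workunit on this host."""
    return wu.est_flops / host.flops


def seconds_to_request(queued_seconds, low_water, high_water):
    """Ask for nothing above the low mark; otherwise refill to the high mark."""
    if queued_seconds >= low_water:
        return 0.0
    return high_water - queued_seconds


def pick_work(host, candidates, queued_seconds, low_water, high_water):
    """Hand out feasible workunits until the requested time is covered."""
    needed = seconds_to_request(queued_seconds, low_water, high_water)
    chosen = []
    for wu in candidates:
        if needed <= 0:
            break
        if is_feasible(host, wu):
            chosen.append(wu)
            needed -= estimated_seconds(host, wu)
    return chosen
```

The gap between the two marks means a host with a sporadic connection contacts the server only occasionally, rather than every time a single result finishes.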
Accounting and result validation
• Standardized unit of credit
  – CPU time * (int+FP+mem)
  – Project-specific benchmark?
• Result validation
  – Compare redundant results, flag incorrect results
• Granted credit:
  – Minimum of claimed credit among correct results
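The credit and validation rules above can be worked through in a few lines. This sketch takes the slide's formula literally (CPU time multiplied by the sum of the integer, FP, and memory benchmark scores) and grants the minimum claim among correct results. How "correct" is decided is not stated on the slide; the majority-agreement rule with a `quorum` parameter used here is an assumption, as are all names.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Result:
    host_id: int
    output: str        # canonical form of the output, e.g. a digest
    cpu_seconds: float
    int_score: float   # integer benchmark
    fp_score: float    # floating-point benchmark
    mem_score: float   # memory benchmark


def claimed_credit(r):
    """Slide formula: CPU time * (int + FP + mem)."""
    return r.cpu_seconds * (r.int_score + r.fp_score + r.mem_score)


def validate(results, quorum=2):
    """Split redundant results into (correct, incorrect).

    Assumed rule: the most common output is correct once at least
    `quorum` results agree on it. Without agreement, nothing is flagged.
    """
    if not results:
        return [], []
    output, votes = Counter(r.output for r in results).most_common(1)[0]
    if votes < quorum:
        return [], []
    correct = [r for r in results if r.output == output]
    incorrect = [r for r in results if r.output != output]
    return correct, incorrect


def granted_credit(results, quorum=2):
    """Minimum claimed credit among correct results, or None if undecided."""
    correct, _ = validate(results, quorum)
    if not correct:
        return None
    return min(claimed_credit(r) for r in correct)
```

Granting the minimum rather than the average means a host cannot raise its own credit by inflating its reported CPU time or benchmark scores, as long as at least one honest host returns the same result.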
Participant preferences
• Examples:
  – Work only while user away
  – Confirm before connecting
  – Don’t work if on batteries
  – High, low water marks
  – Limits on disk space, bandwidth
  – Application-specific preferences
  – List of projects + authenticators + % allocation
• Edited via Web interface
Application Programming
• Checkpoint/restart
• Filename translation
• Graphics
  – OpenGL-based
  – Application window or screensaver
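The checkpoint/restart item above matters because participant machines stop and start unpredictably. The sketch below shows the general pattern only and does not use the BOINC API: the function names, the JSON state file, and the `stop_after` parameter (which simulates an interruption) are all hypothetical. State is written to a temporary file and then renamed, so an interruption during the write cannot leave a half-written checkpoint.

```python
import json
import os


def load_checkpoint(path):
    """Return saved state, or a fresh state if no checkpoint exists."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return {"next_i": 0, "total": 0}


def save_checkpoint(path, state):
    """Write to a temp file, then rename over the old checkpoint."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(state, f)
    os.replace(tmp, path)


def run(path, n_steps, checkpoint_every=100, stop_after=None):
    """Sum 0..n_steps-1, resuming from the last checkpoint if present.

    `stop_after` simulates an interruption after that many steps in
    this run; work done since the last checkpoint is then lost.
    """
    state = load_checkpoint(path)
    done_this_run = 0
    for i in range(state["next_i"], n_steps):
        state["total"] += i
        state["next_i"] = i + 1
        done_this_run += 1
        if state["next_i"] % checkpoint_every == 0:
            save_checkpoint(path, state)
        if stop_after is not None and done_this_run >= stop_after:
            return None  # interrupted before finishing
    save_checkpoint(path, state)
    return state["total"]
```

A run interrupted after 250 steps with a checkpoint every 100 steps resumes from step 200, so at most one checkpoint interval of work is repeated.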
Conclusion
• BOINC status
  – Mostly feature-complete
  – Client runs on Linux, Solaris, Windows, Mac OS X
  – Small: client is 5,000 lines, server 2,000
• Projects:
  – Astropulse (later this year)
  – Other SETI@home (Parkes etc.)
  – Folding@home, climate prediction
  – Others: rendering? Theorem proving?