Volunteer Computing using BOINC
Transcript of Volunteer Computing using BOINC
Cover from Linux magazine, BOINC
Volunteer computing using
Berkeley Open Infrastructure for Network Computing
To get more references visit: http://bit.ly/boinc_srbiau
Pooyan Mehrparvar
This presentation covers:
i. A brief introduction to volunteer computing & BOINC
ii. BOINC’s applications
iii. BOINC’s architecture
iv. How to join a popular BOINC project
v. How to set-up your own BOINC project
I – A brief Introduction to volunteer computing & BOINC
Why do we choose this topic? What is volunteer computing? What is BOINC?
What is volunteer computing
An established technology that enables ordinary citizens to donate their computing resources to one or more "projects".
BOINC is the most widely used middleware system for volunteer computing.
A client program runs on the volunteer's computer.
BOINC was introduced by David P. Anderson
BOINC is designed to support applications that have large computation requirements, storage requirements, or both. The main requirement of the application is that it be divisible into a large number (thousands or millions) of jobs that can be done independently.
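As a minimal sketch of what "divisible into independent jobs" means, the Python below splits one large task into jobs that can be computed in any order, on any machine, and then merged. The task (a sum of squares), the function names, and the job size are illustrative assumptions and not part of BOINC.

```python
def split_into_jobs(start, stop, job_size):
    """Cut the half-open range [start, stop) into independent sub-ranges."""
    return [(lo, min(lo + job_size, stop)) for lo in range(start, stop, job_size)]


def run_job(job):
    """Work done by one volunteer: it needs only its own sub-range."""
    lo, hi = job
    return sum(n * n for n in range(lo, hi))


jobs = split_into_jobs(0, 1_000_000, 10_000)  # 100 independent jobs

# Jobs share no state, so the order in which they complete does not matter.
partial_results = [run_job(job) for job in reversed(jobs)]
total = sum(partial_results)
```

A task whose steps depend on each other's intermediate results cannot be cut up this way, which is why it would be a poor fit for volunteer computing.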
Goal: Use all the computers in the world, all the time, to do worthwhile things
What is BOINC
Basic overview of BOINC jobs
Security: BOINC uses code signing to prevent distribution of malicious executables. Each project has a key pair for code signing, and the private key is kept on a network-isolated machine.
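The roles of the two keys can be illustrated with a toy example. This is only a sketch of the sign/verify idea using textbook RSA with a deliberately small, unpadded key; it is not BOINC's actual signing code or key format, and all names here are assumptions made for illustration.

```python
import hashlib

# Toy RSA key pair built from two known Mersenne primes. Far too small and
# unpadded for real use; it only illustrates who holds which key.
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q                            # public modulus
E = 65537                            # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent (needs Python 3.8+)


def digest(data: bytes) -> int:
    """Hash the executable and reduce it into the range of the modulus."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N


def sign(executable: bytes) -> int:
    """Runs on the network-isolated machine, the only place that holds D."""
    return pow(digest(executable), D, N)


def verify(executable: bytes, signature: int) -> bool:
    """Runs on the volunteer's client, which knows only the public (N, E)."""
    return pow(signature, E, N) == digest(executable)


app = b"bytes of a science application"
sig = sign(app)
```

Because the client only ever sees the public key, an attacker who compromises the project's web server can replace files but cannot produce a signature that `verify` accepts.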
Virtualization as a solution for client’s security
Basic concepts of BOINC:
Project:
• An entity that does distributed computing using BOINC.
Application:
• A project can include multiple applications.
• An application includes several programs (for different platforms) and a set of workunits and results.
Application version:
• A particular version of an application program, compiled for a particular platform.
Workunit:
• A computation to be performed.
• Associated with an application, not with an application version.
Result:
• An instance of a computation, either unstarted, in progress, or completed.
• Each result is associated with a workunit.
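How these concepts relate to each other can be sketched as a small data model. The classes below are a simplified illustration, assuming hypothetical field names; they do not reproduce BOINC's real database schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class AppVersion:
    version: str
    platform: str                 # one compiled program per platform


@dataclass
class Result:
    workunit_name: str            # every result belongs to one workunit
    state: str = "unstarted"      # "unstarted", "in progress" or "completed"
    output: Optional[str] = None


@dataclass
class Workunit:
    name: str
    app_name: str                 # tied to the application, not to a version
    results: List[Result] = field(default_factory=list)

    def new_result(self) -> Result:
        """Create one more instance of this computation."""
        result = Result(workunit_name=self.name)
        self.results.append(result)
        return result


@dataclass
class Application:
    name: str
    versions: List[AppVersion] = field(default_factory=list)
    workunits: List[Workunit] = field(default_factory=list)


@dataclass
class Project:
    name: str
    applications: List[Application] = field(default_factory=list)


app = Application("fold", versions=[AppVersion("1.0", "windows_x86_64"),
                                    AppVersion("1.0", "x86_64-pc-linux-gnu")])
wu = Workunit("fold_0001", app_name=app.name)
app.workunits.append(wu)
first, second = wu.new_result(), wu.new_result()  # two instances, one workunit
project = Project("example_project", applications=[app])
```

Keeping the workunit independent of the application version is what lets the same computation be handed to a Windows host and a Linux host.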
When does BOINC operate?
Cycle scavenging: by default, BOINC uses the computer's idle CPU/GPU cycles and other spare resources.
To avoid high battery consumption and 3G data charges, the BOINC Android app by default runs only on A/C power and a WiFi connection.
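The default behaviour described above amounts to a small decision rule. The sketch below is an assumption-laden illustration of that rule; the field and function names are hypothetical and the real client exposes many more user preferences.

```python
from dataclasses import dataclass


@dataclass
class DeviceState:
    user_active: bool   # recent keyboard, mouse or touch input
    is_mobile: bool     # phone or tablet running the Android app
    on_ac_power: bool
    on_wifi: bool


def may_run(state: DeviceState) -> bool:
    """Return True if the default-style policy lets BOINC work right now."""
    if state.user_active:
        return False                              # only scavenge idle cycles
    if state.is_mobile:
        return state.on_ac_power and state.on_wifi  # spare battery and 3G data
    return True


idle_desktop = DeviceState(user_active=False, is_mobile=False,
                           on_ac_power=True, on_wifi=False)
phone_on_battery = DeviceState(user_active=False, is_mobile=True,
                               on_ac_power=False, on_wifi=True)
phone_charging = DeviceState(user_active=False, is_mobile=True,
                             on_ac_power=True, on_wifi=True)
```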
The client can run BOINC as:
• Screensaver (with fancy graphics)
• Windows service (running in the background)
• Application (displaying results in tabular form)
Brief history of BOINC
• 1998: SETI@home development
• 2000: Infrastructure issues, United Devices
• 2001: United Devices falling out, SETI@home failed, hacked
• 2002: BOINC was introduced
• 2004: Climateprediction.net, SETI@home, LHC@home were implemented on BOINC
• 2005: Rosetta@home, Einstein@home, IBM World Community Grid, PrimeGrid
• 2006: BOINCstats (BAM!), GridRepublic, BOINC wrapper
• 2008: GPU support, multi-core apps
• 2010: BOINC packages for Debian
• 2011: Apps in virtual machines, vbox wrapper (server-side)
• 2012: Android, Condor/OSG collaboration, Git
• 2013: VirtualBox client
• 2014: Samsung app (Power Sleep), HTC app (Power To Give)
BOINC supported platforms (until December 2014)
Server:
• Unix-based operating systems (Debian-based Linux is recommended)
• Microsoft Windows Vista or later (POSIX ready)
Client:
• Linux (x86/x64)
• Microsoft Windows (x86/x64)
• MacOS (x86/x64)
• PlayStation 3
• Android
Virtualization as a solution for cross-platform distributed systems (Virtualbox, Vmware, VirtualPC,...)
Performance benchmarking:
• FLOPS (floating point operations per second) as a measure of computer performance in scientific fields
• Some computer systems are unable to run a FLOPS benchmark
• MIPS/MOPS: suitable for database queries, word processing, spreadsheets, or running multiple virtual operating systems
Benchmarking records : Fastest supercomputer: China's Tianhe-2 running 33.86 petaflops (June 10, 2013)
BOINC: Active: 232,691 volunteers, 718,577 computers. 24-hour average: 8.308 petaFLOPS (December 19, 2014)
Why use GPUs & PlayStation?
General-purpose computing on graphics processing units (GPGPU)
PlayStation 4 (AMD, formerly ATI): 18 compute units, 64 cores per unit = 1,152 cores
Theoretical peak performance = 1.84 TFLOPS
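The peak figure can be reproduced with a few lines of arithmetic. The core counts and the petaFLOPS records come from the slides; the 800 MHz clock and the two operations per core per cycle are assumptions added here to make the numbers work out, so treat this as a back-of-the-envelope sketch.

```python
COMPUTE_UNITS = 18
CORES_PER_UNIT = 64
CLOCK_HZ = 800e6            # assumption: 800 MHz GPU clock (not on the slide)
FLOPS_PER_CORE_CYCLE = 2    # assumption: a fused multiply-add counts as 2 ops

cores = COMPUTE_UNITS * CORES_PER_UNIT
peak_tflops = cores * CLOCK_HZ * FLOPS_PER_CORE_CYCLE / 1e12

BOINC_PFLOPS = 8.308        # 24-hour average, December 19, 2014
TIANHE2_PFLOPS = 33.86      # fastest supercomputer, June 2013

# Fraction of the fastest supercomputer that the volunteer network reaches.
boinc_share = BOINC_PFLOPS / TIANHE2_PFLOPS

# Consoles at theoretical peak needed to equal the whole BOINC average.
consoles_to_match_boinc = BOINC_PFLOPS * 1000 / peak_tflops
```

Theoretical peak is an upper bound; real applications sustain only a fraction of it, so the console count is an optimistic lower estimate.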
The PlayStation 3 (Cell processor, NVIDIA GPU) was widely used in volunteer computing, e.g. Folding@home
II – BOINC’s applications
Computational science: Rosetta@home
Virtual campus supercomputing: University of Westminster, London
Desktop grids for business: Slicify project
Integration with HTCondor to allow Globus-based grids to run jobs for BOINC projects: Einstein@OSG
What is computational science?
Computational science is concerned with constructing mathematical models and quantitative analysis techniques and using computers to analyze and solve scientific problems.
BOINC is used in:
• Physics, astrophysics
• Mathematics
• Biology and medicine (protein folding)
• Distributed sensing
• Climate modeling
• Games, 3D animation rendering, ...
Popular BOINC projects:
• Physics, astrophysics: LHC@home, Einstein@home, SETI@home
• Mathematics: SZTAKI Desktop Grid, PrimeGrid
• Biology and medicine: Folding@home, Rosetta@home
• Distributed sensing: Quake Catcher Network
• Climate modeling: Climateprediction.net
• Games, 3D animation rendering, ...: Chess@home, Enigma@home
Find more projects on: http://boinc.berkeley.edu/projects.php
LHC@home, SETI@home
What is protein folding?
Amino acids, proteins, DNA
Protein structure prediction / modeling protein structures
Rosetta@home helps to find cures for:
• Alzheimer’s disease
• HIV
• Malaria
• Anthrax
• Herpes simplex virus 1
Rosetta@home’s screensaver
Crowdsourcing
Crowdsourcing is the process of getting work or funding, usually online, from a crowd of people.
Human vs computer
Foldit: Solve puzzles for science!
From the creators of Rosetta@home (University of Washington)
Similar to Rosetta@home, Foldit aims to discover native protein structures faster, through a combination of crowdsourcing and distributed computing.
Game with a purpose
In 2011, gamers helped to decipher the crystal structure of M-PMV (a virus causing AIDS in monkeys)
Snapshot of Foldit game
III – BOINC’s architecture
Challenges while using a BOINC system:
• How can we depend on clients? Are they permanent?
• Will they reach the deadlines?
• Is the produced result valid?
• How can we trust a BOINC server?
• How can we trust a BOINC client?
• How to send jobs to various platforms?
Some features of a BOINC system:
homogeneous redundancy (sending workunits only to computers of the same platform—e.g.: Win XP SP2 only.)
workunit trickling (sending information to the server before the workunit completes)
locality scheduling (sending workunits to computers that already have the necessary files and creating work on demand)
work distribution based on host parameters (workunits requiring 512 MB of RAM, for example, will only be sent to hosts having at least that much RAM; more jobs are sent to hosts with multi-core CPUs/GPUs)
Double checking: the server sends the same workunit to at least two clients, then compares the results using a validation technique (bitwise, sample trivial, fuzzy) or a customized validation technique
Job scheduling is needed in both server & client: priority between different tasks, reaching deadlines
To avoid cheating, credits are given only after the job is validated by the server
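Double checking and credit granting from the list above can be sketched together. This is a simplified illustration, assuming results are lists of floats and a hypothetical fixed credit per result; it is not the code of BOINC's validator daemon.

```python
import math
from typing import Callable, List, Optional, Sequence

Output = Sequence[float]


def bitwise_equal(a: Output, b: Output) -> bool:
    """Strict comparison: every value must match exactly."""
    return list(a) == list(b)


def fuzzy_equal(a: Output, b: Output, rel_tol: float = 1e-6) -> bool:
    """Tolerant comparison for hosts whose floating point differs slightly."""
    return len(a) == len(b) and all(
        math.isclose(x, y, rel_tol=rel_tol) for x, y in zip(a, b))


def find_canonical(outputs: List[Output], quorum: int,
                   same: Callable[[Output, Output], bool]) -> Optional[int]:
    """Index of an output that `quorum` outputs agree with (itself included),
    or None if there is no such agreement yet."""
    for i, candidate in enumerate(outputs):
        if sum(same(candidate, other) for other in outputs) >= quorum:
            return i
    return None


def grant_credit(outputs: List[Output], quorum: int,
                 same: Callable[[Output, Output], bool],
                 credit_per_result: float = 10.0) -> List[float]:
    """Credit is granted only after validation, and only to agreeing hosts."""
    canonical = find_canonical(outputs, quorum, same)
    if canonical is None:
        return [0.0] * len(outputs)
    return [credit_per_result if same(outputs[canonical], out) else 0.0
            for out in outputs]


# Three hosts report on one workunit; the third one is wrong (or cheating).
reported = [[1.0, 2.0], [1.0000000001, 2.0], [9.9, 9.9]]
```

With the bitwise check the first two hosts disagree because of a tiny rounding difference, which is one motivation for homogeneous redundancy; the fuzzy check accepts both and rejects only the third.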
BOINC server daemons:
In multitasking computer systems, a daemon is a computer program that runs as a background process, rather than being under the direct control of an interactive user.
Generator, Transitioner, Feeder, Scheduler, Validator, Assimilator, File deleter
IV – How to join a popular BOINC project
• Register at BOINCstats (BAM)
• Choose your project(s)
• Join/create a team (optional)
• Download the BOINC client (BOINC Manager)
• Log in to the software
• The project(s) will be attached immediately
• Tasks will be downloaded
• Completed tasks will be uploaded to the server
• You will gain credit according to your performance
• New jobs will be downloaded
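The attach step can also be done without the graphical manager. The sketch below assumes the `boinccmd` command-line tool that ships with the BOINC client is installed and on the PATH, and it only builds the commands; the project URL and account key are hypothetical placeholders.

```python
import subprocess
from typing import List

BOINCCMD = "boinccmd"   # assumption: the tool is installed and on PATH


def attach_command(project_url: str, account_key: str) -> List[str]:
    """Build the command that attaches the local client to a project."""
    return [BOINCCMD, "--project_attach", project_url, account_key]


def list_tasks_command() -> List[str]:
    """Build the command that lists the tasks the client currently holds."""
    return [BOINCCMD, "--get_tasks"]


def run(command: List[str]) -> str:
    """Execute a command against the running client and return its output."""
    done = subprocess.run(command, check=True, capture_output=True, text=True)
    return done.stdout


# Hypothetical placeholders: substitute a real project URL and account key.
cmd = attach_command("https://project.example/", "YOUR_ACCOUNT_KEY")
```

Calling `run(cmd)` would perform the attach on a machine where the client is running; it is left out here so the sketch has no side effects.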
Step by step manual is available at: http://bit.ly/boinc_srbiau
BOINC client manager running four tasks (three projects)
V – How to set-up your own BOINC project
Project configuration in a nutshell:
• Configure a LAMP stack (Linux, Apache server, MySQL, PHP/Python). A VirtualBox Debian Linux image is available at BOINC’s website.
• Develop your own project (mostly done in C++ (GCC) or Fortran).
• Attach your project URL in BOINC Manager (client): “your server ip/projectname” (e.g. http://192.168.1.80/testproject)
• Monitor your project performance/administration at “your server ip/projectname_ops” (e.g. http://192.168.1.80/testproject_ops)
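The two addresses follow one naming pattern, which the short sketch below makes explicit. The helper name is hypothetical, and the plain `http://` scheme is simply taken from the example addresses above.

```python
def project_urls(server: str, project_name: str) -> dict:
    """Derive the client attach URL and the admin ("ops") URL of a project."""
    base = f"http://{server}/{project_name}"
    return {"attach": base, "ops": base + "_ops"}


urls = project_urls("192.168.1.80", "testproject")
```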
Step by step video is available at: http://bit.ly/boinc_srbiau
BOINC server (left) + BOINC manager client (right) using virtualization
Conceived to be used by scientists, not IT professionals
BOINC offers tools for
◦ Creating, starting, stopping and querying projects
◦ Adding new applications, new platforms, …
◦ Creating workunits
◦ Monitoring server performance
(All these procedures could be done by UNIX shell commands directly)
Server administration’s page:
What you probably need to implement a BOINC ecosystem:
• A precise evaluation before choosing BOINC as a solution, comparing it to other alternatives (JPPF, other cloud or distributed systems)
• A fundamental understanding of BOINC architecture, including its daemons, repositories, shell-script commands, etc.
• Good knowledge and experience of the Linux environment (shell)
• C/C++, Fortran programming skills (GCC, VC++, etc.)
• Server/network configuration
• Security issues
BOINC’s weaknesses:
• The architecture is centralized, in contrast to most Grid systems
• Insufficient interest from computer science
• Insufficient interest from scientists
• Insufficient interest from funding agencies
• According to official reports, the number of volunteers is not increasing
• Complexity of server and job submission
• Insufficient documentation (obsolete docs)
• Little shared experience on the web (you often can’t find your answers in BOINC forums, Stack Overflow, etc.)
• Risk of failure