Volunteer Computing with BOINC

Dr. David P. Anderson
University of California, Berkeley

SC10, Nov. 14, 2010

Goals

• Explain volunteer computing
• Teach how to create a volunteer computing project using BOINC

Target audience:
• High-throughput computing users
• Technical skills:
  - Basic Linux/Apache sysadmin, familiarity with PHP, SQL and XML, C/C++ (optional)

Outline

• Why use volunteer computing?
• Basic concepts of BOINC
• Developing BOINC applications
(15 minute break)
• Deploying a BOINC server
• Deploying applications
• Submitting jobs
• Organizational issues

Part 1:

Why use volunteer computing?

The Consumer Digital Infrastructure

• 1 billion PCs
  - current GPUs: 1 TeraFLOPS (1,000 ExaFLOPS total)
  - Storage: ~1,000 Exabytes
• Commodity Internet: 10-1,000 Mbps to home
• Consumers pay for
  - hardware
  - sysadmin
  - network costs
  - electricity
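
The aggregate figures on this slide follow from simple multiplication. The sketch below checks them, assuming one 1-TeraFLOPS GPU per PC; the function names are illustrative and not part of BOINC.

```python
# Back-of-the-envelope check of the slide's aggregate figures.
# Assumption: every one of the 1 billion PCs has one 1-TeraFLOPS GPU.
PCS = 1_000_000_000        # 1 billion PCs (from the slide)
FLOPS_PER_GPU = 1e12       # 1 TeraFLOPS (from the slide)
EXA = 1e18


def total_exaflops(n_hosts: int, flops_per_host: float) -> float:
    """Aggregate peak throughput in ExaFLOPS."""
    return n_hosts * flops_per_host / EXA


def storage_per_pc_bytes(total_exabytes: float, n_hosts: int) -> float:
    """Average storage per PC implied by an aggregate figure in Exabytes."""
    return total_exabytes * EXA / n_hosts


print(total_exaflops(PCS, FLOPS_PER_GPU))   # aggregate ExaFLOPS
print(storage_per_pc_bytes(1000, PCS))      # bytes per PC (1e12 = 1 TB)
```

Under these assumptions the 1,000 ExaFLOPS total matches the slide, and ~1,000 Exabytes of storage corresponds to roughly 1 TB per PC.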

Volunteer computing

• PC owners donate computing resources to projects (e.g., computational science)
• Applications run
  - at zero priority while PC in use, and/or
  - while PC is not in use

Examples

Project          start  where        area            peak #hosts
GIMPS            1994                math            10,000
distributed.net  1995                cryptography    100,000
SETI@home I      1999   UCB          SETI            600,000
Folding@home     1999   Stanford     biology         200,000
United Devices   2002   commercial   biomedicine     200,000
CPDN             2003   Oxford       climate change  150,000
LHC@home         2004   CERN         physics         60,000
Predictor@home   2004   Scripps      biology         100,000
WCG              2004   commercial   biomedicine     200,000
Einstein@home    2005   LIGO         astrophysics    200,000
SETI@home II     2005   UCB          SETI            850,000
Rosetta@home     2005   U. Wash      biology         100,000
SIMAP            2005   T.U. Munich  bioinformatics  10,000
...              ...    ...          ...             ...

Current status

• ~50 projects
• 500,000 volunteers
• 800,000 computers

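[Figure: computing platforms placed by job type (single job vs. multiple jobs) and # processors (1, 100, 1000, 10K-1M). Regions labeled high-performance computing and high-throughput computing; platforms shown: supercomputer, cluster (MPI), cluster (batch), Grid, commercial cloud, volunteer computing.]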

Volunteer computing is different

• You don’t buy resources; you ask for them
• Resources are:
  - heterogeneous
  - sporadically available and connected
  - untrusted and not private
  - behind firewalls/NATs/proxies

Part 2:

Basic concepts of BOINC

About BOINC

• Funded by NSF since 2002
• Open-source (LGPL)
• Based at UC Berkeley
• Few staff, but lots of volunteers
  - software testing
  - translation
  - documentation
  - support (email lists, message boards, Skype)

Volunteers and projects

[Figure: volunteers on one side and projects (CPDN, LHC@home, WCG) on the other, linked by attachments.]

BOINC software overview

[Figure: a volunteer host (client, apps, screensaver, GUI) communicates over HTTP with a project server (scheduler, MySQL, data server, daemons).]

BOINC scheduler

[Figure: applications have app versions (Win32, Win32 N-core, Win32 + NVIDIA, Win64, Mac OS X); jobs have instances.]

Scheduler request:
- HW, SW description
- existing workload
- per resource type: # of instances requested, # of seconds requested

Scheduler reply:
- app version descriptions
- job descriptions
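
One way to picture the scheduler's choice among app versions is the sketch below. It is a simplified illustration, not BOINC's scheduler code: the platform labels, the `projected_flops` field, and the "fastest usable version wins" rule are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class AppVersion:
    platform: str           # illustrative labels: "win32", "win64", "macosx"
    plan_class: str         # "" for plain CPU, "ncore", or "nvidia"
    projected_flops: float  # assumed speed estimate for ranking


@dataclass(frozen=True)
class HostRequest:
    platforms: List[str]    # platforms the host can run
    n_cpus: int
    has_nvidia_gpu: bool


def usable(version: AppVersion, host: HostRequest) -> bool:
    """Can this app version run on this host at all?"""
    if version.platform not in host.platforms:
        return False
    if version.plan_class == "nvidia":
        return host.has_nvidia_gpu
    if version.plan_class == "ncore":
        return host.n_cpus > 1
    return True


def pick_app_version(versions: List[AppVersion],
                     host: HostRequest) -> Optional[AppVersion]:
    """Return the fastest usable version, or None if nothing fits."""
    candidates = [v for v in versions if usable(v, host)]
    return max(candidates, key=lambda v: v.projected_flops, default=None)


VERSIONS = [
    AppVersion("win32", "", 2e9),
    AppVersion("win32", "ncore", 6e9),
    AppVersion("win32", "nvidia", 1e11),
    AppVersion("win64", "", 2.5e9),
    AppVersion("macosx", "", 2e9),
]
```

A Win64 host that also lists "win32" as a supported platform would be offered the Win32 + NVIDIA version if it has a suitable GPU, and a CPU version otherwise.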

Job replication

• Job instances may fail or return wrong results
• Job replication: do 2, see if they agree
  - “agree” may be fuzzy
• Homogeneous replication
  - numerical equivalence of hosts
• Adaptive replication
  - reduce replication for hosts that seem trustworthy
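
The replication ideas above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not BOINC's validator: results are modeled as lists of floats, "fuzzy" agreement is a relative tolerance, and the adaptive rule (a fixed streak threshold) is a hypothetical simplification.

```python
import math
from typing import List, Optional, Sequence


def results_agree(a: Sequence[float], b: Sequence[float],
                  rel_tol: float = 1e-6) -> bool:
    """Fuzzy comparison: same length and element-wise close."""
    return len(a) == len(b) and all(
        math.isclose(x, y, rel_tol=rel_tol) for x, y in zip(a, b))


def find_canonical(results: List[Sequence[float]], min_quorum: int = 2,
                   rel_tol: float = 1e-6) -> Optional[Sequence[float]]:
    """Return a result that at least min_quorum results agree with
    (counting itself), or None if there is no quorum yet."""
    for candidate in results:
        matches = sum(results_agree(candidate, other, rel_tol)
                      for other in results)
        if matches >= min_quorum:
            return candidate
    return None


def replication_level(consecutive_valid: int, threshold: int = 10) -> int:
    """Adaptive replication (assumed rule): hosts with a long streak of
    valid results get a single instance; all others get two."""
    return 1 if consecutive_valid >= threshold else 2
```

When `find_canonical` returns None, a project would typically issue another instance and try again once its result arrives.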

The job pipeline

[Figure: work generator -> BOINC -> validator -> assimilator]

The BOINC data model

• App versions, job inputs, and job outputs can consist of arbitrarily many files
• Each file has a physical name (unique, immutable); each reference to a file has a “logical name”
• Files have various attributes (e.g., sticky)
• Each file can have one or more URLs, and is transferred via HTTP
• App version files are digitally signed
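
The split between physical and logical names can be modeled as a per-job lookup table. The classes and the example.com URLs below are hypothetical, chosen only to illustrate the idea; they are not BOINC data structures.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass(frozen=True)
class FileInfo:
    physical_name: str        # unique and immutable
    urls: Tuple[str, ...]     # one or more download URLs
    sticky: bool = False      # example attribute from the slide


@dataclass
class Job:
    # logical name (what the app opens) -> file description
    file_refs: Dict[str, FileInfo] = field(default_factory=dict)

    def add_ref(self, logical_name: str, info: FileInfo) -> None:
        self.file_refs[logical_name] = info

    def resolve(self, logical_name: str) -> str:
        """Map the name the app uses to the physical file name."""
        return self.file_refs[logical_name].physical_name


job_a, job_b = Job(), Job()
job_a.add_ref("in", FileInfo("wu_0001_input",
                             ("http://example.com/download/wu_0001_input",)))
job_b.add_ref("in", FileInfo("wu_0002_input",
                             ("http://example.com/download/wu_0002_input",)))
print(job_a.resolve("in"), job_b.resolve("in"))
```

The application always opens "in"; which physical file that means depends on the job, so the same executable works unchanged across jobs.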

What kinds of jobs can BOINC handle?

• Pretty much anything you’d run on a Grid
• Bag of tasks (but IPC support soon)
• Short/long jobs
• Data intensive, up to a point
• Geared towards
  - Few apps, many jobs (high startup cost per app)
  - Jobs with high slack time

Part 3:

Application development for BOINC

The BOINC runtime environment

[Figure: the processes and files that make up the runtime environment.]

Native BOINC applications

• boinc_init()
  - create runtime system thread
• boinc_finish()
  - write finish file
• boinc_resolve_filename(logical, physical)
• boinc_fraction_done(x)

Checkpointing

• bool boinc_time_to_checkpoint()
  - call when in checkpointable state
• boinc_checkpoint_done()
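
The checkpointing pattern can be illustrated without the BOINC library. In the sketch below, the `time_to_checkpoint` callable is a hypothetical stand-in for boinc_time_to_checkpoint(), and the `stop_after` parameter only exists to simulate an interruption; the real API is C/C++ and is not used here.

```python
import json
import os
from typing import Callable, Optional


def load_checkpoint(path: str) -> dict:
    """Resume from the checkpoint file if present, else start fresh."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return {"i": 0, "total": 0}


def save_checkpoint(path: str, state: dict) -> None:
    """Write to a temp file, then rename, so a crash mid-write
    never leaves a half-written checkpoint behind."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(state, f)
    os.replace(tmp, path)


def run(path: str, n_steps: int, time_to_checkpoint: Callable[[], bool],
        stop_after: Optional[int] = None) -> Optional[int]:
    """Sum 0..n_steps-1, checkpointing whenever asked.
    Returns None if interrupted (simulated via stop_after)."""
    state = load_checkpoint(path)
    steps_run = 0
    while state["i"] < n_steps:
        state["total"] += state["i"]
        state["i"] += 1
        steps_run += 1
        # The state is consistent here, so this is a checkpointable point.
        if time_to_checkpoint():
            save_checkpoint(path, state)
        if stop_after is not None and steps_run >= stop_after:
            return None
    return state["total"]
```

An interrupted run followed by a second call with the same path picks up at the last saved step instead of starting over.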

The BOINC wrapper

• Can use for legacy apps
• XML input file lists sub-jobs
  - executable, input files
• What it does:
  - interfaces to BOINC client
  - copies files to/from slot directory
  - runs executables
  - does checkpointing at sub-job level

Building app versions

• Linux
  - gcc
• Windows
  - Visual Studio
  - MinGW (gcc)
• Mac OS X
  - Xcode

Multithread apps

• boinc_init_parallel()
• Allows suspend/resume of all threads
  - Unix: fork/exec
  - Windows: direct thread control

GPU app versions

• Develop for NVIDIA or ATI, with CUDA, CAL, OpenCL, etc. (BOINC supplies samples)
• Each version has a “plan class”
• For each plan class, supply a function that determines
  - can app run on this host?
    • hardware, driver version, etc.
  - what resources will it use?
    • #CPUs, #GPUs, GPU RAM, etc.
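
A plan class function answers the two questions above for one host. The sketch below shows the shape of such a function; the `Host` fields, the thresholds, and the resource numbers are all hypothetical values picked for illustration, not BOINC's actual interface or defaults.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Host:
    n_cpus: int
    n_nvidia_gpus: int
    gpu_ram_mb: int
    driver_version: int


@dataclass(frozen=True)
class ResourceUsage:
    n_cpus: float       # CPU fraction used to feed the GPU
    n_gpus: float
    gpu_ram_mb: int


def plan_cuda(host: Host, min_driver: int = 19000,
              min_gpu_ram_mb: int = 256) -> Optional[ResourceUsage]:
    """Return the resources the app would use on this host,
    or None if the app cannot run there."""
    if host.n_nvidia_gpus < 1:
        return None                      # no suitable hardware
    if host.driver_version < min_driver:
        return None                      # driver too old
    if host.gpu_ram_mb < min_gpu_ram_mb:
        return None                      # not enough GPU memory
    return ResourceUsage(n_cpus=0.1, n_gpus=1.0, gpu_ram_mb=min_gpu_ram_mb)
```

Returning None for "cannot run" and a usage record otherwise keeps both answers in a single call.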

VM apps

• Develop apps on your favorite OS
• Create a VirtualBox VM image
• App version consists of
  - VM wrapper (supplied by BOINC)
  - VM image
  - app executable

Part 4:

Deploying a BOINC server

Hardware options

• Native Linux host
  - download/compile BOINC software
• BOINC server VM (VMware/Debian)
• BOINC Amazon EC2 image

Components of a project

• Master URL
• name
• MySQL database
• Directory hierarchy
• A set of daemon processes and cron jobs

Processes

[Figure: server processes arranged around the MySQL DB: work generator, validator, assimilator, feeder, transitioner, file deleter, DB purger, and the scheduler, which communicates with clients.]

Project directory hierarchy

apps/            application files
bin/             daemon programs
cgi-bin/         BOINC scheduler and upload CGI
config.xml       configuration file
download/        downloadable files
html/            web site; master URL points here
keys/            keys for code signing, upload auth
log_(hostname)   daemon log files
project.xml      list of platforms and apps
upload/          uploaded files

BOINC database

Tables:
• platform
• app
• app_version
• user
• host
• workunit
• result
• ...

Creating a project

make_project name

• creates
  - directory hierarchy
  - DB
  - mods for httpd.conf
  - crontab entry

Project configuration and control

• config.xml
  - scheduling and other options
  - list of daemons
  - list of periodic tasks
• project control
  - bin/start: start daemons, enable scheduler
  - bin/stop: stop daemons, disable scheduler
  - bin/status

Scaling a BOINC server

• Components can run on different machines sharing a file system
• Each component can be distributed
• MySQL server is typically the bottleneck
• 1 server machine can issue ~100K jobs/day; 4 machines can issue > 1 million
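
To get a feel for these rates, it helps to convert jobs per day into jobs per second. The helper below is plain arithmetic on the slide's figures; the function name is illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def jobs_per_second(jobs_per_day: float) -> float:
    """Average dispatch rate implied by a daily job count."""
    return jobs_per_day / SECONDS_PER_DAY


one_machine = jobs_per_second(100_000)      # ~100K jobs/day
four_machines = jobs_per_second(1_000_000)  # 1 million jobs/day

print(f"1 machine:  {one_machine:.2f} jobs/s")
print(f"4 machines: {four_machines:.2f} jobs/s")
```

So a single server averages a little over one job per second, and the four-machine configuration more than eleven.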

Part 5:

Deploying applications

Adding an application

• edit project.xml

    <app>
        <name>multi_thread</name>
        <user_friendly_name>Test multi-thread apps</user_friendly_name>
    </app>

• run bin/xadd

Adding an application version

• Create application version directory

    apps/
        uppercase/
            uppercase_6.14_windows_intelx86__cuda.exe/
                uppercase_6.14_windows_intelx86__cuda.exe
                graphics_app=uppercase_graphics_6.14_windows_intelx86.exe
                logo.jpg
                Helvetica.txf

• Sign files on offline computer
• run bin/update_versions

Part 6:

Submitting jobs

Describing job inputs

• Input template file

    <file_info>
        <number>0</number>
    </file_info>
    <workunit>
        <file_ref>
            <file_number>0</file_number>
            <open_name>in</open_name>
        </file_ref>
        <target_nresults>1</target_nresults>
        <min_quorum>1</min_quorum>
        <command_line>-cpu_time 60</command_line>
        <rsc_fpops_bound>446797000000000</rsc_fpops_bound>
        <rsc_fpops_est>279248000000000</rsc_fpops_est>
    </workunit>
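
The two rsc_fpops values in the template are operation counts: rsc_fpops_est is the estimated number of floating-point operations for the job, and rsc_fpops_bound is the limit beyond which a running job is aborted. The sketch below turns them into time figures for an assumed 1 GFLOPS host; the host speed and function names are illustrative.

```python
RSC_FPOPS_EST = 279_248_000_000_000     # from the template
RSC_FPOPS_BOUND = 446_797_000_000_000   # from the template


def estimated_runtime_seconds(fpops_est: float, host_flops: float) -> float:
    """Runtime estimate: operations divided by host speed."""
    return fpops_est / host_flops


def exceeds_bound(fpops_done: float, fpops_bound: float) -> bool:
    """True once a job has used more operations than its bound allows."""
    return fpops_done > fpops_bound


HOST_FLOPS = 1e9  # assumption: a host sustaining 1 GFLOPS

est_hours = estimated_runtime_seconds(RSC_FPOPS_EST, HOST_FLOPS) / 3600
bound_ratio = RSC_FPOPS_BOUND / RSC_FPOPS_EST

print(f"estimated runtime: {est_hours:.1f} hours")
print(f"bound is {bound_ratio:.2f}x the estimate")
```

On such a host the job is estimated at about 77.6 hours, and the bound leaves roughly 60% headroom above the estimate.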

Describing job outputs

• Output template file

    <file_info>
        <name><OUTFILE_0/></name>
        <generated_locally/>
        <upload_when_present/>
        <max_nbytes>5000000</max_nbytes>
        <url><UPLOAD_URL/></url>
    </file_info>
    <result>
        <file_ref>
            <file_name><OUTFILE_0/></file_name>
            <open_name>out</open_name>
        </file_ref>
    </result>

Submitting a job

• Stage input files

    cp test_files/12ja04aa `bin/dir_hier_path 12ja04aa`

• Submit job

    create_work --appname A --wu_name B --wu_template C --result_template D
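
When jobs are submitted from a script, the create_work command line can be assembled programmatically. The helper below is a hypothetical sketch: it uses only the four options shown on the slide, and it assumes that the program lives at bin/create_work, that the options take a double hyphen, and that input file names follow the options.

```python
import shlex
from typing import List, Sequence


def build_create_work_cmd(appname: str, wu_name: str, wu_template: str,
                          result_template: str,
                          input_files: Sequence[str] = ()) -> List[str]:
    """Build the argument list for one job submission."""
    cmd = [
        "bin/create_work",                 # assumed location
        "--appname", appname,
        "--wu_name", wu_name,
        "--wu_template", wu_template,
        "--result_template", result_template,
    ]
    cmd.extend(input_files)                # assumed: staged inputs go last
    return cmd


def as_shell(cmd: Sequence[str]) -> str:
    """Render the argument list as a safely quoted shell command."""
    return shlex.join(cmd)


print(as_shell(build_create_work_cmd("A", "B", "C", "D", ["12ja04aa"])))
```

Keeping the command as a list of arguments, rather than one string, avoids quoting problems if it is later passed to a process launcher.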

Part 7:

Organizational issues

Single-scientist projects

• Need to:
  - Port apps
  - Get publicity
  - interface with public
  - maintain servers
• Not many research groups have the resources
• And it creates a lot of competing “brands”

Umbrella projects

Example: IBM World Community Grid

[Figure: project functions shown: publicity, web development, sysadmin, app porting.]

The Berkeley@home model

• A university has
  - scientists
  - a powerful “brand”
  - PR resources
  - IT infrastructure
  - lots of alumni (UCB: 500,000)

Hubs

• nanoHUB: “science portal” for nanoscience
  - social network + “app store”
  - sharing of ideas, data, software
  - computational portal
• HUBzero: generalization to other areas
  - currently ~20 hubs
• Integration of BOINC with HUBzero
  - each hub has a volunteer computing project