David P. Anderson Space Sciences Lab U.C. Berkeley Exa-Scale Volunteer Computing.
Exa-Scale Volunteer Computing
David P. Anderson
Space Sciences Lab
U.C. Berkeley
A brief history of volunteer computing
(Timeline figure; axis: 1995, 2000, 2005, 2008)
Applications
  distributed.net, GIMPS
  SETI@home, Folding@home
  Climateprediction.net
  Predictor@home, WCG, Einstein, Rosetta, ...
Platforms
  Bayanihan, Javelin, ...
  Entropia, United Devices, ...
  BOINC
Applications
Computational biology
  protein folding and structure prediction: Rosetta++; biomedical, plant genomics
  virtual drug design: Autodock, CHARMM; cancer, AIDS, Alzheimer's, Dengue fever
  genetic linkage analysis
  phylogenetics
Epidemiology
  Malaria model
Environmental studies
  "Virtual Prairie" simulation
More applications
High-energy physics
  CERN: accelerator, collision simulations
Climate prediction
  HADSM3 (U.K.)
  WRF (NCAR)
Astronomy
  gravitational wave detection
  SETI
  Milky Way, Big Bang studies
Nanotechnology
Mathematics
Distributed seismography
The PetaFLOPS milestone
Folding@home: Sept 19, 2007
  current average: 2.67 PetaFLOPS
  40% Cell (40K Sony PS3)
  40% GPU (10K NVIDIA)
  20% CPU (250,000 computers)
BOINC: Jan 31, 2008
  current average: 1.2 PetaFLOPS (568,000 computers; 87% Windows)
First supercomputer: May 25, 2008
  IBM RoadRunner
  1.026 PetaFLOPS
  $133M
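
The Folding@home breakdown implies very different average contributions per device. A minimal sketch of that arithmetic, using only the figures quoted on the slide; the function and variable names are illustrative, and the result is an average sustained rate, not a peak hardware rating.

```python
# Figures quoted on the slide for Folding@home.
TOTAL_PFLOPS = 2.67

# class name -> (share of total throughput, number of devices)
breakdown = {
    "Cell (Sony PS3)": (0.40, 40_000),
    "GPU (NVIDIA)": (0.40, 10_000),
    "CPU": (0.20, 250_000),
}

def gflops_per_device(share, devices, total_pflops=TOTAL_PFLOPS):
    """Average sustained GFLOPS contributed by one device of a class."""
    return share * total_pflops * 1e6 / devices  # 1 PetaFLOPS = 1e6 GFLOPS

per_device = {name: gflops_per_device(s, n) for name, (s, n) in breakdown.items()}

for name, g in per_device.items():
    print(f"{name}: {g:.1f} GFLOPS per device")
```

Under these figures a GPU contributes roughly fifty times as much as an average CPU host, which is the motivation for the GPU estimate later in the talk.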
Cost per TeraFLOPS-year
  Cluster: $124,000
  Amazon EC2: $1,750,000
  Volunteer computing: $2,000
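
To make the gap concrete, the three costs can be expressed relative to volunteer computing. A small sketch using the slide's numbers; the helper name `relative_cost` is illustrative only.

```python
# Cost in US dollars per TeraFLOPS-year, as quoted on the slide.
cost_per_tflops_year = {
    "Cluster": 124_000,
    "Amazon EC2": 1_750_000,
    "Volunteer computing": 2_000,
}

def relative_cost(costs, baseline="Volunteer computing"):
    """Express every cost as a multiple of the baseline option."""
    base = costs[baseline]
    return {name: cost / base for name, cost in costs.items()}

ratios = relative_cost(cost_per_tflops_year)

for name, r in ratios.items():
    print(f"{name}: {r:g}x the cost of volunteer computing")
```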
The real goals
Enable paradigm-shifting science
  change the way resources are allocated
Revive public interest in science
  avoid return to the Dark Ages
So we need to:
  make volunteer computing feasible for all scientists
  involve the entire public, not just the geeks
  solve the "project discovery" problem
Progress: non-zero but small
The road to ExaFLOPS
Consumer computing resources
  CPUs in PCs (desktop, laptop)
  GPUs in PCs
  video-game consoles
  mobile devices
  home media devices
For each type:
  what is the performance potential?
  how will it change over time?
  ease of programming?
  energy efficiency?
  network connectivity?
  how to publicize and deploy?
CPUs
2 billion PCs by 2015
Performance increases largely from multicore
  need to develop parallel apps
Availability will decline (green computing)
1 ExaFLOPS:
  40,000,000 PCs x 100 GFLOPS x 0.25 availability
Promotional partner: MS? HP? Dell?
GPUs
NVIDIA 8800: ~500 GFLOPS
Programmability: CUDA; OpenCL?
1 ExaFLOPS:
  4,000,000 x 1,000 GFLOPS x 0.25 availability
Video-game consoles
Sony PlayStation 3
  Cell (~100 GFLOPS) + GPU
  Ships with Folding@home
  Hard to program
Microsoft Xbox
  3 PowerPC cores (~30 GFLOPS) + GPU
0.25 ExaFLOPS:
  10,000,000 consoles x 100 GFLOPS x 0.25 availability
Mobile devices (recharging)
Cell phones, PDAs, media players, Kindle, etc.
Hardware convergence
  0.5 GFLOPS CPU (Freescale i.MX37, 65 nm)
  low power (best FLOPS/watt)
  >256 MB RAM
  >10 GB stable storage
  Internet access
Software
  Google Android?
3.3 billion cell phones in 2010
0.5 ExaFLOPS:
  1B x 1 GFLOPS x 0.5 availability
Home media players
Cable set-top box, Blu-ray player
Hardware: low-end PC
Software environment: Java-based Multimedia Home Platform (MHP)
0.1 ExaFLOPS:
  100M x 2 GFLOPS x 0.5 availability
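
Each device class above uses the same back-of-envelope formula: devices x GFLOPS per device x availability. A short sketch that reproduces the slides' figures; the dictionary layout and the final sum across classes are additions for illustration, not something the slides state.

```python
EXA = 1e18
GIGA = 1e9

# class name -> (number of devices, GFLOPS per device, availability fraction),
# all taken from the per-class slides.
estimates = {
    "CPUs": (40_000_000, 100, 0.25),
    "GPUs": (4_000_000, 1_000, 0.25),
    "Video-game consoles": (10_000_000, 100, 0.25),
    "Mobile devices": (1_000_000_000, 1, 0.5),
    "Home media players": (100_000_000, 2, 0.5),
}

def exaflops(devices, gflops_each, availability):
    """Aggregate throughput of a device class, in ExaFLOPS."""
    return devices * gflops_each * GIGA * availability / EXA

totals = {name: exaflops(*args) for name, args in estimates.items()}
combined = sum(totals.values())  # assumption: the classes are simply additive

for name, e in totals.items():
    print(f"{name}: {e:.2f} ExaFLOPS")
print(f"Combined: {combined:.2f} ExaFLOPS")
```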
The BOINC project
NSF-funded, based at UC Berkeley
  2.5 FTEs
  many volunteers
Functions:
  develop technology for volunteer and desktop grid computing
  enable online communities
  do research related to volunteer computing
BOINC server software
Job scheduling
  high performance (10M jobs/day)
  scalability
Web code (PHP)
  community, social network
Ways to create a project:
  Set up a server on a Linux box
  Run BOINC server VM (VMware)
  Run BOINC server VM on Amazon EC2
(Architecture diagram: clients; scheduler (CGI); feeder; shared memory (~1K jobs); MySQL DB (~1M jobs); various daemons)
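
The server diagram suggests a small in-memory job cache sitting between the large job database and the scheduler. A toy sketch of that pattern, assuming the feeder copies unsent jobs into a fixed number of slots and the scheduler serves client requests from those slots only; `JobCache`, `refill`, and `take` are hypothetical names, and a `deque` stands in for the database.

```python
from collections import deque

class JobCache:
    """Fixed-size array of job slots, standing in for the shared-memory segment."""

    def __init__(self, nslots):
        self.slots = [None] * nslots

    def refill(self, db_queue):
        """Feeder step: copy unsent jobs from the database into empty slots."""
        added = 0
        for i, slot in enumerate(self.slots):
            if slot is None and db_queue:
                self.slots[i] = db_queue.popleft()
                added += 1
        return added

    def take(self, max_jobs):
        """Scheduler step: hand out cached jobs without querying the database."""
        taken = []
        for i, slot in enumerate(self.slots):
            if len(taken) == max_jobs:
                break
            if slot is not None:
                taken.append(slot)
                self.slots[i] = None
        return taken

db = deque(range(10))          # pretend database of 10 unsent job IDs
cache = JobCache(nslots=4)
first_fill = cache.refill(db)  # feeder fills all 4 slots
sent = cache.take(3)           # one client request takes 3 jobs
second_fill = cache.refill(db) # feeder tops the cache back up
```

The point of the design is that the per-request path touches only the small cache, so request handling does not scale with the size of the job table.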
BOINC client software
Cross-platform (Win/Mac/Linux)
Simple, configurable, secure, invisible
(Diagram: core client; application with BOINC library; graphics app with BOINC library; GUI; screensaver; local TCP; schedulers, data servers; user preferences, control)
BOINC's project/volunteer model
Projects
  Independent
  No central authority
  ID: URL
(Diagram: a volunteer PC with attachments to projects: Climateprediction.net, Superlink@Technion, World Community Grid, Rosetta@home)
Facilitating project discovery
(Diagram: volunteer PC; account manager; web services; BOINC-based projects: Climateprediction.net, Superlink@Technion, World Community Grid, Rosetta@home)
Multithread and coprocessor support
Client to scheduler: list of platforms, coprocessors, #CPUs
Scheduler to client: jobs, app versions
App planning function
  Inputs: host, app class
  Outputs: avg/max #CPUs, coprocessor usage, estimated FLOPS
(Diagram labels: application, platform, app versions, app version, job)
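
The app planning function maps a host description and an app class to a resource plan. A minimal sketch of what such a function might look like; the `Host` and `Plan` types, the class names "single", "multithread" and "gpu", and the specific numbers (a 4-thread cap, 0.1 CPU for a GPU job) are all hypothetical, not BOINC's actual API.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Host:
    ncpus: int
    cpu_gflops: float        # estimated speed of one core
    ngpus: int = 0
    gpu_gflops: float = 0.0  # estimated speed of one GPU

@dataclass
class Plan:
    avg_ncpus: float
    max_ncpus: float
    ngpus: float
    est_flops: float

def app_plan(host: Host, app_class: str) -> Optional[Plan]:
    """Return a resource plan for this host and app class, or None if unusable."""
    if app_class == "single":
        return Plan(1, 1, 0, host.cpu_gflops * 1e9)
    if app_class == "multithread":
        if host.ncpus < 2:
            return None
        n = min(host.ncpus, 4)  # assumed cap on useful threads
        return Plan(n, n, 0, n * host.cpu_gflops * 1e9)
    if app_class == "gpu":
        if host.ngpus < 1:
            return None
        return Plan(0.1, 1, 1, host.gpu_gflops * 1e9)
    return None

def best_class(host: Host, classes):
    """Pick the usable app class with the highest estimated FLOPS."""
    usable = {c: p for c in classes if (p := app_plan(host, c)) is not None}
    if not usable:
        return None
    return max(usable, key=lambda c: usable[c].est_flops)
```

With a function like this, the scheduler can compare app versions for a given host by estimated FLOPS and send the fastest one the host can actually run.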
Adaptive replication
Volunteer PCs are anonymous and untrusted
  how do we know results are correct?
Replicated computing
  require consensus of equivalent results
  2x throughput penalty
Adaptive replication
  maintain estimate of host "validity rate" V(h) (as the rule is written, a higher V(h) means a less trusted host)
  if V(h) > K, replicate
  else replicate with probability V(h)/K
  goal: reduce throughput penalty to 1+ε
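
The replication rule on this slide can be sketched directly. The code below follows the rule as written, so V(h) acts like an error-rate estimate (higher means less trusted); the function names are illustrative. The overhead model assumes that a replicated job costs exactly two instances and that hosts receive equal shares of jobs.

```python
import random

def replication_probability(v, k):
    """Probability that a job sent to a host with estimate v is replicated."""
    if v > k:
        return 1.0
    return v / k

def should_replicate(v, k, rng=random):
    """Randomized decision for a single job."""
    return rng.random() < replication_probability(v, k)

def expected_overhead(host_estimates, k):
    """Average instances per job: 1 plus the mean replication probability."""
    probs = [replication_probability(v, k) for v in host_estimates]
    return 1.0 + sum(probs) / len(probs)

K = 0.05
hosts = [0.0, 0.005, 0.01, 0.1]  # made-up V(h) values for four hosts
overhead = expected_overhead(hosts, K)
print(f"throughput penalty: {overhead:.3f}x")
```

As the host estimates fall toward zero, the penalty approaches 1, which is the 1+ε goal stated on the slide.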
Simulators
Scheduling policies
  client:
    when to fetch work? what project? how much?
    CPU scheduling
  server:
    what jobs to send to a given client?
Problems with in situ experimentation
  hard to control
  can do a lot of damage
Simulators
  client simulator: 1 client, N projects
  server simulator (EmBA): 1 project, N clients
Volunteer-facing features
Motivators
  competition
  community
Credit
  cross-project statistics
Web features
  friend lists, private messages, message boards
  teams
MySpace and Facebook widgets and apps
Organizational models
Single-scientist projects: a dead-end?
Campus-level meta-project
  e.g. U. of Houston: 1,000 instructional PCs; 5,000 faculty/staff; 30,000 students; 400,000 alumni
Lattice: U. Maryland Center for Bioinformatics
MindModeling.org
  ACT-R community (~20 universities)
IBM World Community Grid
  ~8 applications from various institutions
Extremadura (Spain)
  consortium of 5-10 universities
EDGeS (SZTAKI)
  EGEE@home?
Almere Grid: community
Distributed thinking
Stardust@home, Clickworkers, GalaxyZoo, Fold It!
What can people do better than computers?
New software initiatives
  Bossa: middleware for distributed thinking
    job queueing and replication
    volunteer skill estimation
  Bolt: middleware for web-based training and education
Shared infrastructure:
  BOINC: volunteer computing
  Bolt: teaching, training
  Bossa: distributed thinking
  BOINC Basics: accounts, groups, credit, communication
(Figure labels: malicious, useless, useful, savants)
Conclusion
Volunteer computing
  Some big achievements, but not close to potential
  Problems are organizational/political, not technical
  Volunteer computing + GPUs = ExaFLOPS
Distributed thinking
  What are the apps?
  What are middleware requirements?
Interested in either one? Let's talk!