Charlotte: Metacomputing on the Web

Transcript of Charlotte: Metacomputing on the Web

Page 1: Charlotte: Metacomputing on the Web

PDCS96 10–11 December 1996

Charlotte: Metacomputing on the Web

Arash Baratloo, Mehmet Karaul, Zvi Kedem, Peter Wyckoff

New York University

Page 2: Charlotte: Metacomputing on the Web

Roadmap

- Goals
- Virtual machine model
- Code sample
- Execution environment
- Distributed shared memory
- Experiments
- Summary

Page 3: Charlotte: Metacomputing on the Web

Goals

Programmer’s goals
- High-level language
- Reliable and predictable virtual machine (fault tolerance, heterogeneity of machine types and speeds, transiently available machines)
- Portability

User’s goals
- Utilize any machine on the Web (no account or shared file system required)
- Reliable and predictable virtual machine
- Authentication of results

CPU donor’s goals
- Protection from malicious code
- Full control over her resources
- No administrative hassles

Page 4: Charlotte: Metacomputing on the Web

Leveraging Java

Predictable and reliable virtual machine on top of the Java virtual machine

- Java-capable browsers widely available
- Emerging standard
- Security
- Heterogeneity
- Portability
- Compilers to appear in the near future

Page 5: Charlotte: Metacomputing on the Web

Virtual Machine Model

Separation of programming and execution environment
- Programmer develops applications for a perfect virtual machine
- Slow and fast machines handled transparently by the runtime system
- Transiently available machines handled transparently by the runtime system
- Fault tolerance handled transparently by the runtime system

High-level programming model
- Unbounded number of parallel routines
- Java plus three simple language constructs

Distributed shared memory
- Simple memory semantics

Page 6: Charlotte: Metacomputing on the Web

Example: Matrix Multiplication

Sample Charlotte Program:

    public class MatrixMult extends Droutine {
        public static int Size = 500;
        public Dfloat a[][] = new Dfloat[Size][Size];   // shared input matrix
        public Dfloat b[][] = new Dfloat[Size][Size];   // shared input matrix
        public Dfloat c[][] = new Dfloat[Size][Size];   // shared result matrix

        // Parallel routine: task 'id' computes row 'id' of c = a * b.
        public void drun(int numTasks, int id) {
            float sum;
            for (int i = 0; i < Size; i++) {
                sum = 0;
                for (int j = 0; j < Size; j++)
                    sum += a[id][j].get() * b[j][i].get();
                c[id][i].set(sum);
            }
        }

        public void run() {
            InitMatrix(a);              // helper routines, not shown on the slide
            InitMatrix(b);
            parBegin();                 // begin a parallel step
            addDroutine(this, Size);    // register Size parallel routines
            parEnd();                   // wait for all routines to complete
            PrintMatrix(c);
        }
    }
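In this example, addDroutine(this, Size) registers Size parallel routines between parBegin() and parEnd(); the runtime invokes drun(numTasks, id) once per task index, so the routine with index id computes row id of the result c from the shared Dfloat matrices a and b.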

Page 7: Charlotte: Metacomputing on the Web

Execution Environment

The same Charlotte program runs on:
- a single machine
- multiple machines (one user machine and a set of potential volunteer machines)

Interaction among machines solely through Java-capable browsers

[Diagram: user machine and volunteer machines interacting over the WWW]

Page 8: Charlotte: Metacomputing on the Web

Eager Scheduling

Difficulties in a distributed system
- Detection of crashed or failed machines
- Detection of slow machines

Solution: eager scheduling
- Volunteer machines contact the user machine for work
- Routines may be assigned to multiple machines

Difficulties with eager scheduling
- Inconsistent memory views across routines and across different executions of the same routine
- Violation of exactly-once semantics

Solution: Two-phase Idempotent Execution Strategy (TIES)
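To make the scheduling idea concrete, here is a minimal sketch of an eager task pool, assuming a hypothetical EagerTaskPool class; it is not Charlotte's actual manager code. Unfinished routines stay eligible for reassignment, and only the first committed result of each routine counts, which is the property TIES preserves for shared data.

    import java.util.ArrayList;
    import java.util.List;

    // Hypothetical sketch of eager scheduling; class and method names are
    // assumptions for illustration, not part of the Charlotte API.
    class EagerTaskPool {
        private final boolean[] done;
        private final List<Integer> pending = new ArrayList<>();

        EagerTaskPool(int numTasks) {
            done = new boolean[numTasks];
            for (int id = 0; id < numTasks; id++) pending.add(id);
        }

        // A volunteer asks for work: hand out any routine that has not yet
        // finished, cycling through unfinished routines so slow or crashed
        // volunteers are masked by re-execution elsewhere.
        synchronized int nextTask() {
            pending.removeIf(id -> done[id]);
            if (pending.isEmpty()) return -1;   // every routine has completed
            int id = pending.remove(0);
            pending.add(id);                    // stays eligible until a result commits
            return id;
        }

        // A volunteer reports a result: only the first report commits; duplicates
        // from redundant executions are discarded, preserving exactly-once effects.
        synchronized boolean commit(int id) {
            if (done[id]) return false;
            done[id] = true;
            return true;
        }
    }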

Page 9: Charlotte: Metacomputing on the Web

DSM

Why DSM?
- Easy to use
- Transparent to both the programmer and the user

Design objectives
- Heterogeneity
- Operating system independence
- Compiler independence

These objectives call for an object-based approach to implementing DSM.

Page 10: Charlotte: Metacomputing on the Web

DSM — Implementation

- Realized at the object level
- All objects have a unique identifier
- Identifiers are mapped identically to objects across machines
- Data is transferred on demand
- Granularity can be controlled
- False sharing is avoided

[Diagram: the memory of the user's machine and the memory of a volunteer machine, each containing local data and a shared-data table with columns ID, Dirty?, and Value]
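As an illustration of the object-based approach, the following is a minimal sketch of a shared-scalar wrapper in the spirit of Dfloat, assuming a hypothetical SharedFloat class with a placeholder fetchFromManager call; it is not Charlotte's actual implementation. Each object carries a machine-independent identifier, a value, and a dirty flag, and data moves only when get() or set() is invoked.

    // Hypothetical object-level DSM wrapper; names and details are assumptions.
    class SharedFloat {
        private static int nextId = 0;

        private final int id;           // the same id denotes the same object on every machine
        private float value;
        private boolean valid = false;  // do we hold an up-to-date local copy?
        private boolean dirty = false;  // has it been written locally?

        SharedFloat() {
            synchronized (SharedFloat.class) { id = nextId++; }
        }

        // Reading an invalid copy on a volunteer fetches the value on demand
        // from the user machine (fetchFromManager stands in for that request).
        synchronized float get() {
            if (!valid) {
                value = fetchFromManager(id);
                valid = true;
            }
            return value;
        }

        // Writes stay local and are marked dirty; shipping them back only when
        // the routine finishes keeps redundant executions idempotent (the TIES idea).
        synchronized void set(float v) {
            value = v;
            valid = true;
            dirty = true;
        }

        boolean isDirty() { return dirty; }

        private static float fetchFromManager(int id) {
            return 0.0f;                // placeholder for a network request keyed by id
        }
    }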

Page 11: Charlotte: Metacomputing on the Web

Experiments

10 Sun SPARC 5 workstations

10 Mbit/s Ethernet

Application: Ising model

Measured time is wall-clock time

Three tests
- Scalability
- Load balancing
- Transiently available machines

Page 12: Charlotte: Metacomputing on the Web

Experiment 1: Scalability

[Chart: wall-clock time (0–1300) and speedup (0–10) versus the number of equivalent machines, from sequential (S) up to 10]

Page 13: Charlotte: Metacomputing on the Web

Experiment 2: Load balancing

[Chart: wall-clock time (0–300) and speedup in equivalent machines (0–5) for machine mixes 5F+0H, 4F+2H, 3F+4H, 2F+6H, 1F+8H, and 0F+10H]

Page 14: Charlotte: Metacomputing on the Web

Experiment 3: Transient Availability

Five machines used

After 100 seconds: 1 machine crashed and 1 added

After another 100 seconds: 2 machines crashed and 2 added

90.18% efficiency relative to 5 reliable machines

86.25% efficiency relative to sequential execution (95.64% for 5 reliable machines)

Page 15: Charlotte: Metacomputing on the Web

Summary

Charlotte targets the Web

Leverages the benefits of Java (security, heterogeneity, wide availability, ...)

Seamlessly crosses administrative boundaries

Distribution of program and data

DSM with no compiler or OS support