On-line Parallel Tomography. Shava Smallen, UCSD.

On-line Parallel Tomography

Shava Smallen

Talk Outline

I) Introduction to On-line Parallel Tomography

II) Tunable On-line Parallel Tomography

III) User-directed application-level scheduler

IV) Experiments

V) Conclusion

What is tomography?

• A method for reconstructing the interior of an object from its projections

• At the National Center for Microscopy and Imaging Research (NCMIR), tomography is applied to electron microscopy to study specimens at the cellular and subcellular level

Example

[Figure: tomogram of a spiny dendrite. Images courtesy of Steve Lamont.]

Parallel Tomography at NCMIR

• Embarrassingly parallel

[Figure: diagram labeled specimen, projection, and scanline]

NCMIR Usage Scenarios

Off-line parallel tomography (off-line PT)

– Data resides somewhere on secondary storage

– Single, high quality tomogram

– Reduce turnaround time

– Previous work (HCW’00)

On-line parallel tomography (on-line PT)

– Data streamed from the electron microscope

• long makespan, configuration errors, etc.

– Iteratively computed tomogram

– Soft real-time execution

On-line PT

• Real-time feedback on quality of data acquisition

  1) First projection acquired from microscope

  2) Generate coarse tomogram

  3) Iteratively refine tomogram using subsequent projections (refresh)

• Update each voxel value

• Size of tomogram is constant
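The refinement loop above can be sketched as follows. This is only an illustration, not the NCMIR reconstruction code: it assumes a plain unfiltered backprojection of one scanline into one square slice with nearest-neighbour sampling, and the names `backproject_into` and `refine` are hypothetical.

```python
import math

def backproject_into(slice_, scanline, angle_deg):
    """Add one scanline, taken at the given tilt angle, into a square slice in place."""
    n = len(slice_)
    c = (n - 1) / 2.0
    cos_a = math.cos(math.radians(angle_deg))
    sin_a = math.sin(math.radians(angle_deg))
    for y in range(n):
        for x in range(n):
            # Position of voxel (x, y) along the scanline for this tilt angle.
            t = (x - c) * cos_a + (y - c) * sin_a + c
            i = int(round(t))
            if 0 <= i < len(scanline):
                slice_[y][x] += scanline[i]  # update each voxel value

def refine(n, projections):
    """Yield the slice after each (angle, scanline) pair; its size stays n x n."""
    slice_ = [[0.0] * n for _ in range(n)]
    for angle_deg, scanline in projections:
        backproject_into(slice_, scanline, angle_deg)
        yield slice_
```

Each value yielded by `refine` plays the role of one refresh: the same fixed-size slice, updated with one more projection.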

NCMIR Target Platform

• Multi-user, heterogeneous resources

  – NCMIR cluster

    • SGI Indigo2, SGI Octane, SUN ULTRA, SUN Enterprise

    • IRIX, Solaris

  – Meteor cluster

    • Pentium III dual proc

    • Linux, PBS

  – Blue Horizon

    • AIX, LoadLeveler, Maui Scheduler

On-line PT Architecture

[Figure: architecture diagram. Labels: preprocessor, projection, scanlines, network, slices, writer, tomogram.]

On-line PT Design

1) Frame on-line parallel tomography as a tunable application

  – Resource limitations / dynamic

  – Availability of alternate configurations [Chang, et al.]

    • each configuration corresponds to different output quality and resource usage

2) Coupled with user-directed application-level scheduler (AppLeS)

  – adaptive scheduler

  – promote application performance

On-line PT Configuration

• Triple: (f, r, su)

• Reduction factor (f)

  – Reduce resolution of data → reduce both computation and communication

• Projections per refresh (r)

  – Reduce refinement frequency → reduce communication

• Service units (su)

  – Increase cost of execution → increase computational power
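A configuration triple can be represented directly in code. The sketch below is illustrative: the `Triple` class and helper names are hypothetical, and it assumes that f divides each projection dimension and that one refresh is produced for every r projections.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Triple:
    f: int   # reduction factor
    r: int   # projections per refresh
    su: int  # service units

def reduced_size(width, height, f):
    """Projection size after reduction, assuming f divides each dimension."""
    return width // f, height // f

def refreshes(num_projections, r):
    """Number of tomogram refreshes when one refresh covers r projections."""
    return -(-num_projections // r)  # ceiling division

config = Triple(f=4, r=2, su=8)
print(reduced_size(1024, 1024, config.f), refreshes(60, config.r))
```

Under these assumptions, raising f shrinks the data that must be both computed and sent, while raising r only lowers how often results are sent.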

User Preferences

• Best configuration (f, r, su) = (1, 1, 0)

• Several possible configurations → user specifies bounds

  – projections should be at least size 256x256

    • 1 ≤ f ≤ 4 or 1 ≤ f ≤ 8

  – user could tolerate up to a 10 minute wait

    • 1 ≤ r ≤ 13

  – reasonable upper bound

    • 0 ≤ su ≤ (50 x acquisition period x c)
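One way the bounds on f, r, and su could follow from the stated preferences is sketched below. The function names are hypothetical. The 45-second acquisition period is an assumption chosen only because it makes a 10 minute tolerance give r ≤ 13, and `c` is treated as an opaque constant because the slide does not define it.

```python
def f_upper_bound(projection_size, min_size=256):
    """Largest reduction factor that keeps projections at least min_size wide."""
    return projection_size // min_size

def r_upper_bound(max_wait_s, acquisition_period_s):
    """Most projections per refresh that fit within the tolerated wait."""
    return int(max_wait_s // acquisition_period_s)

def su_upper_bound(acquisition_period_s, c):
    """Upper bound on service units, as written on the slide: 50 x period x c."""
    return 50 * acquisition_period_s * c

print(f_upper_bound(1024), f_upper_bound(2048))  # bounds for 1k and 2k projections
print(r_upper_bound(600, 45))                    # assumed 45 s acquisition period
```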

User-directed

• Feasible?

  – Use dynamic load information

  – if work allocation found

• Better?

  – e.g.

1. (1, 6, 4) - best f

2. (2, 2, 8) - good su/r

3. (2, 1, 20) - best r

(Triples are written as: reduction factor, projections per refresh, service units.)
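The three example triples each win on a different parameter, which is why the choice is left to the user. The sketch below checks this, assuming that a lower value is better for each of f, r, and su (consistent with (1, 1, 0) being the best configuration); `dominates` is a hypothetical helper.

```python
triples = [(1, 6, 4), (2, 2, 8), (2, 1, 20)]  # (f, r, su) from the slide

best_f = min(triples, key=lambda t: t[0])
best_r = min(triples, key=lambda t: t[1])

def dominates(a, b):
    """True if a is no worse than b in every parameter and differs from b."""
    return a != b and all(x <= y for x, y in zip(a, b))

# Triples that no other triple beats on all three parameters at once.
undominated = [t for t in triples if not any(dominates(o, t) for o in triples)]
print(best_f, best_r, undominated)
```

All three triples remain in `undominated`, so none can be discarded automatically.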

User-directed AppLeS

[Figure: flowchart. Steps: generate request, process request, find work allocation, display triples, review triples, adjust request, execute on-line PT. Branch labels: feasible, infeasible, accepts one, rejects all.]

Triple Search

• Search parameter space

  – If triple satisfies constraints → feasible

• Constrained optimization problem based on soft real-time execution

  – compute constraint

  – transfer constraint

• Heuristics to reduce search space

  – e.g. assume user will always choose (1,2,1) over (1,2,4)
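The pruning heuristic above can be sketched as a search that stops at the smallest feasible su for each (f, r) pair. This is a minimal sketch, not the scheduler's actual search: `search_triples` is a hypothetical name, and `toy_feasible` is a made-up stand-in for the real compute and transfer constraint checks.

```python
def search_triples(f_values, r_values, su_values, is_feasible):
    """Return, for each (f, r), the feasible triple with the smallest su."""
    found = []
    for f in f_values:
        for r in r_values:
            for su in sorted(su_values):
                if is_feasible(f, r, su):
                    found.append((f, r, su))
                    break  # a larger su for the same (f, r) would never be chosen
    return found

def toy_feasible(f, r, su):
    # Made-up rule for illustration only: more reduction, fewer refreshes,
    # or more service units all make a triple easier to satisfy.
    return f * r * (su + 1) >= 4

print(search_triples([1, 2], [1, 2], [0, 1, 4], toy_feasible))
```

With this toy rule the search reports (1, 2, 1) and never evaluates (1, 2, 4).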

Work Allocation

[Figure: inputs to the work allocation. Labels: user constraints, compute constraints, transfer constraints, cpu availability, processor availability, ptomo-to-writer bandwidth, subnet-to-writer bandwidth.]

• Multiple mixed-integer programs → approximate solution
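To give a feel for what a work allocation and a compute constraint look like, here is a deliberately simplified stand-in. It is not the mixed-integer formulation named on the slide: it assumes a proportional split of slices by effective cpu speed, and every name, rate, and deadline in it is hypothetical. Transfer constraints are omitted.

```python
def allocate(num_slices, machines):
    """Split slices in proportion to each machine's effective speed.

    machines maps a name to (slices_per_second, cpu_availability in [0, 1]).
    """
    rates = {m: s * a for m, (s, a) in machines.items()}
    total = sum(rates.values())
    alloc = {m: int(num_slices * rate / total) for m, rate in rates.items()}
    # Hand out slices lost to rounding, fastest machines first.
    leftover = num_slices - sum(alloc.values())
    for m in sorted(rates, key=rates.get, reverse=True)[:leftover]:
        alloc[m] += 1
    return alloc

def meets_compute_constraint(alloc, machines, deadline_s):
    """True if every machine finishes its slices within the deadline."""
    return all(alloc[m] / (s * a) <= deadline_s
               for m, (s, a) in machines.items() if alloc[m] > 0)

machines = {"a": (10.0, 1.0), "b": (10.0, 0.5)}  # hypothetical hosts
alloc = allocate(30, machines)
print(alloc, meets_compute_constraint(alloc, machines, 2.0))
```

A triple would count as feasible only if some allocation passes checks of this kind for both computation and transfer.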

Experiments

• Impact of dynamic information on scheduler performance

• Usefulness of tunability in Grid environments

• Scheduling latency

Dynamic Information

• We fix the triple and let schedulers determine work allocation

                 Infinite bandwidth    Dynamic bandwidth
Dedicated cpu    wwa                   wwa+bw
Dynamic cpu      wwa+cpu               AppLeS

Simulation

• Evaluate schedulers

  – Repeatability

  – Long makespan

  – several resource environments

• Simgrid (Casanova [CCGrid’2001])

  – API for evaluating scheduling algorithms

    • tasks

    • resources modeled using traces

  – E.g. Parameter sweep applications [HCW’00]

• Simtomo

Performance Metric

• Relative refresh lateness

[Figure: diagram labeled expected refresh period, actual refresh period, and relative refresh lateness]
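The slide names the metric but the formula did not survive in this transcript. The sketch below shows one plausible reading, stated here as an assumption: lateness is the amount by which the actual refresh period exceeds the expected one (zero if the refresh is on time), divided by the expected period.

```python
def relative_refresh_lateness(expected_period_s, actual_period_s):
    """Assumed definition: overshoot of the refresh period, relative to the expected period."""
    lateness = max(0.0, actual_period_s - expected_period_s)
    return lateness / expected_period_s

print(relative_refresh_lateness(90.0, 135.0))  # refresh arrived 45 s late
print(relative_refresh_lateness(90.0, 80.0))   # refresh arrived on time
```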

NCMIR experiments

• Traces (8 machines)

  – 8 hour work day on March 8th, 2001 (8:00 am to 4:00 pm)

• Ran simulations throughout day at 10 minute intervals

Perfect Load Predictions

[Figure: plot comparing wwa, wwa+cpu, wwa+bw, and AppLeS. x-axis: hours since 3/8/2001 - 8:00 PST.]

Imperfect Load Predictions

[Figure: plot. x-axis: hours since 3/8/2001 - 8:00 PST.]

Synthetic Grids

• Bandwidth predictability

  – Average prediction error

  – pi ∈ {L, M, H}

  – p1 p2 p3

    • e.g. LMH

  – 27 types

  – 2510 Grids x 4 schedulers

  – 10,040 simulations

[Figure: Grid topology with cluster1, cluster2, cluster3, and writer]
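The counts on this slide can be checked with a few lines. The sketch below enumerates every label p1 p2 p3 with each pi drawn from L, M, H, and multiplies the number of Grids by the number of schedulers; the variable names are illustrative.

```python
from itertools import product

LEVELS = "LMH"  # low, medium, high

# One label per combination of predictability levels for the three links.
grid_types = ["".join(p) for p in product(LEVELS, repeat=3)]
num_simulations = 2510 * 4  # Grids x schedulers

print(len(grid_types), num_simulations)
```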

Relative Scheduler Performance

[Figure: bar chart of how often each scheduler (wwa, wwa+cpu, wwa+bw, AppLeS) ranked 1st, 2nd, 3rd, or 4th. Values shown: 705.89, 658.91, 127.10, 1.07.]

Partial Ordering

• Performance vs. bandwidth predictability

• Grid predictability

  – Partial orders using p1 p2 p3

  – Comparable/Not Comparable

    • e.g. HML is comparable to HLL

    • e.g. HLM is not comparable to LHM

Example Partial Order

• HHH, HHM, HMM, HLM, MLM, LLM, LLL

[Figure: plot with x-axis HHH, HHM, HMM, HLM, MLM, LLM, LLL]
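Both comparability examples are consistent with a component-wise order on the labels. The sketch below assumes that order, with L < M < H in each position; the slide does not state the definition explicitly, and the function names are hypothetical.

```python
RANK = {"L": 0, "M": 1, "H": 2}

def leq(a, b):
    """True if label a is no higher than label b in every position."""
    return all(RANK[x] <= RANK[y] for x, y in zip(a, b))

def comparable(a, b):
    return leq(a, b) or leq(b, a)

def is_chain(types):
    """True if each label is at least as high as the one after it, position by position."""
    return all(leq(later, earlier) for earlier, later in zip(types, types[1:]))

print(comparable("HML", "HLL"), comparable("HLM", "LHM"))
print(is_chain(["HHH", "HHM", "HMM", "HLM", "MLM", "LLM", "LLL"]))
```

Under this assumption the seven labels on the slide form a chain, so every pair among them is comparable.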

Tunability Experiments

• How useful is tunability?

  – variability

• Fixed topology

  – categorized traces

    • L, M, H

  – v1 v2 v3 v4 v5

  – 243 Grid types

[Figure: Grid topology with cluster1, cluster2, supercomputer, and writer]

Tunability Experiments

• Run over a 2 day period

  – back-to-back

  – assume single user model

    • f, r, su

• Set of triples chosen

  – T = {1,…,61}

Tunability Results

• Count how many times a triple changed per 2-day simulation

• e.g.

  – 12.9%

  – 25.7%

[Figure: plot; axis label: parameters]
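Counting triple changes amounts to comparing each chosen triple with the one chosen for the previous run. The sketch below is a minimal version with made-up data; `count_changes` and `change_rate` are hypothetical names, and the slide does not say exactly how its percentages were normalized.

```python
def count_changes(triples):
    """Number of consecutive runs whose chosen triple differs from the previous one."""
    return sum(1 for prev, cur in zip(triples, triples[1:]) if prev != cur)

def change_rate(triples):
    """Fraction of run-to-run transitions in which the triple changed."""
    transitions = len(triples) - 1
    return count_changes(triples) / transitions if transitions > 0 else 0.0

# Made-up sequence of chosen (f, r, su) triples for back-to-back runs.
chosen = [(1, 2, 1), (1, 2, 1), (2, 1, 0), (2, 1, 0), (1, 2, 1)]
print(count_changes(chosen), change_rate(chosen))
```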

Scheduling Latency

• Time to search for feasible triples

• e.g.

  – 88% under 1 sec

  – 63% under 1 sec

[Figure: plot; x-axis: seconds]

Conclusions and Future Work

• Grid-enabled version of on-line parallel tomography

  – Tunable application

    • Tunability is useful in Grid environments

  – User-directed AppLeS

    • Importance of bandwidth predictability

      – e.g. rescheduling

    • Scheduling latency is nominal

• Production use
