Parallel Visualization on Leadership Computing...
Transcript of Parallel Visualization on Leadership Computing...
ParallelVisualizationonLeadershipComputingResources
Oneapproachtomeetingtheincreasingdemandsforanalysisandvisualizationistoperformmoreofthesetasksonsupercomputerstraditionallyreservedforsimulations.Thiscanleadtoincreasedperformance,reducedcost,andtighterintegrationofanalysisandvisualizationincomputationalscience.Ourteamisdevelopingandoptimizingvisualization
algorithmstoscaletoarchitecturesatover10,000cores.Performanceisanalyzedforultrascaleapplications.
AlgorithmsParallelizingalgorithmssuchasvolumerenderingintopipelines,consistingeachofmanycores,maximizes
performanceonthesearchitectures.
TomPeterka,RobRoss(ANL),Han‐WeiShen(OSU),Kwan‐LiuMa(UCD),WesleyKendall(UTK),HongfengYu(SNL)
Analyses
ApplicationsFromastrophysicstoclimatemodeling,weareworkingone‐on‐onewithscientistsandtheirdatatomeettheir
large‐scalevisualizationrequirements.
ArchitecturesAtscalesoftensofthousandsofcores,visualizationalgorithmsneedtobetunedtospecificarchitectures,sowestudysystemsindetail,I/Oforexample.
AU.S.DepartmentofEnergylaboratorymanagedbyUChicagoArgonne,LLC
TheArgonneLeadershipComputingFacility’sIBMBlueGene/P(left)andtheOakRidgeLeadershipComputingFacility’sCrayXT4(right).ImagescourtesyALCFandNCCS.
Architectural diagramofthetheBlueGene/PI/Osystem
ComputenodesGatewaynodes
Commoditynetwork
Storagenodes
Enterprisestorage
BG/Ptree
5.1GB/s
Ethernet
10Gb/s
Infiniband
16Gb/s
SerialATA
3Gb/s
Volumerenderingofangularmomentumandstreamlinesofshockwavevectorfieldinsupernovadataset,courtesyofJohnBlondin.
Timelagbetweenfirstsnowfallandgreen‐upinMODISdataset,courtesyofNASA.
Parallelvolumerenderingconsistsofparallelreadingofthe
datasetintheI/Ostage,softwareraycastingin
therenderingstage,andblendingindividual
imagesinthecompositingstage.
ThecostofI/Oinrenderingatimeseriescanbemaskedbyvisualizingmultipletimestepsinparallelpipelines.Eachofthepipelinesbelowisfurtherparallelizedamongmultiplenodes.Theforwarderdaemonrunsontheloginnodeandserializesfinalresults.
Scalabilityoveravarietyofdata,image,andsystemsizes.
Therelativepercentageoftimeinthestagesofvolumerenderingasafunctionofsystemsize.
DifferentnumbersofprocessesperformingI/Ocanaffectoverallwritingperformance.
!
!
!
!
!
50 100 200 500 1000 2000 5000 20000
510
20
50
100
200
500
1000
Volume Rendering End!to!End Performance
Number of Processes
Tota
l F
ram
e T
ime (
s)
! 4480^3 data, 4096^2 image
2240^3 data, 2048^2 image
1120^3 data, 1024^2 image
Aggregateandcomponentresultsareanalyzedtodeterminebottlenecks.BecauseultrascalevisualizationisdominatedbyI/O,ourteamdevotesconsiderableefforttoitsstudy,bothfromsystemsandapplicationperspectives.