The Test and Evaluation Uses of Heterogeneous Computing Data Fusion 23 July 2010 Dan M. Davis...



Page 1: The Test and Evaluation Uses of Heterogeneous Computing Data Fusion 23 July 2010 Dan M. Davis ddavis@isi.edu (310)448-8434 Approved for public release;

The Test and Evaluation Uses of Heterogeneous Computing

Data Fusion

23 July 2010

Dan M. Davis

ddavis@isi.edu

(310)448-8434

Approved for public release; distribution is unlimited.

Page 2

Co-Authors

Prof. Robert F. Lucas and Gene Wagenbreth

Information Sciences Institute
University of Southern California
Marina del Rey, California 90292

{ rflucas, genew } @isi.edu

Page 3

Overview

• Configuration and Design Considerations

• GPU Training and Algorithmic Programming

• Current Contributions and Research Productivity

• Robustness and Utility of GPGPU Cluster

• Plans and Opportunities for Cluster

• Lessons Learned regarding GPGPUs

Page 4

Thesis

Heterogeneous Computing (GPGPUs, FPGAs, STI Cells, …) holds promise for the future

FMS and T&E have a need for HPC

In 2007 HPCMP awarded a 256-Node, GPGPU-Enhanced Linux Cluster, Joshua, to JFCOM

This asset has proven stable and useful

Many of the useful functions of GPGPUs will be applicable in the T&E community

Page 5

Joshua GPGPU-Enhanced Linux Cluster at JFCOM

J9/J7 Machine Room

Suffolk, Virginia

Page 6

JFCOM as a GPGPU User

● U.S. Joint Forces Command, Norfolk, Virginia
  • One of DoD’s combatant commands
  • Key role in transforming defense capabilities
  • Current JFCOM Commander: Gen James Mattis, USMC

● Two JFCOM Directorates using agent-based simulations
  • J7 - Training
    - trains forces
    - develops doctrine
    - leads training requirements analysis
    - provides an interoperable training environment
  • J9 - Concept Development and Experimentation
    - develops innovative joint concepts and capabilities
    - provides proven solutions to problems facing the joint force

● Simulations are typically GenSer Secret and characterized by:
  • Interactive use by hundreds of personnel
  • Distributed trans-continentally, but must be real time
  • Vast majority of users at the terminals are uniformed warfighters

Page 7

Simulation Federates

● Agent-based models use rules for entity behavior
  • Autonomous-agent entities
  • Can be Human-In-The-Loop (HITL) and run in real time
  • Large compute clusters required to run large-scale simulations

● Standard interface is HLA RTI communication (IEEE 1516)
  • Supplanted the older DIS
  • Publish/Subscribe model
  • USC/Caltech Software Routers scale better

● Common Codes in use at JFCOM:
  • Joint Semi-Automated Forces (JSAF)
  • “Culture”, stripped-down civilian instantiation of JSAF
  • Simulating Location & Attack of Mobile Enemy Missiles (SLAMEM)
  • OneSAF
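The publish/subscribe model named on this slide can be illustrated with a minimal in-process sketch. It is not the HLA RTI API and not the USC/Caltech software router design; the `Broker` class, the topic names, and the update fields are all hypothetical and chosen only to show the idea that a federate receives updates solely for the object classes it has subscribed to.

```python
from collections import defaultdict
from typing import Callable, Dict, List


class Broker:
    """Toy publish/subscribe broker (hypothetical; not the HLA RTI API)."""

    def __init__(self) -> None:
        # Maps a topic name to the callbacks registered for it.
        self._subscribers: Dict[str, List[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, topic: str, callback: Callable[[dict], None]) -> None:
        """Register interest in one topic."""
        self._subscribers[topic].append(callback)

    def publish(self, topic: str, update: dict) -> int:
        """Deliver an update to subscribers of the topic; return the delivery count."""
        callbacks = self._subscribers.get(topic, [])
        for callback in callbacks:
            callback(update)
        return len(callbacks)


broker = Broker()
received: List[dict] = []

# This federate only cares about ground vehicles (assumed topic name).
broker.subscribe("ground_vehicle", received.append)

delivered_vehicle = broker.publish("ground_vehicle", {"id": 7, "x": 10.0, "y": 4.5})
delivered_aircraft = broker.publish("aircraft", {"id": 9, "x": 0.0, "y": 0.0})
```

Because the aircraft update has no subscriber, it is never delivered; filtering of this kind is what keeps message traffic bounded as entity counts grow.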

Page 8

GPGPU Justification

NEED - 24x7x365 enhanced, distributed and scalable compute resources to enable joint warfighters at JFCOM … to develop, explore, test, and validate 21st century battlespace concepts … to enhance global-scale, computer-generated military experimentation by sustaining more than 2,000,000 entities on appropriate terrain with valid phenomenology.

APPROACH – Enable further growth in entity count, entity complexity, and environmental/infrastructure settings by employing a large Linux cluster with General Purpose GPUs (GPGPUs) on each node to aid in line-of-sight, route planning, and plume representation, all capable of running faster than real time.

CHALLENGES – Effectively implementing a hardware configuration that provides a stable and useful platform, motivating and training operators to utilize GPGPUs, and programming simulations to take advantage of GPGPUs.

Page 9

Cluster Configuration as Delivered

• 256 Nodes
  Nodes - (2) AMD Santa Rosa 2220 2.8 GHz dual-core processors
  GPUs - (1) NVIDIA 8800 Video Card
  Node Chassis - 4U chassis
  Memory - 16 GB DIMM DDR2 667 per node

• GigE Inter-node Communications

• Delivery to: Joint Advanced Tactics and Training Laboratory (JATTL) in Suffolk, VA

Page 10

Perspective: Entity Growth vs. Time

[Figure: number and complexity of JSAF entities (scale and fidelity) plotted against time. Labeled events: UE 98-1 (1997), SAF Express (1997), J9901 (1999), AO-00 (2000), JSAF/SPP Tests (2004), JSAF/SPP Urban Resolve (2004), JSAF/SPP Capability (2006), JSAF/SPP Joshua (2008). Entity counts shown on the chart: 3,600; 12,000; 107,000; 50,000; 250,000; 1,000,000; 1,400,000; 10,000,000. Platform annotations: SPP Proof of Principle DARPA / Caltech; DC Clusters at MHPCC & ASC MSRC; DHPI GPU-Enhanced Cluster.]

Experiments continue to require orders of magnitude larger & more complex battlespaces

Page 11

Why GPUs?

● GPU performance can be 100X that of hosts
  • Don’t forget Prof. Gene Amdahl: 2-3X is typical
  • This differential is expected to grow

● Early OneSAF work (UNC & SAIC)
  • Line of Sight
  • Route Finding
  • Collision Detection
  • Sparse Matrix Factorization (see RFLucas paper)

● ISI verified they’re also bottlenecks in JSAF

● New ideas for use in sensor scenario creation for new multi-spectral sensors
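The gap between a 100X kernel speedup and a 2-3X typical application speedup follows from Amdahl's law. The short sketch below works the arithmetic; the accelerated fractions are illustrative assumptions, not measurements from JSAF or OneSAF.

```python
def amdahl_speedup(accelerated_fraction: float, kernel_speedup: float) -> float:
    """Overall speedup when only part of the runtime is accelerated.

    Amdahl's law: S = 1 / ((1 - f) + f / s), where f is the fraction of the
    original runtime that is accelerated and s is the speedup of that part.
    """
    if not 0.0 <= accelerated_fraction <= 1.0:
        raise ValueError("accelerated_fraction must be between 0 and 1")
    if kernel_speedup <= 0.0:
        raise ValueError("kernel_speedup must be positive")
    serial_part = 1.0 - accelerated_fraction
    return 1.0 / (serial_part + accelerated_fraction / kernel_speedup)


# Assumed fractions of runtime spent in GPU-friendly kernels (illustrative only).
for fraction in (0.5, 0.6, 0.7, 0.9):
    overall = amdahl_speedup(fraction, kernel_speedup=100.0)
    print(f"fraction={fraction:.1f}  overall speedup={overall:.2f}X")
```

With a 100X kernel, accelerating between half and roughly two thirds of the runtime yields an overall speedup of about 2X to 3X, which matches the "2-3X typical" figure on the slide.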

Page 12

Route Planning Performance Impact

Time Spent in Route Planning is a Critical Bottleneck
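To give a sense of the work a route planner does per entity, here is a minimal sketch of shortest-path search on a terrain cost grid using Dijkstra's algorithm. It is not the JSAF route planner or the GPU kernel described in the I/ITSEC paper; the grid representation, the `plan_route` name, and the use of `None` for impassable cells are assumptions made for illustration.

```python
import heapq
from typing import Dict, List, Optional, Tuple

Cell = Tuple[int, int]


def plan_route(cost: List[List[Optional[float]]], start: Cell, goal: Cell) -> Optional[List[Cell]]:
    """Return the cheapest 4-connected path from start to goal, or None if unreachable.

    cost[r][c] is the cost of entering a cell; None marks an impassable cell.
    """
    rows, cols = len(cost), len(cost[0])
    best: Dict[Cell, float] = {start: 0.0}
    parent: Dict[Cell, Cell] = {}
    frontier: List[Tuple[float, Cell]] = [(0.0, start)]

    while frontier:
        dist, cell = heapq.heappop(frontier)
        if cell == goal:
            # Walk the parent links back to the start to recover the path.
            path = [cell]
            while cell in parent:
                cell = parent[cell]
                path.append(cell)
            return path[::-1]
        if dist > best.get(cell, float("inf")):
            continue  # Stale queue entry; a cheaper route was already found.
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and cost[nr][nc] is not None:
                new_dist = dist + cost[nr][nc]
                if new_dist < best.get((nr, nc), float("inf")):
                    best[(nr, nc)] = new_dist
                    parent[(nr, nc)] = cell
                    heapq.heappush(frontier, (new_dist, (nr, nc)))
    return None


# A wall (None) forces the route around the right-hand side of the grid.
terrain = [
    [1.0, 1.0, 1.0],
    [None, None, 1.0],
    [1.0, 1.0, 1.0],
]
route = plan_route(terrain, (0, 0), (2, 0))
```

Each search touches many cells, and a large simulation issues such searches for many entities at once, which is one plausible reason the kernel dominates runtime.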

Page 13

Early GPU Programming

● Trained ISI staff with Sparse Matrix Solver

● Then examined JSAF kernels
  • Line-of-sight
  • Illumination
  • Route planning

● Route planning appeared easiest to integrate

● Route planning work published at I/ITSEC

● For this and other papers, see: http://www.isi.edu/~ddavis/JESPP/JESPP_Papers.html

Page 14

CUDA Training PET Courses

● Dr. David Pratt conceived and organized
  • HPCMP FAPOC for FMS

● Location & Dates:
  • SAIC facility Suffolk VA, 23 - 25 October 2007
  • ISI Marina del Rey 21 - 23 October 2008
  • UCSD San Diego 5 - 6 March 2009

● Attendees: total ~ 60 HPCMP users

● Also taught at USC as part of Parallel Programming Class

Page 15

Views of CUDA Classes in Suffolk, Virginia

Page 16

Typical Problem at JFCOM

The Joint Force Commander (JFC) needs to integrate and focus collection assets for persistent surveillance.

Joint Integrated Persistent Surveillance (JIPS) Goals:
• Improving and integrating system
• Developing Tactics, Techniques and Procedures (TTPs)
• Improving all of the Concept of Operations (CONOPS)
• Maximizing tipping, cueing and communications
• Using sensors to achieve persistence
• Improving doctrine, organization, and TTPs
• Enabling JFC to better command and support operations by:
  (1) effective capability apportionment and management
  (2) timely and responsive analytic support
  (3) fast, reliable tactical Command, Control, Communications (C3)
• Enhancing use, coordination and optimization of ISR assets

Page 17

JIPS User Interface

Page 18

Experimental Schedule

Page 19

Benefits of GPGPU Computing

Joshua has provided many benefits; some are not easily quantified

Training, analysis or evaluation in cities otherwise off-limits due to:
• security issues
• public resistance to combat troops in their city
• diplomatic concerns about U.S. interest in cities of potential conflict

Joshua does save personnel costs, e.g. an Army Division costs ~$20M per day.

The DHPI cluster can run such a program using only ~100 technicians. Cost saving may be ~$19.5M each day.

Good visibility with the leadership elite:
• Congressional visits
• Lieutenant General noted that it was probably the only time in his career he would have an opportunity to command so large a unit

1,500 soldiers across the country participated, all connected by DREN to the cluster Joshua in Suffolk.

Page 20

GPGPU Technical Merit

● All challenges in the proposal fully met

● Joshua remains deployed and in service

● Two million entity goal exceeded (by factor of five!)

● Capability of GPU demonstrated
  • Developers trained to use GPUs
  • Route planning kernel implemented
  • Other research underway

● Joshua has changed the J9 culture
  • New code being developed using client/server model
  • J9 leadership now have ownership stake in HPC concepts

Page 21

GPGPU Computational Merit

● JFCOM FMS requirements are uniquely military
  • Modeling of DoD operations in urban terrain
  • Users are most often uniformed warfighters
  • Recipients of research benefits are in the field today

● Needed for a large, heterogeneous ensemble of SAFs

● Cluster provides stability and mesh provides utility

● Nationally recognized research challenges
  • Scalable interest management to bound messages
  • Scaling individual behavior models
  • Mining distributed data logs to analyze results
  • More than 31 papers in competitive conferences/journals

Page 22

GPGPU Current Progress

● Deployed and accepted at JFCOM

● In use on all major J9 experiments

● In use daily during development spirals for events

● Exceeded technical goal of hosting 2M entities

● Classification issues led to partitioning

● Joshua is now fully engaged in day-to-day simulation experiments at JFCOM
  • Running ensembles of SLAMEM simulations

● Ops-tempo was expected to continue and increase
  • Human-in-the-loop experiments in FY10

Page 23

Summary Appropriateness

● Dedicated system was required
  • classified
  • interactive use
  • development not amenable to batch processing

● Linux cluster
  • users have adapted easily and use constantly
  • design and use based on experience with DC clusters
  • current SAFs need only low-cost GigE network

● Joshua has met JFCOM’s requirements
  • in service creating data for JIPS
  • available for new directions

Page 24

New Capabilities for T&E

Paper in Real Time Hyper-Spectral

Other T&E Uses
• Any line of sight calculations
• Equation-based CFD
• Signals Processing
• Matrix multiply
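As an illustration of the first item, a line-of-sight test over a terrain height grid can be sketched as follows. This is a simplified nearest-cell sampling approach, not the JSAF or OneSAF kernel; the function name, the `eye_height` default, and the sampling density are assumptions.

```python
from typing import List, Tuple


def has_line_of_sight(
    heights: List[List[float]],
    observer: Tuple[int, int],
    target: Tuple[int, int],
    eye_height: float = 2.0,
    samples_per_cell: int = 4,
) -> bool:
    """Return True if no terrain cell rises above the sight line between two cells.

    The sight line runs from eye_height above the observer's cell to
    eye_height above the target's cell and is sampled at evenly spaced points.
    """
    (r0, c0), (r1, c1) = observer, target
    z0 = heights[r0][c0] + eye_height
    z1 = heights[r1][c1] + eye_height
    steps = max(abs(r1 - r0), abs(c1 - c0)) * samples_per_cell

    for i in range(1, steps):
        t = i / steps
        r = round(r0 + t * (r1 - r0))
        c = round(c0 + t * (c1 - c0))
        if (r, c) == (r0, c0) or (r, c) == (r1, c1):
            continue  # The endpoint cells themselves never block the view.
        sight_z = z0 + t * (z1 - z0)
        if heights[r][c] > sight_z:
            return False
    return True


flat = [[0.0, 0.0, 0.0, 0.0, 0.0]]
ridge = [[0.0, 0.0, 10.0, 0.0, 0.0]]

visible_on_flat = has_line_of_sight(flat, (0, 0), (0, 4))
visible_over_ridge = has_line_of_sight(ridge, (0, 0), (0, 4))
```

Each observer-target pair is evaluated independently of every other pair, so a large batch of such tests is a natural candidate for data-parallel hardware such as a GPGPU.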

Page 25

Research Funded by JFCOM and AFRL

This material is based on research sponsored by the U.S. Joint Forces Command via a contract with the Lockheed Martin Corporation and SimIS, Inc., and on research sponsored by the Air Force Research Laboratory under agreement numbers F30602-02-C-0213 and FA8750-05-2-0204. The U.S. Government is authorized to reproduce and distribute reprints for Governmental purposes notwithstanding any copyright notation thereon. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of the U.S. Government. Approved for public release; distribution is unlimited.