WorldScape Defense Company, L.L.C. Company Proprietary Slide 1 An Ultra-High Performance Scalable...
WorldScape Defense Company, L.L.C. Company Proprietary Slide 1
An Ultra-High Performance Scalable Processing Architecture
for HPC and Embedded Applications
Presentation for
IPDPS Conference
28 April 2004
Slide 2
CS301 Up Close
Multi-Threaded Array Processor
• 25.6 GFLOPS
• 3W worst-case, 2W typical
• 200MHz
• 64 PEs, 4 Kbytes each
[Block diagram labels: PE Array, Control, SRAM, Bus]
ClearConnect bus
• 64-bit full duplex
• 1.6 Gbyte/s each direction
• 2x 0.8-Gbyte/s bridge ports
Scratchpad memory
• 128 Kbytes of SRAM
Availability
• Currently available
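The bus and memory figures on this slide can be cross-checked with simple arithmetic. The sketch below is illustrative only: it assumes the ClearConnect bus moves one 64-bit word per cycle at the 200MHz core clock (the slide does not state the bus clock), and all variable names are made up for this example.

```python
# Cross-check of the CS301 figures quoted on the slide.
# ASSUMPTION: the bus moves one 64-bit word per cycle at the 200 MHz
# core clock; the slide does not state the bus clock.

CLOCK_HZ = 200_000_000          # 200MHz (from the slide)
BUS_WIDTH_BITS = 64             # 64-bit full duplex (from the slide)
NUM_PES = 64                    # 64 PEs (from the slide)
PE_MEMORY_KBYTES = 4            # 4 Kbytes each (from the slide)
BRIDGE_PORTS = 2                # 2x bridge ports (from the slide)
BRIDGE_PORT_GBYTES_PER_S = 0.8  # 0.8 Gbyte/s per port (from the slide)

# Bytes per second in one direction, expressed in Gbyte/s.
bus_gbytes_per_s = CLOCK_HZ * BUS_WIDTH_BITS / 8 / 1e9

# Local PE memory summed over the whole array.
total_pe_memory_kbytes = NUM_PES * PE_MEMORY_KBYTES

# Combined bandwidth of the two bridge ports.
bridge_total_gbytes_per_s = BRIDGE_PORTS * BRIDGE_PORT_GBYTES_PER_S

print(f"bus bandwidth per direction: {bus_gbytes_per_s:.1f} Gbyte/s")
print(f"total PE memory: {total_pe_memory_kbytes} Kbytes")
print(f"combined bridge bandwidth: {bridge_total_gbytes_per_s:.1f} Gbyte/s")
```

Under that clock assumption, the computed per-direction bandwidth agrees with the quoted 1.6 Gbyte/s, the two bridge ports together match one bus direction, and the 64 PE memories total 256 Kbytes alongside the 128 Kbytes of scratchpad SRAM.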
Slide 3
Multi-Threaded Array Processing Architecture
Multi-threaded Array Processor
• Fully programmable in C
• Hardware multi-threading
• Extensible instruction set
Scalable internal parallelism
• Array of Processing Elements (PEs)
• Compute, bandwidth scale together
• From 10s to 1,000s of PEs
• Built-in PE redundancy
High performance, low power
• ~10 GFLOPS/Watt
Multiple high speed I/O channels
Slide 4
Processing Elements
PEs are highly optimised execution units:
• ALU, MAC, FPU
• High-bandwidth, multiport register file
• High bandwidth per PE DMA (PIO, SIO)
• Closely coupled SRAM for data
64 PEs at 200MHz
• 25.6 GFLOPS
• 51.2 Gbyte/s bandwidth to PE memory
• 12,800 MIPS
Supports multiple data types:
• 8, 16, 24, 32-bit, ... fixed-point arithmetic
• 32-bit IEEE floating-point arithmetic
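The aggregate figures for 64 PEs at 200MHz can be broken down into per-PE, per-cycle rates. This is a rough sketch of that division, with variable names invented for illustration; the interpretation in the note after the code is an inference, not something the slide states.

```python
# Per-PE, per-cycle rates implied by the aggregate figures on the slide.

NUM_PES = 64                    # from the slide
CLOCK_MHZ = 200                 # from the slide
PEAK_GFLOPS = 25.6              # from the slide
PE_MEMORY_GBYTES_PER_S = 51.2   # from the slide
PEAK_MIPS = 12_800              # from the slide

# Total PE cycles per second across the whole array.
pe_cycles_per_s = NUM_PES * CLOCK_MHZ * 1e6

flops_per_pe_cycle = PEAK_GFLOPS * 1e9 / pe_cycles_per_s
bytes_per_pe_cycle = PE_MEMORY_GBYTES_PER_S * 1e9 / pe_cycles_per_s
instructions_per_pe_cycle = PEAK_MIPS * 1e6 / pe_cycles_per_s

print(f"floating-point operations per PE per cycle: {flops_per_pe_cycle:.2f}")
print(f"bytes of PE memory traffic per PE per cycle: {bytes_per_pe_cycle:.2f}")
print(f"instructions per PE per cycle: {instructions_per_pe_cycle:.2f}")
```

The division gives 2 floating-point operations, 4 bytes, and 1 instruction per PE per cycle. One plausible reading is that a multiply-accumulate is counted as two operations and that each PE moves one 32-bit word per cycle to or from its local SRAM, but the slide does not spell out how the peak figures are counted.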
Slide 5
ClearConnect™ High-Speed Bus
Lanes from 25 to 100 Gbit/s full duplex
• Packet switched architecture
• Scales to 4 lanes per bus
• Lane widths: 32 to 256-bit
• Distributed arbitration
• Low power
• Highly flexible
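The lane figures above suggest a range of aggregate bus bandwidths. The sketch below assumes that lane bandwidths simply add when a bus is built from several lanes, which the slide implies but does not state; the function name and constants are hypothetical.

```python
# Aggregate ClearConnect bus bandwidth for a given lane configuration.
# ASSUMPTION: lanes contribute bandwidth additively; the slide only says
# a bus "scales to 4 lanes" with lanes of 25 to 100 Gbit/s full duplex.

MIN_LANE_GBITS_PER_S = 25    # slowest lane (from the slide)
MAX_LANE_GBITS_PER_S = 100   # fastest lane (from the slide)
MAX_LANES_PER_BUS = 4        # lane limit (from the slide)


def bus_gbits_per_s(lane_gbits_per_s: float, lanes: int) -> float:
    """Return the per-direction bus bandwidth in Gbit/s."""
    if not 1 <= lanes <= MAX_LANES_PER_BUS:
        raise ValueError(f"lanes must be between 1 and {MAX_LANES_PER_BUS}")
    if not MIN_LANE_GBITS_PER_S <= lane_gbits_per_s <= MAX_LANE_GBITS_PER_S:
        raise ValueError("lane bandwidth outside the range quoted on the slide")
    return lane_gbits_per_s * lanes


smallest_bus = bus_gbits_per_s(MIN_LANE_GBITS_PER_S, 1)
largest_bus = bus_gbits_per_s(MAX_LANE_GBITS_PER_S, MAX_LANES_PER_BUS)

# Convert Gbit/s to Gbyte/s for comparison with byte-based figures.
print(f"smallest bus: {smallest_bus} Gbit/s ({smallest_bus / 8} Gbyte/s)")
print(f"largest bus:  {largest_bus} Gbit/s ({largest_bus / 8} Gbyte/s)")
```

Under the additive assumption, a bus spans 25 Gbit/s (one slow lane) to 400 Gbit/s (four fast lanes) in each direction. The slide does not give lane clock rates, so no mapping from lane width to bandwidth is attempted here.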
Slide 7
Off-the-Shelf Products
CS301 64 PE chip
- 2W, 25 GFLOPS
- Hardware Development Support
Fully functional SDK
- Application Support
- Software Libraries
Dual 64 PCI Development Board
- 50 GFLOPS performance
- Acceleration for clusters and HPC applications
- Development environment for embedded applications
- Growing catalog of software application libraries
- Scalable with robust evolution path
Slide 8
Systems Integration Examples
• PC plug-in accelerator
• Coprocessors in a PC server*
• Coprocessors in a blade server*
• Silver Fox**
[Figure labels: COTS hardware; Algorithm development for embedded applications]
*Images courtesy of Angstrom Microsystems
**Image courtesy of Office of Naval Research
Slide 9
WorldScape’s Offering
Chip Technology
- 64 PE/256 PE…
- customizable…
Support Tools
- SDK, VSIPL, PCA morphware…
Board Level Integration
- custom, I/O, i/f, …
Application Integration
- FFT, PC, HSI, SceneServer …