S8901 – Quadro for AI, VR and Simulation
Carl Flygare, PNY, Quadro Product Marketing Manager
Allen Bourgoyne, NVIDIA, Senior Product Marketing Manager
“The question of whether a computer can think is no more interesting than the question of whether a submarine can swim.”
Edsger Dijkstra
Intelligence Abounds in Nature | A very small sampling
Technological Intelligence | Homo sapiens’ essential differentiator
Thalamocortical brain network
3 million neurons, 476 million synapses
Full human brain
106 billion neurons, 1,000 trillion synapses
Artificial Intelligence: Where We Stand Today | Google’s IQ is slightly below a six-year-old human’s
Google 47.28 | 78.42% increase since 2014
Baidu 32.92 | 40.08% increase since 2014
Microsoft Bing 31.98
Apple Siri 23.90
Source: http://www.zdnet.com/article/google-ai-vs-siri-vs-bing-iq-tests-show-one-is-smartest-by-a-mile/
AI IQs are significantly lower than the average score of 97 for an 18-year-old
In 2014, two of the three researchers found that Google had an IQ of 26.5, compared to Baidu’s 23.5
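The percentage increases quoted for Google and Baidu follow from the 2014 and current scores. A minimal Python sketch of that arithmetic, using only the numbers on this slide (the function and variable names are illustrative):

```python
def percent_increase(old: float, new: float) -> float:
    """Return the relative increase from old to new, in percent."""
    return (new - old) / old * 100.0

# Scores quoted on the slide: (2014 score, current score).
scores = {"Google": (26.5, 47.28), "Baidu": (23.5, 32.92)}

increases = {name: round(percent_increase(old, new), 2)
             for name, (old, new) in scores.items()}
print(increases)
```

The results agree with the quoted 78.42% and 40.08% to within rounding of the published scores.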
NVIDIA Quadro | Every segment benefits from AI, VR and simulation
Manufacturing, CAE, Media and Entertainment, Automotive
AEC, Energy (Oil and Gas), Scientific and Technical, Healthcare
NVIDIA Quadro | AI, VR and Simulation Open New Possibilities
Product tiers: Entry, Basic, Mid Range, Upper Range, High End, Ultra High End
P400 2 GB, P620 2 GB | Small and Simple CAD Models, Entry PLM
P1000 4 GB, P2000 5 GB | Medium Size and Complexity CAD Models, PLM, Basic DCC, Medical Imaging
P4000 8 GB | Professional VR, Complex CAD Models, CAE, Photorealistic Rendering, Complex DCC and VFX, Medical Imaging
P5000 16 GB | Professional VR, Very Complex CAD Models, CAE, Photorealistic Rendering, Advanced DCC and VFX, 3D Medical Imaging
P6000 24 GB | Collaborative VR, Extremely Complex CAD Models, CAE, Photorealistic Rendering, DCC and VFX, Seismic Exploration, 3D Medical Imaging
GP100 16 GB, GV100 32 GB | AI (Deep Learning) Development, Collaborative VR, CAE Simulations, Ultimate CAD Models, Photorealistic Rendering and GPGPU Compute
NVIDIA Quadro GP100 / NVIDIA Quadro GV100 | Reinventing the Workstation for AI
NVIDIA Quadro GP100 / NVIDIA Quadro GV100 x 2 | NVLink Scalable Workstation AI
NVIDIA Quadro GV100 and NVLink | Scaling performance and memory*
*Application support for NVLink required. Maximum of two GV100 boards can be connected with NVLink.
High speed GPU and memory connection for GV100
▪ NVLink combines two GV100s for twice the compute power and 64 GB of memory
▪ Up to 200 GB/sec bidirectional bandwidth, 25% improvement
▪ Used in pairs, two dedicated NVLink connectors on GV100 boards
▪ Provides SLI functionality for GV100 boards
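To put the NVLink figure in context, the sketch below estimates the ideal time to copy one board's worth of memory between two GV100s. It is a back-of-the-envelope calculation, not a measurement: the 200 GB/sec bidirectional figure is read as 100 GB/sec in each direction, and the PCI Express Gen 3 x16 rate of roughly 15.75 GB/sec per direction is an assumption that does not come from the slides.

```python
def transfer_seconds(size_gb: float, one_way_gb_per_sec: float) -> float:
    """Ideal time to move size_gb over a link running at its peak one-way rate."""
    return size_gb / one_way_gb_per_sec

# 200 GB/sec bidirectional is taken as 100 GB/sec in each direction.
NVLINK_ONE_WAY = 200.0 / 2
# Assumption (not from the slides): PCIe Gen 3 x16 peaks near 15.75 GB/sec per direction.
PCIE3_X16_ONE_WAY = 15.75

full_board_gb = 32.0  # HBM2 capacity of one GV100
nvlink_s = transfer_seconds(full_board_gb, NVLINK_ONE_WAY)
pcie_s = transfer_seconds(full_board_gb, PCIE3_X16_ONE_WAY)
print(f"NVLink: {nvlink_s:.2f} s, PCIe Gen 3 x16: {pcie_s:.2f} s")
```

Both numbers are lower bounds; real transfers depend on application support for NVLink, as the footnote on this slide points out.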
NVIDIA Quadro GV100 | Technical specifications
GPU Architecture: Volta
CUDA and Tensor Cores: 2560 (FP64), 5120 (FP32), 640 (Tensor)
Memory Capacity: 32 GB HBM2
Peak Memory Bandwidth: 870 GB/sec
FP64 (Double Precision): 7.4 TFLOPS | 42% improvement
FP32 (Single Precision): 14.8 TFLOPS | 44% improvement
FP16 (Half Precision): 118.5 TFLOPS (Matrix Multiply with FP16 or FP32 Accumulate)
INT8 (Integer): 59.3 TOPS | 26% improvement
System Interface: PCI Express Gen 3 x16
NVLink: 200 GB/sec Bidirectional | 25% improvement
Display Connectors: 4x DisplayPort 1.4 with HDCP 2.2
4K Display Support: 4x 4096 x 2160 at 120 Hz with HDR
5K Display Support: 4x 5120 x 2880 at 60 Hz with HDR
8K Display Support: 2x 7680 x 4320 at 60 Hz with HDR
VR Ready and Stereo: Yes, Stereo via 3-pin mini-DIN Connector Bracket
NVIDIA Quadro GV100 | Unmatched compute capabilities
FP64 | 7.4 TFLOPS
FP32 | 14.8 TFLOPS
FP16 | 118.5 TFLOPS
INT8 | 59.3 TOPS
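The peak figures on these slides are consistent with the usual "units x operations per clock x clock rate" estimate. The sketch below works backwards from the FP32 figure to an implied clock, then checks the FP64 and Tensor Core numbers against it. The slides do not state a clock rate, and the 64 fused multiply-adds per Tensor Core per clock is an assumption taken from the published Volta architecture description, so this is an illustration and not an official derivation.

```python
CUDA_FP32_CORES = 5120
CUDA_FP64_CORES = 2560
TENSOR_CORES = 640
FP32_PEAK_TFLOPS = 14.8  # from the technical specifications slide

FLOPS_PER_FMA = 2  # one fused multiply-add counts as two floating-point operations
# Assumption: each Volta Tensor Core performs a 4x4x4 matrix FMA (64 FMAs) per clock.
FMAS_PER_TENSOR_CORE = 64

# Clock implied by the FP32 figure (the slides do not state one).
implied_clock_ghz = FP32_PEAK_TFLOPS * 1e12 / (CUDA_FP32_CORES * FLOPS_PER_FMA) / 1e9

def peak_tflops(units: int, fmas_per_unit: int, clock_ghz: float) -> float:
    """Peak throughput in TFLOPS for `units` execution units at `clock_ghz`."""
    return units * fmas_per_unit * FLOPS_PER_FMA * clock_ghz * 1e9 / 1e12

fp64 = peak_tflops(CUDA_FP64_CORES, 1, implied_clock_ghz)
tensor = peak_tflops(TENSOR_CORES, FMAS_PER_TENSOR_CORE, implied_clock_ghz)
print(f"implied clock ~{implied_clock_ghz:.3f} GHz, "
      f"FP64 ~{fp64:.1f} TFLOPS, Tensor ~{tensor:.1f} TFLOPS")
```

Under these assumptions the FP64 result matches the quoted 7.4 TFLOPS and the Tensor Core result lands within rounding of the quoted 118.5 TFLOPS.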
NVIDIA Quadro GV100 | Features and benefits relative to GP100
Feature | GP100 | GV100 | Benefit
GPU Architecture | Pascal | Volta | Most powerful, efficient and AI optimized GPU
CUDA Cores | 3584 | 5120 | Significantly greater compute and rendering performance
FP64 Performance | 5.2 TFLOPS | 7.4 TFLOPS | 1.4x greater FP64 compute performance
Memory Size | 16 GB HBM2 | 32 GB HBM2 | 2.0x memory capacity
Memory Bus Width | 4096-bit | 4096-bit | Radically advanced memory bus implementation
Peak Memory Bandwidth | 717 GB/sec | 870 GB/sec | Move data to and from GPU 1.2x faster
Display Support | 4x DP 1.4 + 1x DVI-D DL | 4x DP 1.4 and HDCP 2.2 | Supports four 4K or 5K displays or two 8K displays, latest HDCP
HDR Image Support | Yes | Yes | More lifelike images
Advanced Display | Quadro Sync II | Quadro Sync II | Synchronize up to 8 GPUs per system
VR Ready | Yes | Yes | GV100 implements full suite of hardware optimizations
NVLink | NVLink (First Generation) | NVLink (Second Generation) | Higher performance means lower latency
Board Power | 235 W | 250 W | Better performance per Watt
Auxiliary Power Connector | 8-pin PCIe | 8-pin PCIe | Simplified power supply connectivity
Form Factor | 4.4” H x 10.5” L Dual Slot | 4.4” H x 10.5” L Dual Slot | No significant mechanical or thermal changes
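The "Benefit" multipliers in this comparison can be reproduced from the GP100 and GV100 columns. The sketch below computes the ratios and a simple FP64-per-watt figure; it uses the 7.4 TFLOPS FP64 value from the GV100 technical specifications, and the dictionary keys are illustrative names, not NVIDIA terminology.

```python
gp100 = {"cuda_cores": 3584, "fp64_tflops": 5.2, "memory_gb": 16,
         "bandwidth_gb_s": 717, "board_power_w": 235}
gv100 = {"cuda_cores": 5120, "fp64_tflops": 7.4, "memory_gb": 32,
         "bandwidth_gb_s": 870, "board_power_w": 250}

# GV100 value divided by GP100 value for every shared specification.
ratios = {key: gv100[key] / gp100[key] for key in gp100}

# FP64 throughput per watt of board power, GV100 relative to GP100.
perf_per_watt_gain = ratios["fp64_tflops"] / ratios["board_power_w"]

for key, value in ratios.items():
    print(f"{key}: {value:.2f}x")
print(f"fp64 per watt: {perf_per_watt_gain:.2f}x")
```

The FP64, memory and bandwidth ratios round to the 1.4x, 2.0x and 1.2x shown in the comparison. The per-watt figure is a peak-rate estimate only, since board power is a limit and not a measured draw.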
NVIDIA Quadro GV100 | Redefines state of the art across essential solutions
Artificial Intelligence
Tensor processor cores
NVIDIA GPU deep learning stack
ISV DL and ML framework optimization
Iterate and innovate faster
Reduce training time
RTX Rendering
Unrivaled FP32 performance
Largest models in GPU memory
AI accelerated photorealistic rendering
Neural network character animation
Apply AI to simultaneous video streams
Compute
Industry leading HPC capabilities
Work with largest datasets
Integrate simulation into design process
Utilize generative design algorithms
Fastest FEA, CFD, CEM available
Immersive Visualization (VR)
Includes VR hardware optimizations
Full NVIDIA VRWORKS support
Create new AI-augmented technologies
Visualize the largest datasets
Collaborative VR environments (Holodeck)
Connect two GV100 boards with NVLink to provide 64 GB of memory and twice the GPU processing power in standard workstation enclosures
NVIDIA Quadro GV100 | RTX rendering lets you dream and create at the speed of thought
Architectural Design
Visualize cities or urban street scenes in every photorealistic detail
Product Design
Design with physically based lights and materials in realtime
Media and Entertainment
Perfect every shot with GPU accelerated and AI enhanced rendering
Work at full fidelity, utilizing massive datasets with 2x larger memory capacity
Master rendering projects interactively with AI (Deep Neural Network) technology
NVIDIA Quadro | RTX supercharges rendering with AI accelerated denoising
Denoising On | 20 Frames
Denoising Off | 20 Frames
Denoising Off | 290 Frames
High quality results with fluid visual interactivity throughout the design process
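The three renders on this slide compare 20 accumulated frames, with and without denoising, against 290 frames without it. The sketch below quantifies that gap. It assumes, beyond what the slide states, that the renderer is a progressive Monte Carlo path tracer, where noise falls with the square root of the number of accumulated frames.

```python
import math

FRAMES_DENOISED = 20     # frames accumulated in the two 20-frame renders
FRAMES_REFERENCE = 290   # frames accumulated in the long render without denoising

# How many times more frames the long render needs.
frame_ratio = FRAMES_REFERENCE / FRAMES_DENOISED

# Assumption: Monte Carlo noise (standard deviation) scales as 1 / sqrt(frames).
noise_reduction = math.sqrt(frame_ratio)

print(f"{frame_ratio:.1f}x more frames, about {noise_reduction:.1f}x less noise without a denoiser")
```

Read this way, brute-force accumulation needs 14.5 times the frames for roughly a 3.8x noise reduction, which is the cost the AI denoiser is meant to avoid.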
NVIDIA Quadro | Companies working with NVIDIA’s OptiX AI denoiser technology
Image courtesy of Isotropix, rendered with Clarisse and denoised with NVIDIA OptiX.
NVIDIA Quadro | CAD and CAE workflow elements
Design (CAD), Pre-Processing, Simulation (CAE), Post-Processing
NVIDIA Quadro GV100 | Benefit from the ultimate immersive experiences
RTX Rendered Graphics, Interactive Physics, GPU-Accelerated AI, Realtime Collaboration
2x larger memory capacity lets you work with high fidelity, massive datasets (vs. GP100)
Benefit from unconstrained Holodeck experiences with full-featured VR performance and capabilities
NVIDIA Quadro GV100 | Realize new opportunities with AI
32 GB or 64 GB capacity (NVLink) trains neural networks with massive datasets
Develop with NVIDIA optimized Deep Learning frameworks and deploy with NGC interoperability and scalability
Accelerate AI training and inferencing on workstations with Tensor cores and NVLink
NGC
Retail store inferencing with Quadro by DeepBlue Technology, China
Development Aggregation Inferencing At-The-Edge
NVIDIA Quadro GV100 AI Training Performance | Up to 2x improvement in Deep Learning training performance*
*Based on TensorFlow ResNet-50 Training. Tests run on dual Intel Xeon E5 2690 v4 at 2.6 GHz, NVIDIA driver version 390.19.
Chart: TensorFlow ResNet-50 Training IPS (axis 100 to 700) | GP100 Batch Size 256, GV100 Batch Size 256, GV100 Batch Size 512
Chart: Caffe ResNet-50 Training IPS (axis 100 to 800) | GP100 Batch Size 128, GV100 Batch Size 128, GV100 Batch Size 256
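The training charts report IPS, read here as images per second: batch size multiplied by the number of training steps, divided by elapsed time. A minimal sketch of such a measurement follows. The `fake_training_step` function is a hypothetical stand-in for one framework training iteration and is not part of TensorFlow or Caffe.

```python
import time
from typing import Callable

def images_per_second(step: Callable[[], None], batch_size: int,
                      steps: int = 20, warmup: int = 5) -> float:
    """Time `steps` calls to `step` and convert the result to images per second."""
    for _ in range(warmup):  # let caches and lazy initialisation settle first
        step()
    start = time.perf_counter()
    for _ in range(steps):
        step()
    elapsed = time.perf_counter() - start
    return batch_size * steps / elapsed

def fake_training_step() -> None:
    """Hypothetical stand-in for one training iteration on a batch."""
    time.sleep(0.001)

ips = images_per_second(fake_training_step, batch_size=256)
print(f"{ips:.0f} images/sec")
```

On a real GPU the work is launched asynchronously, so a harness like this would need to synchronize with the device before reading the clock. Larger batch sizes, such as the 512 that fits in the GV100's 32 GB, typically raise this figure by amortizing per-step overhead.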
NVIDIA Quadro GV100 Deep Learning Training Performance | Over 2x improvement in Deep Learning training and inference performance*
*Based on TensorFlow ResNet-50 Training and TensorRT ResNet-50 Inference tests. Tests run on dual Intel Xeon E5 2690 v4 at 2.6 GHz, NVIDIA driver version 390.19.
Chart: TensorRT ResNet-50 Inference (axis 100 to 700) | Batch Size 1, 2, 4, 8
Chart: TensorFlow ResNet-50 Training (axis 100 to 700) | GP100 Batch Size 256, GV100 Batch Size 256, GV100 Batch Size 512
NVIDIA Quadro GV100 Scientific Compute Performance | More than 2x improvement over the previous generation*
*Based on the LAMMPS molecular modeling benchmark. Tests run on dual Intel Xeon E5 2690 v4 at 2.6 GHz, NVIDIA driver version 390.19.
Chart: LAMMPS Atomic Fluid Benchmark (axis 100 to 700) | GP100, GV100
Chart: CUDA Basic Linear Algebra Subprograms (cuBLAS) Benchmark, 2560 x 2048 x 8192 (axis 0.5 to 2.0) | FP16, FP32, FP64
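The 2560 x 2048 x 8192 label gives the matrix-multiply dimensions of the linear algebra benchmark. A dense matrix multiply of that shape costs 2 x M x N x K floating-point operations. The sketch below computes that count and the best-case runtime at the peak rates from the GV100 technical specifications. Which dimension plays the role of M, N or K is not stated on the slide, but it does not change the operation count.

```python
def gemm_flops(m: int, n: int, k: int) -> int:
    """Operations in an (m x k) by (k x n) product: one multiply and one add per term."""
    return 2 * m * n * k

flops = gemm_flops(2560, 2048, 8192)

# Peak rates from the GV100 technical specifications, in TFLOPS.
peak_tflops = {"FP64": 7.4, "FP32": 14.8, "FP16 (Tensor Cores)": 118.5}

# Lower bound on runtime in milliseconds if the GPU sustained its peak rate.
best_case_ms = {name: flops / (rate * 1e12) * 1e3
                for name, rate in peak_tflops.items()}

print(f"{flops / 1e9:.1f} GFLOP per multiply")
for name, ms in best_case_ms.items():
    print(f"{name}: at least {ms:.2f} ms")
```

Measured times will be longer than these bounds, since no real kernel sustains the theoretical peak.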
NVIDIA Quadro GV100 CAE Example | Significant ANSYS Mechanical 19 Acceleration*
*Power Supply Module (V19cg-1). 2x Xeon E5-2699 v4 at 2.2 GHz, 22 cores, HT off, NVIDIA driver 390.40 TCC, 256GB DRAM, CentOS 7.2.1511 64-bit. ANSYS Mechanical 19 benchmark model. Steady state thermal analysis of a power supply module, 5.3 Mdofs, JCG, real-value symmetric.
Power Supply Module (V19cg-1), relative performance (axis 0 to 4)
Configuration | License | Relative performance
4 CPU Cores | Base License | 1.0
3 CPU Cores + GV100 | Base License | 2.65
8 CPU Cores | Base + 4 HPC Licenses | 1.71
8 CPU Cores + GV100 | Base + 5 HPC Licenses | 3.90
16 CPU Cores | Base + 12 HPC Licenses | 2.29
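The ANSYS Mechanical results are easiest to read as ratios between a GPU configuration and a CPU-only one. The sketch below computes three such ratios from the figures on this slide; the helper name `uplift` is illustrative.

```python
# Relative performance from the ANSYS Mechanical chart (4 CPU cores = 1.0).
mechanical = {
    "4 CPU Cores": 1.0,
    "3 CPU Cores + GV100": 2.65,
    "8 CPU Cores": 1.71,
    "8 CPU Cores + GV100": 3.90,
    "16 CPU Cores": 2.29,
}

def uplift(with_gpu: str, cpu_only: str) -> float:
    """Performance of the GPU configuration relative to the CPU-only one."""
    return mechanical[with_gpu] / mechanical[cpu_only]

same_license = uplift("3 CPU Cores + GV100", "4 CPU Cores")  # both run on the Base License
eight_core = uplift("8 CPU Cores + GV100", "8 CPU Cores")     # one extra HPC license
vs_sixteen = uplift("8 CPU Cores + GV100", "16 CPU Cores")    # seven fewer HPC licenses

print(f"{same_license:.2f}x, {eight_core:.2f}x, {vs_sixteen:.2f}x")
```

On these figures, eight cores plus one GV100 outperform the 16-core CPU-only run while using seven fewer HPC licenses.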
NVIDIA Quadro GV100 CAE Example | Standout ANSYS Fluent 19 Acceleration*
*Power Supply Module (V19cg-1). 2x Xeon E5-2699 v4 at 2.2 GHz, 22 cores, HT off, NVIDIA driver 390.40 TCC, 256GB DRAM, CentOS 7.2.1511 64-bit. ANSYS Mechanical 19 benchmark model. Steady state thermal analysis of a power supply module, 5.3 Mdofs, JCG, real-value symmetric.
Pipes Model 9.6 Million Cells, relative performance (axis 0 to 6)
Configuration | License | Relative performance
4 CPU Cores | Base License | 1.0
3 CPU Cores + GV100 | Base License | 1.53
8 CPU Cores | Base + 4 HPC Licenses | 1.78
8 CPU Cores + GV100 | Base + 5 HPC Licenses | 2.67
16 CPU Cores | Base + 12 HPC Licenses | 2.74
16 CPU Cores + 2x GV100 | Base + 5 HPC Licenses | 4.71
32 CPU Cores | Base + 2 HPC Packs | 3.29
32 CPU Cores + 2x GV100 | Base + 2 HPC Packs | 5.55
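The license labels on the ANSYS charts suggest a simple counting rule, which also explains why a GV100 is paired with three CPU cores in the Base License case. The sketch below encodes that rule as an assumption inferred from the slide labels, not as a statement of ANSYS licensing terms: the Base License covers four parallel tasks, and each CPU core or GPU beyond that needs one HPC license. HPC Packs are left out because the slides do not give their sizes.

```python
BASE_TASKS = 4  # assumption: the Base License covers four parallel tasks

def hpc_licenses_needed(cpu_cores: int, gpus: int = 0) -> int:
    """Individual HPC licenses needed beyond the Base License.

    Assumes each CPU core and each GPU counts as one task.
    """
    return max(0, cpu_cores + gpus - BASE_TASKS)

# (cpu_cores, gpus) -> HPC licenses shown on the charts for that configuration.
slide_labels = {(4, 0): 0, (3, 1): 0, (8, 0): 4, (8, 1): 5, (16, 0): 12}

computed = {config: hpc_licenses_needed(*config) for config in slide_labels}
print(computed == slide_labels)
```

Under this rule, swapping one CPU core for a GPU keeps a configuration on the same license, so the gain from the GPU comes at no extra licensing cost.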
NVIDIA Quadro GV100 Rendering Performance | SOLIDWORKS Visualize scales to over 29x faster than CPU*
*Based on 2x GV100, Xeon E5-2697 v3, 14 cores at 2.6 GHz, 32 GB DRAM, Win 10 Pro 64-bit Fall Creators Update and NVIDIA driver version 390.77. Tests run at 4K UHD (3840 x 2160) resolution.
Chart: SOLIDWORKS Visualize performance relative to CPU (axis 0 to 30) | CPU, P2000, P4000, P5000, P6000, GP100, GV100, 2x GP100, 2x GV100
NVIDIA Quadro GV100 Graphics Performance | Up to 1.3x better than previous generation*
*Based on SPECviewperf 12.2.2 results.
Chart: Quadro GP100 vs. Quadro GV100 (axis 0.2 to 1.4) | geomean, 3dsmax, catia, energy, maya, sw, medical, creo, showcase, snx