Post on 31-Jul-2020
AI IS NOT ONLY DRIVING CARS
An overview of use-cases and introduction to DGX
Ralph Hinsche (rhinsche@nvidia.com)
2
AI IS EVERYWHERE
“Find where I parked my car”
“Find the bag I just saw in this magazine”
“What movie should I watch next?”
3
TOUCHING OUR LIVES
Bringing a grandmother closer to her family by bridging the language barrier
Predicting a sick baby’s vitals, such as heart rate, blood pressure and survival rate
Enabling the blind to “see” their surroundings and read emotions on faces
4
AI FOR PUBLIC GOOD
Increasing public safety with smart video surveillance at airports & malls
Providing intelligent services in hotels, banks and stores
Separating weeds while harvesting, reducing chemical usage by 90%
5
“Mobile computing, inexpensive sensors collecting terabytes of data, rise of machine learning that can use that data will fundamentally change the way the global economy is organized.”
Fortune, “CEOs: The Revolution is Coming”, March 8, 2016
6
AI INFERENCING IS EXPLODING
PERSONALIZATION: 2 Trillion Messages Per Day
SPEECH: 500M Daily Active Users of iFlyTek
TRANSLATION: 140 Billion Words Per Day Translated
VIDEO: 60 Billion Video Frames/Day Uploaded on YouTube
7
AI INFERENCE IS THE NEXT GREAT CHALLENGE
Inferencing
DNN Model
Training
8
NEURAL NETWORK COMPLEXITY IS EXPLODING
To Tackle Increasingly Complex Challenges
2015 – Microsoft ResNet, Superhuman Image Recognition: 7 ExaFLOPS, 60 Million Parameters
2016 – Baidu Deep Speech 2, Superhuman Voice Recognition: 20 ExaFLOPS, 300 Million Parameters
2017 – Google Neural Machine Translation, Near Human Language Translation: 100 ExaFLOPS, 8700 Million Parameters
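To give a feel for the 7, 20 and 100 ExaFLOPS figures, the sketch below converts a total compute budget into wall-clock hours. It assumes the slide's "ExaFLOPS" means total training operations (units of 10^18 floating-point operations) rather than a rate. The 120 TFLOPS peak is the Tensor throughput quoted for Tesla V100 later in this deck; the `utilization` parameter and the 50% value are hypothetical, since peak throughput is rarely sustained in practice.

```python
SECONDS_PER_HOUR = 3600


def training_hours(total_exaflops, peak_tflops, utilization=1.0):
    """Hours needed to execute total_exaflops * 1e18 operations.

    utilization is the assumed fraction of peak throughput actually sustained.
    """
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    sustained_flops_per_second = peak_tflops * 1e12 * utilization
    return total_exaflops * 1e18 / sustained_flops_per_second / SECONDS_PER_HOUR


# Compute budgets quoted on the slide; 50% utilization is an illustrative guess.
for exaflops in (7, 20, 100):
    hours = training_hours(exaflops, peak_tflops=120, utilization=0.5)
    print(f"{exaflops:>3} ExaFLOPS on one 120-TFLOPS device at 50%: {hours:.1f} hours")
```

Halving the assumed utilization doubles the estimate, so these numbers are order-of-magnitude guides only.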
9
GPU-ACCELERATED INFERENCE POWERS CHINA CSPS
SPEECH RECOGNITION: 70% of China Market
SPEECH TO CONTEXT: ~1B Users
INTELLIGENT VIDEO ANALYSIS: 1K Channels
LANGUAGE TRANSLATION: ~8B Queries/Day
2.5X Throughput
+20% Accuracy
10X Concurrent Requests Per Server
20X Per-Server Efficiency
3.5X Latency Reduction
3X Requests Serviced
10
JD.COM SUBSIDIARY JDX SELECTS NVIDIA FOR AUTONOMOUS MACHINES
From Warehouse to Door Delivery
jROVER, jDRONE
1 Million Drones by 2022
11
DEEP LEARNING IS SWEEPING ACROSS INDUSTRIES
INTERNET SERVICES: Image/Video classification, Speech recognition, Natural language processing
MEDICINE: Cancer cell detection, Diabetic grading, Drug discovery
MEDIA & ENTERTAINMENT: Video captioning, Content based search, Real time translation
SECURITY & DEFENSE: Face recognition, Video surveillance, Cyber security
AUTONOMOUS MACHINES: Pedestrian detection, Lane tracking, Recognize traffic signs
12
THE NEW SCIENCE OF SPORTS
Predictive analytics, commonly used in business to identify risks and opportunities, is increasingly used by the sports industry to tap into its massive troves of data. Scientists at NYU are applying deep learning and the NVIDIA DGX-1 AI supercomputer to analyze unprecedented amounts of Major League Baseball’s data (four years’ worth of every player’s every move) to ask bigger and better questions that help improve the game.
13
TEACHING A ROBOT TO STAND UP FOR ITSELF
New approaches to AI promise to help scientists build machines with greater autonomy. Researchers at UC Berkeley are tapping into the processing power and integrated software of NVIDIA’s DGX-1 to advance robotics using reinforcement learning. DGX-1 will allow them to iterate faster and ultimately build robots that are able to understand and navigate a diverse and changing world on their own.
14
ACCELERATING DISCOVERIES WITH AI
New drugs typically take 12-14 years and $2.6 billion to bring to market. BenevolentAI is using GPU deep learning for NLP to bring new therapies to market more quickly and more affordably. They’ve automated the process of identifying patterns within large amounts of research literature, enabling scientists to form hypotheses and draw conclusions more quickly than any human researcher could. Using the NVIDIA DGX-1 AI supercomputer, they identified two potential drug targets for Alzheimer’s in less than one month.
15
AI-ACCELERATED CYBER DEFENSE
Our daily life, economic vitality, and national security depend on a stable, safe and resilient cyberspace. But attacks on IT systems are becoming more complex and relentless, resulting in loss of information and money and disruptions to essential services. Accenture’s dedicated cyber security lab uses NVIDIA GPUs, CUDA libraries, and machine learning to accelerate the analysis and visualization of 200M-300M alerts daily so analysts can take timely action.
16
AI IMPROVES THE CUSTOMER EXPERIENCE
AI is dramatically changing the online shopping experience, with tangible improvements for retailers and consumers. In 2016, British online grocery giant Ocado improved customer service with its AI-enhanced contact center. It is also applying machine learning and NVIDIA GPUs to develop humanoid robotics to assist maintenance technicians, and advanced computer vision for image classification and recognition to replace barcode systems. Computer vision will expedite the picking process and better ensure orders are filled correctly, so customers receive exactly what they ordered.
17
THE MODERN WAREHOUSE BUILT ON AI
Worldwide retail e-commerce sales are expected to reach $2 trillion in 2016, according to eMarketer. With thousands of orders placed every hour, data scientists at Zalando, Europe’s leading online fashion retailer, applied deep learning and GPUs to develop the Optimal Cart Pick algorithm. Applying the algorithm resulted in an 11% decrease in workers’ travel time per item picked. The work is a good example of the efficiencies that AI can discover for e-commerce, manufacturing and other large-systems-based industries.
18
AI PREDICTS AND PREVENTS DISEASE
GPU deep learning is giving doctors a life-saving edge by identifying high-risk patients before diseases are diagnosed. Icahn School of Medicine at Mount Sinai built an AI-powered tool, “Deep Patient,” based on NVIDIA GPUs and the CUDA programming model. Deep Patient can analyze a patient’s medical history to predict nearly 80 diseases up to 1 year prior to onset.
19
AI-POWERED WEATHER FORECASTING
Weather forecasting involves processing vast amounts of data to derive predictions that can save lives and protect property. Colorful Clouds is using GPU deep learning to speed the processing of data by 30-50x. Its location-based reporting tool can forecast and communicate weather and air-quality conditions with high accuracy in real time.
20
AN AI MONITOR OF EARTH’S VITALS
The Earth’s climate has changed throughout history, but in recent years there have been record increases in temperature, glacial retreat and rising sea levels. NASA Ames is using satellite imagery to measure the effects of carbon and greenhouse gas emissions on the planet. To do so, they developed DeepSat, a deep learning framework for satellite image classification trained on a GPU-powered supercomputer. The enhanced satellite imagery will help scientists plan to protect ecosystems and help farmers improve crop production.
21
DEFENDING THE PLANET WITH AI
The U.S. government’s Asteroid Grand Challenge seeks to identify asteroid threats to human populations. The team at NASA Frontier Development Labs picked up the challenge by employing GPU deep learning to identify threats and their unique characteristics. The resulting “Deflector Selector” achieved a 98% success rate in determining which technology produced the most successful deflection.
22
AI PLATFORM TO ACCELERATE CANCER RESEARCH
To speed advances in the fight against cancer, the Cancer Moonshot initiative unites the Department of Energy, the National Cancer Institute and other agencies with researchers at Oak Ridge, Lawrence Livermore, Argonne, and Los Alamos National Laboratories. NVIDIA is collaborating with the labs to help accelerate their AI framework called CANDLE as a common discovery platform, with the goal of achieving 10X annual increases in productivity for cancer researchers.
23
NVIDIA DGX SATURNV
Giant Leap Towards Exascale AI
Fastest AI Supercomputer in TOP500: 4.9 Petaflops Peak FP64, 19.6 Petaflops Peak FP16, 13x DGX-1 to get into Top500
Most Energy Efficient Supercomputer: #1 Green500, 9.46 GFLOPS per Watt
Rocket for Cancer Moonshot: CANDLE Development Platform, Common platform with DOE labs – ANL, LLNL, ORNL, LANL
24
TESLA V100
THE MOST ADVANCED DATA CENTER GPU EVER BUILT
5,120 CUDA cores
640 NEW Tensor cores
7.5 FP64 TFLOPS | 15 FP32 TFLOPS
120 Tensor TFLOPS
20MB SM RF | 16MB Cache | 16GB HBM2 @ 900 GB/s
300 GB/s NVLink
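The throughput figures on this slide are consistent with each other, which a short calculation can show. The sketch below rests on two assumptions not stated on the slide: that one fused multiply-add counts as 2 floating-point operations, and that each Tensor core completes a 4x4x4 matrix multiply-accumulate (64 fused multiply-adds) per clock. The clock frequency is not given in the source, so it is derived from the 15 FP32 TFLOPS figure rather than taken from a datasheet.

```python
# Figures quoted on the slide.
CUDA_CORES = 5120
TENSOR_CORES = 640
FP32_TFLOPS = 15.0

# Assumptions (not stated on the slide).
FLOPS_PER_FMA = 2                        # one multiply plus one add
FMAS_PER_TENSOR_CORE_CLOCK = 4 * 4 * 4   # one 4x4x4 multiply-accumulate per clock


def implied_clock_hz(fp32_tflops, cuda_cores):
    """Clock rate at which cuda_cores FMA units deliver fp32_tflops."""
    return fp32_tflops * 1e12 / (cuda_cores * FLOPS_PER_FMA)


def tensor_tflops(tensor_cores, clock_hz):
    """Peak Tensor core throughput in TFLOPS at the given clock."""
    per_clock = tensor_cores * FMAS_PER_TENSOR_CORE_CLOCK * FLOPS_PER_FMA
    return per_clock * clock_hz / 1e12


clock = implied_clock_hz(FP32_TFLOPS, CUDA_CORES)
print(f"implied clock: {clock / 1e9:.3f} GHz")
print(f"tensor throughput: {tensor_tflops(TENSOR_CORES, clock):.1f} TFLOPS")
```

Under these assumptions the Tensor cores deliver eight times the FP32 rate, which matches the quoted 120 Tensor TFLOPS. The 7.5 FP64 TFLOPS figure is likewise half of the FP32 figure.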
25
NEW TENSOR CORE BUILT FOR AI
Delivering 120 TFLOPS of DL Performance
ALL MAJOR FRAMEWORKS
VOLTA-OPTIMIZED cuDNN
MATRIX DATA OPTIMIZATION: Dense Matrix of Tensor Compute
TENSOR-OP CONVERSION: FP32 to Tensor Op Data for Frameworks
VOLTA TENSOR CORE: 4x4 matrix processing array
D[FP32] = A[FP16] * B[FP16] + C[FP32]
Optimized For Deep Learning
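The mixed-precision operation D[FP32] = A[FP16] * B[FP16] + C[FP32] can be emulated on a CPU to see where rounding occurs. The following is a minimal sketch, not a description of the hardware: it rounds the inputs to half precision, forms each product exactly, and rounds the running sum to single precision after every addition. That accumulation order is an assumption, and the helper names (`to_fp16`, `to_fp32`, `tensor_core_fma`) are hypothetical.

```python
import struct


def to_fp16(x):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_fp32(x):
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def tensor_core_fma(a, b, c, n=4):
    """Emulate D = A * B + C for n x n matrices with FP16 inputs and FP32 accumulation."""
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            acc = to_fp32(c[i][j])
            for k in range(n):
                # The product of two FP16 values fits exactly in FP32.
                product = to_fp16(a[i][k]) * to_fp16(b[k][j])
                acc = to_fp32(acc + product)
            d[i][j] = acc
    return d


identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
b = [[0.1 * (i + j) for j in range(4)] for i in range(4)]
zero = [[0.0] * 4 for _ in range(4)]
d = tensor_core_fma(identity, b, zero)
print("input 0.1 is stored as", to_fp16(0.1))
print("first row of D:", d[0])
```

Multiplying by the identity returns B as it looks after FP16 rounding, which shows that the precision loss comes from the inputs, while the accumulation in FP32 keeps the sums from degrading further.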
26
REVOLUTIONARY AI PERFORMANCE
3X Faster DL Training Performance

Over 80x DL Training Performance in 3 Years
[Chart: GoogLeNet Training Performance (Speedup vs. K80), speedup axis from 0x to 100x, time axis from Q1 2015 to Q2 2017. Configurations shown: 1x K80 with cuDNN2, 4x M40 with cuDNN3, 8x P100 with cuDNN6, 8x V100 with cuDNN7.]

85% Scale-Out Efficiency
Scales to 64 GPUs with Microsoft Cognitive Toolkit
[Chart: Multi-Node Training with NCCL 2.0 (ResNet-50)]
64x V100: 1 Hour
8x V100: 7.4 Hours
8x P100: 18 Hours
ResNet50 Training for 90 Epochs with 1.28M images dataset | Cognitive Toolkit with NCCL 2.0 | V100 performance measured on pre-production hardware.

3X Reduction in Time to Train Over P100
[Chart: LSTM Training (Neural Machine Translation)]
1x V100: 6 Hours
1x P100: 18 Hours
2x CPU: 15 Days
Neural Machine Translation Training for 13 Epochs | German -> English, WMT15 subset | CPU = 2x Xeon E5 2699 V4 | V100 performance measured on pre-production hardware.
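The time-to-train figures on this slide can be turned into speedups and a scaling efficiency with a few lines of arithmetic. The sketch below uses only the rounded values shown in the charts and assumes each time belongs to the configuration that orders them plausibly (more or newer GPUs means less time). The function names are hypothetical.

```python
def speedup(baseline_hours, accelerated_hours):
    """How many times faster the accelerated run is than the baseline."""
    return baseline_hours / accelerated_hours


def scaling_efficiency(hours_small, gpus_small, hours_large, gpus_large):
    """Fraction of ideal linear speedup achieved when going to more GPUs."""
    ideal_speedup = gpus_large / gpus_small
    return speedup(hours_small, hours_large) / ideal_speedup


# Time-to-train values as read from the charts (rounded on the slide).
RESNET50_HOURS = {"8x P100": 18.0, "8x V100": 7.4, "64x V100": 1.0}
NMT_HOURS = {"2x CPU": 15 * 24.0, "1x P100": 18.0, "1x V100": 6.0}

print("NMT, V100 vs P100:", speedup(NMT_HOURS["1x P100"], NMT_HOURS["1x V100"]))
print("NMT, V100 vs CPU:", speedup(NMT_HOURS["2x CPU"], NMT_HOURS["1x V100"]))
print("ResNet-50, 8x V100 vs 8x P100:",
      round(speedup(RESNET50_HOURS["8x P100"], RESNET50_HOURS["8x V100"]), 2))
print("ResNet-50, 8 -> 64 V100 efficiency:",
      round(scaling_efficiency(RESNET50_HOURS["8x V100"], 8,
                               RESNET50_HOURS["64x V100"], 64), 3))
```

The NMT numbers reproduce the headline 3X reduction over P100. The 8-to-64-GPU step works out to roughly 92% with these rounded values. The slide's 85% figure presumably refers to a different baseline or to unrounded timings, which the source does not state.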
27
TUNE IN TO THE LATEST AI NEWS
NVIDIA AI Twitter
AI on Nvidia.com
AI Newsletter sign up
AI Podcast
Deep Learning blog
30
Save additional 25% with the promo code: RalphHinscheGTCEU17
Special student promo code: STU1GTCEU17 (1 day, €75), STUFGTCEU17 (3 day, €165)
32
BACK UP
33
A NEW COMPUTING MODEL
Algorithms that learn from examples
TRADITIONAL APPROACH (Expert Written Computer Program)
Requires domain experts
Time consuming
Error prone
Not scalable to new problems
DEEP LEARNING APPROACH (Deep Neural Network)
Learn from data
Easy to extend
Speedup with GPUs
Example output labels for both approaches: Car, Vehicle, Coupe
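The contrast between an expert-written program and an algorithm that learns from examples can be illustrated with a toy. The sketch below trains a single artificial neuron (a perceptron), which is far simpler than the deep neural networks this slide refers to, to separate coupes from other cars. The features (number of doors, roof height in metres) and all data values are hypothetical and chosen only for illustration.

```python
def predict(weights, bias, features):
    """Return 1 (coupe) if the weighted sum is positive, else 0."""
    activation = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if activation > 0 else 0


def train_perceptron(examples, max_epochs=100):
    """Learn weights from labelled examples instead of hand-writing rules."""
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for features, label in examples:
            error = label - predict(weights, bias, features)
            if error:
                mistakes += 1
                weights = [w + error * x for w, x in zip(weights, features)]
                bias += error
        if mistakes == 0:  # every training example is classified correctly
            break
    return weights, bias


# Hypothetical toy data: (doors, roof height in metres) -> 1 for coupe, 0 otherwise.
EXAMPLES = [
    ((2, 1.30), 1),
    ((2, 1.35), 1),
    ((4, 1.45), 0),
    ((4, 1.50), 0),
    ((5, 1.60), 0),
]

weights, bias = train_perceptron(EXAMPLES)
print("learned weights:", weights, "bias:", bias)
print("2-door, 1.32 m ->", predict(weights, bias, (2, 1.32)))
print("4-door, 1.48 m ->", predict(weights, bias, (4, 1.48)))
```

No rule about doors or roof height is written by hand. Extending the classifier to a new category means supplying new labelled examples rather than new code, which is the point the slide makes about scalability.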
34
DEEP LEARNING
How it works