Yangqing Jia at AI Frontiers: Towards Better DL Frameworks
Transcript of Yangqing Jia at AI Frontiers: Towards Better DL Frameworks
![Page 1: Yangqing Jia at AI Frontiers: Towards Better DL Frameworks](https://reader031.fdocuments.in/reader031/viewer/2022030218/5886c3a81a28abcc7d8b58d9/html5/thumbnails/1.jpg)
Towards Better DL Frameworks
Yangqing Jia, Research Lead on AI Platforms, Facebook
![Page 2: Yangqing Jia at AI Frontiers: Towards Better DL Frameworks](https://reader031.fdocuments.in/reader031/viewer/2022030218/5886c3a81a28abcc7d8b58d9/html5/thumbnails/2.jpg)
Source: XKCD, [Girshick et al. CVPR 2014]
![Page 3: Yangqing Jia at AI Frontiers: Towards Better DL Frameworks](https://reader031.fdocuments.in/reader031/viewer/2022030218/5886c3a81a28abcc7d8b58d9/html5/thumbnails/3.jpg)
The Needs: Two sides of the same coin
• Researchers: "I will need to reproduce the ResNet paper."
• Companies: "I need to apply DL to drive cars."
![Page 4: Yangqing Jia at AI Frontiers: Towards Better DL Frameworks](https://reader031.fdocuments.in/reader031/viewer/2022030218/5886c3a81a28abcc7d8b58d9/html5/thumbnails/4.jpg)
Democratizing Deep Learning w/ Caffe: Getting AlexNet running in 10 mins
• A grad student driven project
• Started by doing one job really well: image classification
• Adopted by industry participants
• Popular deep learning framework run by a non-profit.
Yet very minimal (10k LOC)
![Page 5: Yangqing Jia at AI Frontiers: Towards Better DL Frameworks](https://reader031.fdocuments.in/reader031/viewer/2022030218/5886c3a81a28abcc7d8b58d9/html5/thumbnails/5.jpg)
http://caffe.berkeleyvision.org/
![Page 6: Yangqing Jia at AI Frontiers: Towards Better DL Frameworks](https://reader031.fdocuments.in/reader031/viewer/2022030218/5886c3a81a28abcc7d8b58d9/html5/thumbnails/6.jpg)
What makes a better DL library?
???
![Page 7: Yangqing Jia at AI Frontiers: Towards Better DL Frameworks](https://reader031.fdocuments.in/reader031/viewer/2022030218/5886c3a81a28abcc7d8b58d9/html5/thumbnails/7.jpg)
"MAPS"! !!
![Page 8: Yangqing Jia at AI Frontiers: Towards Better DL Frameworks](https://reader031.fdocuments.in/reader031/viewer/2022030218/5886c3a81a28abcc7d8b58d9/html5/thumbnails/8.jpg)
"MAPS"-
Scalability
![Page 9: Yangqing Jia at AI Frontiers: Towards Better DL Frameworks](https://reader031.fdocuments.in/reader031/viewer/2022030218/5886c3a81a28abcc7d8b58d9/html5/thumbnails/9.jpg)
Scalability: Run fast, run far
“How do I train on multiple GPUs and machines?”
- Probably the most frequent question we got from Caffe users
![Page 10: Yangqing Jia at AI Frontiers: Towards Better DL Frameworks](https://reader031.fdocuments.in/reader031/viewer/2022030218/5886c3a81a28abcc7d8b58d9/html5/thumbnails/10.jpg)
Scalability: Run fast, run far
L1 L2 L3 L3b L2b L1b U3 U2 U1
![Page 11: Yangqing Jia at AI Frontiers: Towards Better DL Frameworks](https://reader031.fdocuments.in/reader031/viewer/2022030218/5886c3a81a28abcc7d8b58d9/html5/thumbnails/11.jpg)
Scalability: Run fast, run far
L1 L2 L3 L3b L2b L1b U3 U2 U1 R3 R2 R1
![Page 12: Yangqing Jia at AI Frontiers: Towards Better DL Frameworks](https://reader031.fdocuments.in/reader031/viewer/2022030218/5886c3a81a28abcc7d8b58d9/html5/thumbnails/12.jpg)
Scalability: Run fast, run far
L1 L2 L3 L3b L2b L1b U3 U2 U1 R3 R2 R1
L1 L2 L3 L3b L2b L1b U3 U2 U1 R3 R2 R1
![Page 13: Yangqing Jia at AI Frontiers: Towards Better DL Frameworks](https://reader031.fdocuments.in/reader031/viewer/2022030218/5886c3a81a28abcc7d8b58d9/html5/thumbnails/13.jpg)
Scalability: Run fast, run far
L1 L2 L3 L3b L2b L1b
U3 U2 U1 R3 R2 R1
L1 L2 L3 L3b L2b L1b
U3 U2 U1 R3 R2 R1
![Page 14: Yangqing Jia at AI Frontiers: Towards Better DL Frameworks](https://reader031.fdocuments.in/reader031/viewer/2022030218/5886c3a81a28abcc7d8b58d9/html5/thumbnails/14.jpg)
The Return of MPI: "I'm your father", said Allreduce.
Allreduce
• Tree based - O(M log N)
• Ring based - O(M)
• etc.
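To make the ring-based variant concrete, here is a minimal single-process sketch in Python. It only simulates the message pattern with lists; the function name `ring_allreduce` and the chunking scheme are illustrative assumptions, not the API of MPI, NCCL, or any framework named in the talk. Each of the N workers sends roughly 2(N-1)/N × M elements in total, which is why the cost stays O(M) as N grows.

```python
def ring_allreduce(buffers):
    """Simulate a sum-allreduce over N workers arranged in a ring.

    buffers: list of N equal-length lists, one per (hypothetical) worker.
    Returns a new list of N lists, each holding the element-wise sum.
    """
    n = len(buffers)
    length = len(buffers[0])
    # Split the index range into n contiguous chunks (some may be empty).
    bounds = [(i * length) // n for i in range(n + 1)]
    data = [list(b) for b in buffers]

    def chunk(c):
        return range(bounds[c], bounds[c + 1])

    # Phase 1: reduce-scatter. In step s, worker r sends chunk (r - s) % n
    # to its right neighbour, which adds it into its own copy.
    for s in range(n - 1):
        sends = [(r, (r - s) % n) for r in range(n)]
        payloads = [[data[r][i] for i in chunk(c)] for r, c in sends]
        for (r, c), payload in zip(sends, payloads):
            dst = (r + 1) % n
            for i, v in zip(chunk(c), payload):
                data[dst][i] += v

    # Worker r now holds the fully reduced chunk (r + 1) % n.
    # Phase 2: allgather. Reduced chunks travel once more around the ring.
    for s in range(n - 1):
        sends = [(r, (r + 1 - s) % n) for r in range(n)]
        payloads = [[data[r][i] for i in chunk(c)] for r, c in sends]
        for (r, c), payload in zip(sends, payloads):
            dst = (r + 1) % n
            for i, v in zip(chunk(c), payload):
                data[dst][i] = v

    return data
```

In a real system each step's sends would happen in parallel over the network; the snapshot of `payloads` before applying them mimics that simultaneity.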
![Page 15: Yangqing Jia at AI Frontiers: Towards Better DL Frameworks](https://reader031.fdocuments.in/reader031/viewer/2022030218/5886c3a81a28abcc7d8b58d9/html5/thumbnails/15.jpg)
Scalability: Sitting on top of giants
... and many more
![Page 16: Yangqing Jia at AI Frontiers: Towards Better DL Frameworks](https://reader031.fdocuments.in/reader031/viewer/2022030218/5886c3a81a28abcc7d8b58d9/html5/thumbnails/16.jpg)
"MAPS"-
Portability
![Page 17: Yangqing Jia at AI Frontiers: Towards Better DL Frameworks](https://reader031.fdocuments.in/reader031/viewer/2022030218/5886c3a81a28abcc7d8b58d9/html5/thumbnails/17.jpg)
Portable System: Cloud, Mobile, IoT, Cars, Drones, Coffee makers
AI Math and Algorithms
Deployment Platforms
![Page 18: Yangqing Jia at AI Frontiers: Towards Better DL Frameworks](https://reader031.fdocuments.in/reader031/viewer/2022030218/5886c3a81a28abcc7d8b58d9/html5/thumbnails/18.jpg)
![Page 19: Yangqing Jia at AI Frontiers: Towards Better DL Frameworks](https://reader031.fdocuments.in/reader031/viewer/2022030218/5886c3a81a28abcc7d8b58d9/html5/thumbnails/19.jpg)
Portable System: Cloud, Mobile, IoT, Cars, Drones, Coffee makers
Model
auto predictor = caffe2::Predictor(model_file)
public class Predictor implements Caffe2ModelInterface;
![Page 20: Yangqing Jia at AI Frontiers: Towards Better DL Frameworks](https://reader031.fdocuments.in/reader031/viewer/2022030218/5886c3a81a28abcc7d8b58d9/html5/thumbnails/20.jpg)
Portable System Challenges
Still, a lot of thought needed
• Limited computation
• Battery life is a thing
• Our models may be luxurious
• Ecosystem less developed
![Page 21: Yangqing Jia at AI Frontiers: Towards Better DL Frameworks](https://reader031.fdocuments.in/reader031/viewer/2022030218/5886c3a81a28abcc7d8b58d9/html5/thumbnails/21.jpg)
![Page 22: Yangqing Jia at AI Frontiers: Towards Better DL Frameworks](https://reader031.fdocuments.in/reader031/viewer/2022030218/5886c3a81a28abcc7d8b58d9/html5/thumbnails/22.jpg)
"MAPS"-
Augmented Comp Patterns
![Page 23: Yangqing Jia at AI Frontiers: Towards Better DL Frameworks](https://reader031.fdocuments.in/reader031/viewer/2022030218/5886c3a81a28abcc7d8b58d9/html5/thumbnails/23.jpg)
Augmented Comp Patterns: Forget about float dense math, the world is bigger
• Quantized Computation
• Sparse Math Libraries
• Model Compression
• Rethinking Existing Operations
![Page 24: Yangqing Jia at AI Frontiers: Towards Better DL Frameworks](https://reader031.fdocuments.in/reader031/viewer/2022030218/5886c3a81a28abcc7d8b58d9/html5/thumbnails/24.jpg)
Quantized Computation: Forget about float, the world is bigger
• float: 8 exponent bits, 23 mantissa bits
• fp16: 5 exponent bits, 10 mantissa bits
• fixed16: 16 bits
• fixed8: 8 bits
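As a rough illustration of what moving from float to an 8-bit representation involves, the sketch below uses a simple min/max affine mapping to the range 0..255. This scheme and the names `quantize_uint8` and `dequantize` are assumptions made for the example; the slides do not say which quantization method is meant, and libraries such as gemmlowp have their own conventions.

```python
def quantize_uint8(values):
    """Map floats to integers in 0..255 with a min/max affine scheme.

    Returns (codes, scale, offset) so that value ~= offset + scale * code.
    """
    lo, hi = min(values), max(values)
    # Guard against a zero range (all values equal).
    scale = (hi - lo) / 255 or 1.0
    codes = [round((v - lo) / scale) for v in values]
    return codes, scale, lo


def dequantize(codes, scale, offset):
    """Recover approximate floats from 8-bit codes."""
    return [offset + scale * c for c in codes]
```

With this mapping the round-trip error per value is at most half a quantization step (`scale / 2`), which gives a feel for the precision traded away for the smaller representation.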
![Page 25: Yangqing Jia at AI Frontiers: Towards Better DL Frameworks](https://reader031.fdocuments.in/reader031/viewer/2022030218/5886c3a81a28abcc7d8b58d9/html5/thumbnails/25.jpg)
Quantized Computation: Forget about float, the world is bigger
• float add: 0.9
• fp16 add: 0.4
• fixed16 add: 0.05
• fixed8 add: 0.03
• float mul: 4.0
• fp16 mul: 1.0
• fixed8 mul: 0.2
![Page 26: Yangqing Jia at AI Frontiers: Towards Better DL Frameworks](https://reader031.fdocuments.in/reader031/viewer/2022030218/5886c3a81a28abcc7d8b58d9/html5/thumbnails/26.jpg)
Why?
Source: Nvidia https://devblogs.nvidia.com/parallelforall/mixed-precision-programming-cuda-8/
![Page 27: Yangqing Jia at AI Frontiers: Towards Better DL Frameworks](https://reader031.fdocuments.in/reader031/viewer/2022030218/5886c3a81a28abcc7d8b58d9/html5/thumbnails/27.jpg)
Rethinking Existing Operations: ResNeXt is coming to town
[Figure: group convolution (gconv) in AlexNet and in ResNeXt]
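For readers unfamiliar with group convolution, the following sketch shows the core idea: input channels are split into groups and each group is convolved independently, which divides the weight count by the number of groups. The helper names `conv_params` and `grouped_conv1x1` are hypothetical, and the 1x1 single-pixel case is chosen only to keep the example short.

```python
def conv_params(c_in, c_out, k, groups=1):
    """Weight count (no bias) of a k x k convolution with the given groups."""
    if c_in % groups or c_out % groups:
        raise ValueError("channel counts must be divisible by groups")
    return (c_in // groups) * c_out * k * k


def grouped_conv1x1(x, weights, groups):
    """Apply a grouped 1x1 convolution at a single pixel.

    x: list of c_in channel values.
    weights[g]: list of rows, each row holding c_in / groups weights
                that produce one output channel of group g.
    """
    c_in_g = len(x) // groups
    out = []
    for g in range(groups):
        # Each group only sees its own slice of the input channels.
        xs = x[g * c_in_g:(g + 1) * c_in_g]
        for row in weights[g]:
            out.append(sum(w * v for w, v in zip(row, xs)))
    return out
```

For example, a 3x3 convolution from 128 to 128 channels has `conv_params(128, 128, 3)` weights without grouping, and one thirty-second of that with `groups=32`.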
![Page 28: Yangqing Jia at AI Frontiers: Towards Better DL Frameworks](https://reader031.fdocuments.in/reader031/viewer/2022030218/5886c3a81a28abcc7d8b58d9/html5/thumbnails/28.jpg)
Augmented Math Challenges: Forget about float, the world is bigger
• Solutions
  • Eigen fp16
  • CuDNN
  • NNPack
  • gemmlowp
• Challenges
  • Seamless conversion?
  • Model training?
  • Performance tuning?
  • ...
![Page 29: Yangqing Jia at AI Frontiers: Towards Better DL Frameworks](https://reader031.fdocuments.in/reader031/viewer/2022030218/5886c3a81a28abcc7d8b58d9/html5/thumbnails/29.jpg)
"MAPS"-
Modularity
![Page 30: Yangqing Jia at AI Frontiers: Towards Better DL Frameworks](https://reader031.fdocuments.in/reader031/viewer/2022030218/5886c3a81a28abcc7d8b58d9/html5/thumbnails/30.jpg)
A Repeated Pattern
Many key components in deep learning are
reusable across frameworks.
![Page 31: Yangqing Jia at AI Frontiers: Towards Better DL Frameworks](https://reader031.fdocuments.in/reader031/viewer/2022030218/5886c3a81a28abcc7d8b58d9/html5/thumbnails/31.jpg)
In 2013 it used to be...
Caffe Torch Theano ...
![Page 32: Yangqing Jia at AI Frontiers: Towards Better DL Frameworks](https://reader031.fdocuments.in/reader031/viewer/2022030218/5886c3a81a28abcc7d8b58d9/html5/thumbnails/32.jpg)
Unix Philosophy? or, "UnFramework"
• Applications: Caffe, Torch, TF, MXNet, etc...
• Core Math: Eigen, CuDNN, NNPack, THNN, MKL
• Comms: NCCL, MPI, ZeroMQ, Redis, ...
• Low Level: CUDA, OpenGL, OpenCL, Vulkan, ...
• Compilers
• DataBases: LevelDB, RocksDB, Hadoop, Amazon S3, your old disk
![Page 33: Yangqing Jia at AI Frontiers: Towards Better DL Frameworks](https://reader031.fdocuments.in/reader031/viewer/2022030218/5886c3a81a28abcc7d8b58d9/html5/thumbnails/33.jpg)
MAPS for a good framework
• Modular Designs: Interface to Existing Toolkits
• Augmented Mathematics: Optimized Math Libraries
• Portable System: Efficient Mobile Runtimes
• Scalability: Tuned Collective Primitives
+ Flexible Framework Design
![Page 34: Yangqing Jia at AI Frontiers: Towards Better DL Frameworks](https://reader031.fdocuments.in/reader031/viewer/2022030218/5886c3a81a28abcc7d8b58d9/html5/thumbnails/34.jpg)
No Silver Bullet?
![Page 35: Yangqing Jia at AI Frontiers: Towards Better DL Frameworks](https://reader031.fdocuments.in/reader031/viewer/2022030218/5886c3a81a28abcc7d8b58d9/html5/thumbnails/35.jpg)
There is no silver bullet
• Industry: Stability, Scale & speed, Data Integration, Relatively Fixed
• Research: Flexible, Fast Iteration, Debuggable, Relatively bare-bone
Caffe, Torch, Theano, TensorFlow, D4J etc.
![Page 36: Yangqing Jia at AI Frontiers: Towards Better DL Frameworks](https://reader031.fdocuments.in/reader031/viewer/2022030218/5886c3a81a28abcc7d8b58d9/html5/thumbnails/36.jpg)
There is no silver bullet
• Industry: Stability, Scale & speed, Data Integration, Relatively Fixed
• Research: Flexible, Fast Iteration, Debuggable, Relatively bare-bone
Caffe, Torch
![Page 37: Yangqing Jia at AI Frontiers: Towards Better DL Frameworks](https://reader031.fdocuments.in/reader031/viewer/2022030218/5886c3a81a28abcc7d8b58d9/html5/thumbnails/37.jpg)
“In open source, we feel strongly that to really do something well,
you have to get a lot of people involved.”
— Linus Torvalds
![Page 38: Yangqing Jia at AI Frontiers: Towards Better DL Frameworks](https://reader031.fdocuments.in/reader031/viewer/2022030218/5886c3a81a28abcc7d8b58d9/html5/thumbnails/38.jpg)
Thank you!
Towards Better Deep Learning Frameworks
Yangqing Jia, Research Lead on AI Platforms, Facebook