Deep learning with C++ - an introduction to tiny-dnn

Posted on 21-Apr-2017



Deep learning with C++: an introduction to tiny-dnn

by Taiga Nomi, embedded software engineer, Osaka, Japan

deep learning

Icons made by Freepik from www.flaticon.com are licensed under CC BY 3.0

Facial recognition

Image understanding

Finance

Game playing

Translation

Robotics

Drug discovery

Text recognition

Video processing

Text generation

Deep learning

- Learning a complicated function from a large amount of data

- Composed of trainable, simple mathematical functions

[Diagram] Input (text, audio, image, video, ...) -> Trainable Building Blocks -> Output (text, audio, image, video, ...)

deep learning framework

A modern deep learning framework for C++ programmers

1,400 stars / 500 forks / 35 contributors / 100 clones per day

Google Summer of Code projects built around tiny-dnn:

“A Modern Deep Learning module” by Edgar Riba

“Deep Learning with Quantization for Semantic Saliency Detection” by Yida Wang

https://summerofcode.withgoogle.com/archive/

1. Easy to introduce

2. Simple syntax

3. Extensible backends

1. Easy to introduce

- Just put the following line into your .cpp file (a minimal end-to-end sketch follows below)

tiny-dnn is header-only - no installation

tiny-dnn is dependency-free - No prerequisites

#include <tiny_dnn/tiny_dnn.h>
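As a minimal sketch of what that buys you, the following toy program builds and trains a tiny network on XOR-style data. It assumes the tiny-dnn 1.0-style API (activation as a template parameter, fit<mse>, adagrad); exact names can differ between tiny-dnn versions, so treat this as illustrative rather than canonical.

#include <iostream>
#include <vector>
#include <tiny_dnn/tiny_dnn.h>

using namespace tiny_dnn;
using namespace tiny_dnn::activation;

int main() {
  // a tiny 2-3-1 network on XOR-style toy data
  network<sequential> net;
  net << fully_connected_layer<tan_h>(2, 3)
      << fully_connected_layer<tan_h>(3, 1);

  std::vector<vec_t> X = { {0, 0}, {0, 1}, {1, 0}, {1, 1} };
  std::vector<vec_t> y = { {0},    {1},    {1},    {0}    };

  adagrad opt;
  net.fit<mse>(opt, X, y, 4 /*batch_size*/, 1000 /*epochs*/);

  // after training, the prediction for {0, 1} should move toward 1
  std::cout << net.predict(X[1])[0] << std::endl;
  return 0;
}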

1. Easy to introduce

- You can bring deep learning to any target for which you have a C++ compiler

- Officially supported (by CI builds):

- Windows (msvc2013 32/64bit, msvc2015 32/64bit)

- Linux (gcc4.9, clang3.5)

- OSX (LLVM 7.3)

- tiny-dnn might run on other compilers that support C++11

1. Easy to introduce

- A Caffe model converter is also available

- TensorFlow converter - coming soon!

- Close the gap between researchers and engineers (a usage sketch follows below)
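For illustration, importing a converted Caffe model might look roughly like this. The header path and function names follow the Caffe importer example in tiny-dnn's documentation; treat them as assumptions and check tiny_dnn/io in your tiny-dnn version (protobuf is required).

#include <tiny_dnn/tiny_dnn.h>
#include <tiny_dnn/io/caffe/layer_factory.h>

// build the network topology from a Caffe prototxt, then load the
// trained weights from the corresponding .caffemodel file
auto net = create_net_from_caffe_prototxt("lenet.prototxt");
reload_weight_from_caffe_protobinary("lenet.caffemodel", net.get());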

1. Easy to introduce

2. Simple syntax

3. Extensible backends

2. Simple syntax

Example: Multi-layer perceptron

Caffe prototxt

input: "data"input_shape { dim: 1 dim: 1 dim: 1 dim: 20}layer { name: "ip1" type: "InnerProduct" inner_product_param { num_output: 100 } bottom: "ip1" top: "ip2"}layer { name: "a1" type: "TanH" bottom: "ip1" top: "ip1"}layer { name: "ip2" type: "InnerProduct" inner_product_param { num_output: 10 } bottom: "ip1" top: "out"}layer { name: "a1" type: "TanH" bottom: "out" top: "out"}

TensorFlow

w1 = tf.Variable(tf.random_normal([10, 100]))
w2 = tf.Variable(tf.random_normal([100, 20]))
b1 = tf.Variable(tf.random_normal([100]))
b2 = tf.Variable(tf.random_normal([20]))

layer1 = tf.add(tf.matmul(x, w1), b1)
layer1 = tf.nn.relu(layer1)
layer2 = tf.add(tf.matmul(layer1, w2), b2)
layer2 = tf.nn.relu(layer2)

Keras

model = Sequential([
    Dense(100, input_dim=10),
    Activation('relu'),
    Dense(20),
    Activation('relu'),
])

tiny-dnn

network<sequential> net;
net << dense<relu>(10, 100) << dense<relu>(100, 20);

tiny-dnn, another solution

auto net = make_mlp<relu>({10, 100, 20});

- Modern C++ lets us keep the code simple: type inference and initializer lists (see the sketch below)
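As an illustration of how an initializer list enables the one-liner above, a make_mlp-style factory could be sketched roughly as follows. This is a simplified, hypothetical implementation for explanation only, not tiny-dnn's actual code, and it reuses the 1.0-style fully_connected_layer<Activation> template shown earlier.

#include <initializer_list>
#include <iterator>

template <typename Activation>
network<sequential> make_mlp_sketch(std::initializer_list<size_t> units) {
  network<sequential> net;
  if (units.size() < 2) return net;        // nothing to connect
  auto in = units.begin();
  for (auto out = std::next(in); out != units.end(); ++in, ++out) {
    // connect consecutive sizes with fully connected layers
    net << fully_connected_layer<Activation>(*in, *out);
  }
  return net;
}

// usage, with the type deduced by auto:
// auto net = make_mlp_sketch<relu>({10, 100, 20});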

2. Simple syntax

Example: Convolutional Neural Networks

Caffe prototxt

name: "LeNet"layer { name: "data" type: "Input" top: "data" input_param { shape: { dim: 64 dim: 1 dim: 28 dim: 28 } }}layer { name: "conv1" type: "Convolution" bottom: "data" top: "conv1" param { lr_mult: 1 } param { lr_mult: 2 } convolution_param { num_output: 20 kernel_size: 5 stride: 1 weight_filler { type: "xavier" } bias_filler { type: "constant" } }}layer { name: "pool1" type: "Pooling" bottom: "conv1" top: "pool1" pooling_param { pool: MAX kernel_size: 2 stride: 2 }}layer { name: "conv2" type: "Convolution" bottom: "pool1" top: "conv2" param { lr_mult: 1 } param { lr_mult: 2 } convolution_param { num_output: 50 kernel_size: 5 stride: 1 weight_filler { type: "xavier" } bias_filler { type: "constant" } }}layer { name: "pool2" type: "Pooling" bottom: "conv2" top: "pool2" pooling_param { pool: MAX kernel_size: 2 stride: 2 }}layer { name: "ip1" type: "InnerProduct" bottom: "pool2" top: "ip1" param { lr_mult: 1 } param { lr_mult: 2 } inner_product_param { num_output: 500 weight_filler { type: "xavier" } bias_filler { type: "constant" } }}layer { name: "relu1" type: "ReLU" bottom: "ip1" top: "ip1"}layer { name: "ip2" type: "InnerProduct" bottom: "ip1" top: "ip2" param { lr_mult: 1 } param { lr_mult: 2 } inner_product_param { num_output: 10 weight_filler { type: "xavier" } bias_filler { type: "constant" } }}layer { name: "prob" type: "Softmax" bottom: "ip2" top: "prob"}

}}layer { name: "conv2" type: "Convolution" bottom: "pool1" top: "conv2" param { lr_mult: 1 } param { lr_mult: 2 } convolution_param { num_output: 50 kernel_size: 5 stride: 1 weight_filler { type: "xavier" } bias_filler { type: "constant" } }}layer { name: "pool2" type: "Pooling" bottom: "conv2" top: "pool2" pooling_param { pool: MAX kernel_size: 2 stride: 2 }}layer { name: "ip1" type: "InnerProduct" bottom: "pool2" top: "ip1" param { lr_mult: 1 } param { lr_mult: 2 } inner_product_param { num_output: 500 weight_filler { type: "xavier" } bias_filler { type: "constant" } }}layer { name: "relu1" type: "ReLU" bottom: "ip1" top: "ip1"}layer { name: "ip2" type: "InnerProduct" bottom: "ip1" top: "ip2" param { lr_mult: 1 } param { lr_mult: 2 } inner_product_param { num_output: 10 weight_filler { type: "xavier" } bias_filler { type: "constant" } }}layer { name: "prob" type: "Softmax" bottom: "ip2" top: "prob"}

param { lr_mult: 1 } param { lr_mult: 2 } inner_product_param { num_output: 500 weight_filler { type: "xavier" } bias_filler { type: "constant" } }}layer { name: "relu1" type: "ReLU" bottom: "ip1" top: "ip1"}layer { name: "ip2" type: "InnerProduct" bottom: "ip1" top: "ip2" param { lr_mult: 1 } param { lr_mult: 2 } inner_product_param { num_output: 10 weight_filler { type: "xavier" } bias_filler { type: "constant" } }}layer { name: "prob" type: "Softmax" bottom: "ip2" top: "prob"}

}}layer { name: "relu1" type: "ReLU" bottom: "ip1" top: "ip1"}layer { name: "ip2" type: "InnerProduct" bottom: "ip1" top: "ip2" param { lr_mult: 1 } param { lr_mult: 2 } inner_product_param { num_output: 10 weight_filler { type: "xavier" } bias_filler { type: "constant" } }}layer { name: "prob" type: "Softmax" bottom: "ip2" top: "prob"}

TensorFlow

x    = tf.placeholder(tf.float32, [None, 28, 28, 1])
wc1  = tf.Variable(tf.random_normal([5, 5, 1, 32]))
wc2  = tf.Variable(tf.random_normal([5, 5, 32, 64]))
wd1  = tf.Variable(tf.random_normal([7*7*64, 1024]))
wout = tf.Variable(tf.random_normal([1024, n_classes]))
bc1  = tf.Variable(tf.random_normal([32]))
bc2  = tf.Variable(tf.random_normal([64]))
bd1  = tf.Variable(tf.random_normal([1024]))
bout = tf.Variable(tf.random_normal([n_classes]))

conv1 = conv2d(x, wc1, bc1)
conv1 = maxpool2d(conv1, k=2)
conv1 = tf.nn.relu(conv1)
conv2 = conv2d(conv1, wc2, bc2)
conv2 = maxpool2d(conv2, k=2)
conv2 = tf.nn.relu(conv2)
fc1 = tf.reshape(conv2, [-1, wd1.get_shape().as_list()[0]])
fc1 = tf.add(tf.matmul(fc1, wd1), bd1)
fc1 = tf.nn.relu(fc1)
fc1 = tf.nn.dropout(fc1, dropout)
out = tf.add(tf.matmul(fc1, wout), bout)

Keras

model = Sequential([
    Convolution2D(32, 5, 5, input_shape=(28, 28, 1)),
    MaxPooling2D(pool_size=2),
    Activation('relu'),
    Convolution2D(64, 5, 5),
    MaxPooling2D(pool_size=2),
    Activation('relu'),
    Flatten(),
    Dense(1024),
    Dropout(0.5),
    Dense(10),
])

tiny-dnn

network<sequential> net;
net << conv<>(28, 28, 5, 1, 32)
    << max_pool<relu>(24, 24, 32, 2)
    << conv<>(12, 12, 5, 32, 64)
    << max_pool<relu>(8, 8, 64, 2)
    << fc<relu>(4 * 4 * 64, 1024)
    << dropout(1024, 0.5f)
    << fc<>(1024, 10);
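To put the tiny-dnn version in context, training this network on MNIST could look roughly like the following. The parse_mnist_* helpers and the train<mse> overload are taken from tiny-dnn's MNIST example; file names and exact signatures are assumptions to verify against your version.

// load MNIST, scaling pixel values to [-1, 1] with no padding
std::vector<label_t> train_labels;
std::vector<vec_t>   train_images;
parse_mnist_labels("train-labels.idx1-ubyte", &train_labels);
parse_mnist_images("train-images.idx3-ubyte", &train_images,
                   -1.0, 1.0, 0, 0);

// train with a simple optimizer; class labels are passed directly
adagrad opt;
net.train<mse>(opt, train_images, train_labels,
               32 /*batch_size*/, 10 /*epochs*/);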

1. Easy to introduce

2. Simple syntax

3. Extensible backends

3. Extensible backends

Common scenario 1:

“We have a good GPU machine to train networks, but we need to deploy the trained model to mobile devices”

Common scenario 2:

“We need to write platform-specific code to get production-level performance... but it’s painful to understand the whole framework”

3. Extensible backends

- Some performance-critical layers have a backend engine

Layer API, with pluggable backend engines:

- backend::internal - pure C++ code
- backend::avx - AVX-optimized code
- backend::nnpack - x86/ARM
- backend::opencl - GPU
- ...

(backends other than internal are optional)

3. Extensible backends

// select an engine explicitly
net << conv<>(28, 28, 5, 1, 32, backend::avx) << ...;

// switch them seamlessly
net[0]->set_backend_type(backend::opencl);
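One practical way to use this is to make the engine a parameter of the model-building code, so the same network definition can target different backends on the training box and on the device. The sketch below only uses the identifiers shown above; the backend_t type name is an assumption, and the avx/opencl engines are only available when tiny-dnn is built with them.

network<sequential> make_lenet(backend_t engine) {
  network<sequential> net;
  net << conv<>(28, 28, 5, 1, 32, engine)
      << max_pool<relu>(24, 24, 32, 2)
      << conv<>(12, 12, 5, 32, 64, engine)
      << max_pool<relu>(8, 8, 64, 2)
      << fc<relu>(4 * 4 * 64, 1024)
      << dropout(1024, 0.5f)
      << fc<>(1024, 10);
  return net;
}

// e.g. make_lenet(backend::avx) on the training machine,
//      make_lenet(backend::internal) on a mobile target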

Basic functionality

- Model serialization (binary/JSON; see the sketch after this list)
- Regression training
- Basic image processing
- Layer freezing
- Graph visualization
- Multi-thread execution
- Double precision support

Extra modules (require 3rd-party libraries)

- Caffe importer (requires protobuf)
- OpenMP support
- Intel TBB support
- NNPACK backend (same as Caffe2)
- libdnn backend (same as caffe-opencl)
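As an example of the serialization feature, saving a trained model and restoring it elsewhere might look like this. The save()/load()/predict_label() calls follow tiny-dnn's serialization docs; exact overloads and file formats vary by version, and some_image stands in for a vec_t prepared the same way as the training data.

// save architecture + weights (binary by default; JSON is also supported)
net.save("lenet-model");

// later, or on another machine/device: restore into a fresh network object
network<sequential> deployed;
deployed.load("lenet-model");

// run inference on a single input
auto label = deployed.predict_label(some_image);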

Future plans

- GPU integration
  - GPU backend is still experimental
  - cuDNN backend

- More mobile-oriented
  - iOS/Android examples
  - Quantized operations for less RAM

- TensorFlow importer
- Performance profiling tools
- OpenVX support

We need your help!

For users

User chat for Q&A:

https://gitter.im/tiny-dnn

Official documentation:

http://tiny-dnn.readthedocs.io/en/latest/

For developers

Join our developer chat:

https://gitter.im/tiny-dnn/developers

or

Check out the docs and the issues marked as “contributions welcome”:

https://github.com/tiny-dnn/tiny-dnn/blob/master/docs/developer_guides/How-to-contribute.md
https://github.com/tiny-dnn/tiny-dnn/labels/contributions%20welcome

code: github.com/tiny-dnn/tiny-dnn
slides: https://goo.gl/Se2rzu