Deep learning with C++ - an introduction to tiny-dnn

Posted on 21-Apr-2017



Deep learning with C++: an introduction to tiny-dnn

by Taiga Nomi, embedded software engineer, Osaka, Japan

deep learning

Icons made by Freepik from www.flaticon.com are licensed under CC BY 3.0

Facial recognition

Image understanding

Finance

Game playing

Translation

Robotics

Drug discovery

Text recognition

Video processing

Text generation

Deep learning

- Learning a complicated function from a large amount of data

- Composed of trainable, simple mathematical functions

[Diagram] Input (text, audio, image, video, ...) -> Trainable Building Blocks -> Output (text, audio, image, video, ...)

deep learning framework

A modern deep learning framework for C++ programmers

1,400 stars / 500 forks / 35 contributors / 100 clones per day

Google Summer of Code projects built around tiny-dnn:

“A Modern Deep Learning module” by Edgar Riba

“Deep Learning with Quantization for Semantic Saliency Detection” by Yida Wang

https://summerofcode.withgoogle.com/archive/

1. Easy to introduce

2. Simple syntax

3. Extensible backends

1. Easy to introduce

- Just put the following line into your .cpp file (a minimal end-to-end sketch follows below)

tiny-dnn is header-only - no installation

tiny-dnn is dependency-free - No prerequisites

#include <tiny_dnn/tiny_dnn.h>
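As a minimal sketch of what that buys you, the following toy program builds and trains a tiny network on XOR-style data. It assumes the tiny-dnn 1.0-style API (activation as a template parameter, fit<mse>, adagrad); exact names can differ between tiny-dnn versions, so treat this as illustrative rather than canonical.

#include <iostream>
#include <vector>
#include <tiny_dnn/tiny_dnn.h>

using namespace tiny_dnn;
using namespace tiny_dnn::activation;

int main() {
  // a tiny 2-3-1 network on XOR-style toy data
  network<sequential> net;
  net << fully_connected_layer<tan_h>(2, 3)
      << fully_connected_layer<tan_h>(3, 1);

  std::vector<vec_t> X = { {0, 0}, {0, 1}, {1, 0}, {1, 1} };
  std::vector<vec_t> y = { {0},    {1},    {1},    {0}    };

  adagrad opt;
  net.fit<mse>(opt, X, y, 4 /*batch_size*/, 1000 /*epochs*/);

  // after training, the prediction for {0, 1} should move toward 1
  std::cout << net.predict(X[1])[0] << std::endl;
  return 0;
}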

1. Easy to introduce

- You can bring deep learning to any target for which you have a C++ compiler

- Officially supported (by CI builds):

- Windows (msvc2013 32/64bit, msvc2015 32/64bit)

- Linux (gcc4.9, clang3.5)

- OSX (LLVM 7.3)

- tiny-dnn might run on other compilers that support C++11

1. Easy to introduce

- A Caffe model converter is also available

- TensorFlow converter - coming soon!

- Close the gap between researchers and engineers (a usage sketch follows below)
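For illustration, importing a converted Caffe model might look roughly like this. The header path and function names follow the Caffe importer example in tiny-dnn's documentation; treat them as assumptions and check tiny_dnn/io in your tiny-dnn version (protobuf is required).

#include <tiny_dnn/tiny_dnn.h>
#include <tiny_dnn/io/caffe/layer_factory.h>

// build the network topology from a Caffe prototxt, then load the
// trained weights from the corresponding .caffemodel file
auto net = create_net_from_caffe_prototxt("lenet.prototxt");
reload_weight_from_caffe_protobinary("lenet.caffemodel", net.get());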

1. Easy to introduce

2. Simple syntax

3. Extensible backends

2. Simple syntax

Example: Multi-layer perceptron

Caffe prototxt

input: "data"input_shape { dim: 1 dim: 1 dim: 1 dim: 20}layer { name: "ip1" type: "InnerProduct" inner_product_param { num_output: 100 } bottom: "ip1" top: "ip2"}layer { name: "a1" type: "TanH" bottom: "ip1" top: "ip1"}layer { name: "ip2" type: "InnerProduct" inner_product_param { num_output: 10 } bottom: "ip1" top: "out"}layer { name: "a1" type: "TanH" bottom: "out" top: "out"}

TensorFlow

w1 = tf.Variable(tf.random_normal([10, 100]))
w2 = tf.Variable(tf.random_normal([100, 20]))
b1 = tf.Variable(tf.random_normal([100]))
b2 = tf.Variable(tf.random_normal([20]))

layer1 = tf.add(tf.matmul(x, w1), b1)
layer1 = tf.nn.relu(layer1)
layer2 = tf.add(tf.matmul(layer1, w2), b2)
layer2 = tf.nn.relu(layer2)

Keras

model = Sequential([
    Dense(100, input_dim=10),
    Activation('relu'),
    Dense(20),
    Activation('relu'),
])

tiny-dnn

network<sequential> net;
net << dense<relu>(10, 100) << dense<relu>(100, 20);

tiny-dnn, another solution

auto net = make_mlp<relu>({10, 100, 20});

- Modern C++ lets us keep the code simple: type inference and initializer lists (see the sketch below)
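As an illustration of how an initializer list enables the one-liner above, a make_mlp-style factory could be sketched roughly as follows. This is a simplified, hypothetical implementation for explanation only, not tiny-dnn's actual code, and it reuses the 1.0-style fully_connected_layer<Activation> template shown earlier.

#include <initializer_list>
#include <iterator>

template <typename Activation>
network<sequential> make_mlp_sketch(std::initializer_list<size_t> units) {
  network<sequential> net;
  if (units.size() < 2) return net;        // nothing to connect
  auto in = units.begin();
  for (auto out = std::next(in); out != units.end(); ++in, ++out) {
    // connect consecutive sizes with fully connected layers
    net << fully_connected_layer<Activation>(*in, *out);
  }
  return net;
}

// usage, with the type deduced by auto:
// auto net = make_mlp_sketch<relu>({10, 100, 20});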

2. Simple syntax

Example: Convolutional Neural Networks

Caffe prototxt

name: "LeNet"layer { name: "data" type: "Input" top: "data" input_param { shape: { dim: 64 dim: 1 dim: 28 dim: 28 } }}layer { name: "conv1" type: "Convolution" bottom: "data" top: "conv1" param { lr_mult: 1 } param { lr_mult: 2 } convolution_param { num_output: 20 kernel_size: 5 stride: 1 weight_filler { type: "xavier" } bias_filler { type: "constant" } }}layer { name: "pool1" type: "Pooling" bottom: "conv1" top: "pool1" pooling_param { pool: MAX kernel_size: 2 stride: 2 }}layer { name: "conv2" type: "Convolution" bottom: "pool1" top: "conv2" param { lr_mult: 1 } param { lr_mult: 2 } convolution_param { num_output: 50 kernel_size: 5 stride: 1 weight_filler { type: "xavier" } bias_filler { type: "constant" } }}layer { name: "pool2" type: "Pooling" bottom: "conv2" top: "pool2" pooling_param { pool: MAX kernel_size: 2 stride: 2 }}layer { name: "ip1" type: "InnerProduct" bottom: "pool2" top: "ip1" param { lr_mult: 1 } param { lr_mult: 2 } inner_product_param { num_output: 500 weight_filler { type: "xavier" } bias_filler { type: "constant" } }}layer { name: "relu1" type: "ReLU" bottom: "ip1" top: "ip1"}layer { name: "ip2" type: "InnerProduct" bottom: "ip1" top: "ip2" param { lr_mult: 1 } param { lr_mult: 2 } inner_product_param { num_output: 10 weight_filler { type: "xavier" } bias_filler { type: "constant" } }}layer { name: "prob" type: "Softmax" bottom: "ip2" top: "prob"}

}}layer { name: "conv2" type: "Convolution" bottom: "pool1" top: "conv2" param { lr_mult: 1 } param { lr_mult: 2 } convolution_param { num_output: 50 kernel_size: 5 stride: 1 weight_filler { type: "xavier" } bias_filler { type: "constant" } }}layer { name: "pool2" type: "Pooling" bottom: "conv2" top: "pool2" pooling_param { pool: MAX kernel_size: 2 stride: 2 }}layer { name: "ip1" type: "InnerProduct" bottom: "pool2" top: "ip1" param { lr_mult: 1 } param { lr_mult: 2 } inner_product_param { num_output: 500 weight_filler { type: "xavier" } bias_filler { type: "constant" } }}layer { name: "relu1" type: "ReLU" bottom: "ip1" top: "ip1"}layer { name: "ip2" type: "InnerProduct" bottom: "ip1" top: "ip2" param { lr_mult: 1 } param { lr_mult: 2 } inner_product_param { num_output: 10 weight_filler { type: "xavier" } bias_filler { type: "constant" } }}layer { name: "prob" type: "Softmax" bottom: "ip2" top: "prob"}

param { lr_mult: 1 } param { lr_mult: 2 } inner_product_param { num_output: 500 weight_filler { type: "xavier" } bias_filler { type: "constant" } }}layer { name: "relu1" type: "ReLU" bottom: "ip1" top: "ip1"}layer { name: "ip2" type: "InnerProduct" bottom: "ip1" top: "ip2" param { lr_mult: 1 } param { lr_mult: 2 } inner_product_param { num_output: 10 weight_filler { type: "xavier" } bias_filler { type: "constant" } }}layer { name: "prob" type: "Softmax" bottom: "ip2" top: "prob"}

}}layer { name: "relu1" type: "ReLU" bottom: "ip1" top: "ip1"}layer { name: "ip2" type: "InnerProduct" bottom: "ip1" top: "ip2" param { lr_mult: 1 } param { lr_mult: 2 } inner_product_param { num_output: 10 weight_filler { type: "xavier" } bias_filler { type: "constant" } }}layer { name: "prob" type: "Softmax" bottom: "ip2" top: "prob"}

TensorFlow

x    = tf.placeholder(tf.float32, [None, 28, 28, 1])
wc1  = tf.Variable(tf.random_normal([5, 5, 1, 32]))
wc2  = tf.Variable(tf.random_normal([5, 5, 32, 64]))
wd1  = tf.Variable(tf.random_normal([7*7*64, 1024]))
wout = tf.Variable(tf.random_normal([1024, n_classes]))
bc1  = tf.Variable(tf.random_normal([32]))
bc2  = tf.Variable(tf.random_normal([64]))
bd1  = tf.Variable(tf.random_normal([1024]))
bout = tf.Variable(tf.random_normal([n_classes]))

conv1 = conv2d(x, wc1, bc1)
conv1 = maxpool2d(conv1, k=2)
conv1 = tf.nn.relu(conv1)
conv2 = conv2d(conv1, wc2, bc2)
conv2 = maxpool2d(conv2, k=2)
conv2 = tf.nn.relu(conv2)
fc1 = tf.reshape(conv2, [-1, wd1.get_shape().as_list()[0]])
fc1 = tf.add(tf.matmul(fc1, wd1), bd1)
fc1 = tf.nn.relu(fc1)
fc1 = tf.nn.dropout(fc1, dropout)
out = tf.add(tf.matmul(fc1, wout), bout)

Keras

model = Sequential([
    Convolution2D(32, 5, 5, input_shape=(28, 28, 1)),
    MaxPooling2D(pool_size=2),
    Activation('relu'),
    Convolution2D(64, 5, 5),
    MaxPooling2D(pool_size=2),
    Activation('relu'),
    Flatten(),
    Dense(1024),
    Dropout(0.5),
    Dense(10),
])

tiny-dnn

network<sequential> net;
net << conv<>(28, 28, 5, 1, 32)
    << max_pool<relu>(24, 24, 32, 2)
    << conv<>(12, 12, 5, 32, 64)
    << max_pool<relu>(8, 8, 64, 2)
    << fc<relu>(4 * 4 * 64, 1024)
    << dropout(1024, 0.5f)
    << fc<>(1024, 10);
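To put the tiny-dnn version in context, training this network on MNIST could look roughly like the following. The parse_mnist_* helpers and the train<mse> overload are taken from tiny-dnn's MNIST example; file names and exact signatures are assumptions to verify against your version.

// load MNIST, scaling pixel values to [-1, 1] with no padding
std::vector<label_t> train_labels;
std::vector<vec_t>   train_images;
parse_mnist_labels("train-labels.idx1-ubyte", &train_labels);
parse_mnist_images("train-images.idx3-ubyte", &train_images,
                   -1.0, 1.0, 0, 0);

// train with a simple optimizer; class labels are passed directly
adagrad opt;
net.train<mse>(opt, train_images, train_labels,
               32 /*batch_size*/, 10 /*epochs*/);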

1. Easy to introduce

2. Simple syntax

3. Extensible backends

3. Extensible backends

Common scenario 1:

“We have a good GPU machine to train networks, but we need to deploy the trained model to mobile devices”

Common scenario 2:

“We need to write platform-specific code to get production-level performance... but it’s painful to understand the whole framework”

3. Extensible backends

- Some performance-critical layers have a backend engine

Layer API, with pluggable backend engines:

- backend::internal - pure C++ code
- backend::avx - AVX-optimized code
- backend::nnpack - x86/ARM
- backend::opencl - GPU
- ...

(backends other than internal are optional)

3. Extensible backends

// select an engine explicitly
net << conv<>(28, 28, 5, 1, 32, backend::avx) << ...;

// switch them seamlessly
net[0]->set_backend_type(backend::opencl);
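One practical way to use this is to make the engine a parameter of the model-building code, so the same network definition can target different backends on the training box and on the device. The sketch below only uses the identifiers shown above; the backend_t type name is an assumption, and the avx/opencl engines are only available when tiny-dnn is built with them.

network<sequential> make_lenet(backend_t engine) {
  network<sequential> net;
  net << conv<>(28, 28, 5, 1, 32, engine)
      << max_pool<relu>(24, 24, 32, 2)
      << conv<>(12, 12, 5, 32, 64, engine)
      << max_pool<relu>(8, 8, 64, 2)
      << fc<relu>(4 * 4 * 64, 1024)
      << dropout(1024, 0.5f)
      << fc<>(1024, 10);
  return net;
}

// e.g. make_lenet(backend::avx) on the training machine,
//      make_lenet(backend::internal) on a mobile target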

Basic functionality

- Model serialization (binary/JSON; see the sketch after this list)
- Regression training
- Basic image processing
- Layer freezing
- Graph visualization
- Multi-thread execution
- Double precision support

Extra modules (require 3rd-party libraries)

- Caffe importer (requires protobuf)
- OpenMP support
- Intel TBB support
- NNPACK backend (same as Caffe2)
- libdnn backend (same as caffe-opencl)
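As an example of the serialization feature, saving a trained model and restoring it elsewhere might look like this. The save()/load()/predict_label() calls follow tiny-dnn's serialization docs; exact overloads and file formats vary by version, and some_image stands in for a vec_t prepared the same way as the training data.

// save architecture + weights (binary by default; JSON is also supported)
net.save("lenet-model");

// later, or on another machine/device: restore into a fresh network object
network<sequential> deployed;
deployed.load("lenet-model");

// run inference on a single input
auto label = deployed.predict_label(some_image);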

Future plans

- GPU integration
  - GPU backend is still experimental
  - cuDNN backend

- More mobile-oriented
  - iOS/Android examples
  - Quantized operations for less RAM

- TensorFlow importer
- Performance profiling tools
- OpenVX support

We need your help!

For users

User chat for Q&A:

https://gitter.im/tiny-dnn

Official documentation:

http://tiny-dnn.readthedocs.io/en/latest/

For developers

Join our developer chat:

https://gitter.im/tiny-dnn/developers

or

Check out the docs and the issues marked as “contributions welcome”:

https://github.com/tiny-dnn/tiny-dnn/blob/master/docs/developer_guides/How-to-contribute.md
https://github.com/tiny-dnn/tiny-dnn/labels/contributions%20welcome

code: github.com/tiny-dnn/tiny-dnn
slides: https://goo.gl/Se2rzu