Kneron & AI SoC (KDP520 NPU) · After graduation, he worked in world class ... World-class R&D...

27
Kneron & AI SoC (KDP520 NPU) Sep. 2018

Transcript of Kneron & AI SoC (KDP520 NPU) · After graduation, he worked in world class ... World-class R&D...

Page 1: Kneron & AI SoC (KDP520 NPU) · After graduation, he worked in world class ... World-class R&D team, who graduated from UCLA, MIT, Cornell, NTHU, NTU and so on, and worked in technology

Kneron &

AI SoC (KDP520 NPU)

Sep. 2018

Page 2: Kneron & AI SoC (KDP520 NPU) · After graduation, he worked in world class ... World-class R&D team, who graduated from UCLA, MIT, Cornell, NTHU, NTU and so on, and worked in technology

Company Profile

Page 3: Kneron & AI SoC (KDP520 NPU) · After graduation, he worked in world class ... World-class R&D team, who graduated from UCLA, MIT, Cornell, NTHU, NTU and so on, and worked in technology

Kneron, established in San Diego in 2015, is a leading provider of edge AI solutions.

We focus on designing integrated software and hardware to enable edge side AI

solutions for smart home, smart surveillance, smartphone, robot, drone, and IoT

devices

A Global AI Company

Kneron Company Profile

Page 4: Kneron & AI SoC (KDP520 NPU) · After graduation, he worked in world class ... World-class R&D team, who graduated from UCLA, MIT, Cornell, NTHU, NTU and so on, and worked in technology

Founder and Core Team

Founder & CEO, Albert Liu

Albert Liu studied at UC Berkeley/UCLA/UCSD joint

research master and doctoral degree, and got UCLA EE

doctoral degree. After graduation, he worked in world class

enterprises including Qualcomm, Samsung, MStar, and

Wireless Info.

Team

World-class R&D team, who graduated from UCLA, MIT,

Cornell, NTHU, NTU and so on, and worked in technology

leading companies including Google, Microsoft, Bell

Laboratories, IBM, and Intel. This team accumulated more

than 10 years knowledge in AI, computer vision, and image

processing.

Page 5: Kneron & AI SoC (KDP520 NPU) · After graduation, he worked in world class ... World-class R&D team, who graduated from UCLA, MIT, Cornell, NTHU, NTU and so on, and worked in technology

Academic Contribution

Published Papers Publication Difficulty

A 16-Gb/s 14.7-mW Tri-Band Cognitive Serial Link Transmitter With Forwarded Clock to Enable PAM-16/256-QAM and Channel Response Detection

IEEE Journal of Solid State Circuits 1

A 2.3mW 11cm Range Bootstrapped and Correlated Double Sampling (BCDS) 3D Touch Sensor for Mobile Devices IEEE International Solid-State Circuits Conference 2

A Reconfigurable Streaming Deep Convolutional Neural Network Accelerator for Internet of Things IEEE Transactions on Circuits and Systems I 4

Invited — A 2.2 GHz SRAM with high temperature variation immunity for deep learning application under 28nm IEEE/ACM Design Automation Conference, June, 2016 5

Invited - Airtouch: A Novel Single Layer 3D Touch Sensing System for Human/Mobile Device Interactions IEEE/ACM Design Automation Conference, June, 2016 5

A LEGO-Like Architecture for Scalable Deep Learning Accelerator IEEE/ACM Design Automation Conference, June, 2017 5

A Novel Fully Synthesizable All-Digital RF Transmitter for IoT ApplicationsIEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems

10

◼ Patent:Original AI algorithm, neural network software, hardware core design, applied ~20 patents in 4 countries

◼ Excellent research team: Published research result in multiple international top raking conference, including ISSCC in which only 19 China papers are approved

Page 6: Kneron & AI SoC (KDP520 NPU) · After graduation, he worked in world class ... World-class R&D team, who graduated from UCLA, MIT, Cornell, NTHU, NTU and so on, and worked in technology

Partner and Customer

Page 7: Kneron & AI SoC (KDP520 NPU) · After graduation, he worked in world class ... World-class R&D team, who graduated from UCLA, MIT, Cornell, NTHU, NTU and so on, and worked in technology

Core team: top engineers from Qualcomm, Intel, Bell Labs and more

Albert LiuPhD, Founder & CEOUCLA EE PhD,IEEE senior member

Previously worked in Samsung on Galaxy

development, in Qualcomm on AI product, in Bell Lab

and NASA JPL;Numerous publications and patents

in machine learning.

Roger LiuPhD, COOStony Brook University EE PhD

7 patents, 20 years experience, previously

worked in Sentelic Corp as VP of engineering, in

Milkywaysilicon Technology as COO / CTO

Frank Chang PhD, Co-FounderNational Chiao-Tung University PhD

UCLA EE Professor and the Chairman. National

Academy of Inventors Fellow. National Academy of

Engineering Fellow. Elected as the 11th president

of National Chiao-Tung University in Hsinchu,

Taiwan in 2014. Awarded the IET JJ Thomson

Medal in.

Stony Brook University EE PhD

70+ patents, 20+ years experience, previously

worked in Qualcomm as Director on MM system,

architecture and algorithm design, in Spreadtrum,

HuaWei, Vivo as AVP of MM system engineering.

HsiangTsun LiPhD, CSO

Page 8: Kneron & AI SoC (KDP520 NPU) · After graduation, he worked in world class ... World-class R&D team, who graduated from UCLA, MIT, Cornell, NTHU, NTU and so on, and worked in technology

Edge AI

Page 9: Kneron & AI SoC (KDP520 NPU) · After graduation, he worked in world class ... World-class R&D team, who graduated from UCLA, MIT, Cornell, NTHU, NTU and so on, and worked in technology

Benefits of edge AI

Consumers:Maintain privacy

Response in real-time

Chip & device makers:

Save bandwidth

Increase stability

Scalable business model

Predictable long-term cost

Page 10: Kneron & AI SoC (KDP520 NPU) · After graduation, he worked in world class ... World-class R&D team, who graduated from UCLA, MIT, Cornell, NTHU, NTU and so on, and worked in technology

NPU enables AI on edge devicesMake possible for power consumption, throughput, and cost

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20

1000

100

10

1

0.1

0.01

Processor Number(sorted by efficiency)

ener

gy

eff

icie

ncy

(MO

PS

/Mw

)

More efficient

More flexible

Microprocessors

DSPs

Dedicated HW

100x

10x

Source: National Science Foundation (NSF)

Page 11: Kneron & AI SoC (KDP520 NPU) · After graduation, he worked in world class ... World-class R&D team, who graduated from UCLA, MIT, Cornell, NTHU, NTU and so on, and worked in technology

Target segments

Smart Phone Smart Home Surveillance

Page 12: Kneron & AI SoC (KDP520 NPU) · After graduation, he worked in world class ... World-class R&D team, who graduated from UCLA, MIT, Cornell, NTHU, NTU and so on, and worked in technology

Core technology

Page 13: Kneron & AI SoC (KDP520 NPU) · After graduation, he worked in world class ... World-class R&D team, who graduated from UCLA, MIT, Cornell, NTHU, NTU and so on, and worked in technology

Kneron AI Turkey SolutionOne stop service aiming for fastest time to market

• Mainstream CNN networks• 2D /3D Vision applications• Audio applications

• Lossless / lossy compression• Jointly work with hardware• Reduce model size and computation cost

• Smart Phone KDP3xx series• Smart Home KDP5xx series• Surveillance KDP7xx series

Commercialized CNN Model

CNN Accelerator (NPU) IP

Model Compression TechOriginal

Lossy

Lossless

Compressed

Page 14: Kneron & AI SoC (KDP520 NPU) · After graduation, he worked in world class ... World-class R&D team, who graduated from UCLA, MIT, Cornell, NTHU, NTU and so on, and worked in technology

AI SoC (KDP520 NPU)

Page 15: Kneron & AI SoC (KDP520 NPU) · After graduation, he worked in world class ... World-class R&D team, who graduated from UCLA, MIT, Cornell, NTHU, NTU and so on, and worked in technology

• KDP520 NPU (400MHz)

• Dual Cortex M4 (200MHz for sys & 250MHz for AI)

• 512KB SRAM

• 32MB / 64MB DRAM

• Process: 40nm

• Die size: 4000 x 4000

• Power: 200mW

AI SoC high level block

KDP520 NPU

Cortex M4(AI co-processor)

Cortex M4(System control)

OTPSRAM

USB 2.0

MIPI CSI RX

MIPI CSI TX

MIPI DSI TX

DVP VI

DVP VO ADC

I2S

I2C

UART

SPI

Page 16: Kneron & AI SoC (KDP520 NPU) · After graduation, he worked in world class ... World-class R&D team, who graduated from UCLA, MIT, Cornell, NTHU, NTU and so on, and worked in technology

KDP520 Spec and PPA

IP Version KDP520

Specifiction

16-bit MAC 144

8-bit MAC 576

Arithmetic Precision 8/16-bit fixed-point

Support CNN layer kernel size 1 to 9

Support Channel/Feature Number up to 1024

Support Max pooling/Average pooling Function with kernel size 2,3

SD & Frame Buffer NA

PPA

Process Node UMC 40nm LP

Maximum Clock Frequency (MHz) 400

NPU SRAM Size 512K

Operating Logic Power (mW)200

Operating SRAM Power (mW)

Page 17: Kneron & AI SoC (KDP520 NPU) · After graduation, he worked in world class ... World-class R&D team, who graduated from UCLA, MIT, Cornell, NTHU, NTU and so on, and worked in technology

Computing MAC Efficiency

⚫ The computing MAC efficiency is evaluated as data bandwidth is high enough and SRAM size is large enough

⚫ ResNet50 (~3x gain)• Ours: 73%• Others: 23.16%

⚫ GoogLeNet(~1.7x gain)• Ours: 74%• Others: 43.19%

⚫ Reach 70-90% compute efficiency across most networks

Model Input Size MAC # (x 109) in Model

Total MAC # (x 109) in IP (including idle MAC)

Computing MAC Efficiency (Avg. vs. Peak Throughput)

VGG16 224x224 15.5 16.2 96%

GoogLeNet 224x224 1.59 2.13 74%

ResNet 34 224x224 3.67 4.55 81%

ResNet 50 224x224 3.87 5.28 73%

ResNet 101 224x224 7.59 9.78 78%

ResNet 152 224x224 11.3 14.3 79%

ResNext 224x224 691 849 81%

MobileNet V1 224x224 0.57 0.73 77%

MobileNet V2 224x224 0.28 0.37 76%

MobileNet + SSD

640x480 3.85 4.85 79%

Yolo V2 640x480 26.2 27.8 94%

Tiny Yolo 416x416 3.49 4.15 84%

DenseNet 224x224 13.9 14.2 98%

Page 18: Kneron & AI SoC (KDP520 NPU) · After graduation, he worked in world class ... World-class R&D team, who graduated from UCLA, MIT, Cornell, NTHU, NTU and so on, and worked in technology

Applications

Page 19: Kneron & AI SoC (KDP520 NPU) · After graduation, he worked in world class ... World-class R&D team, who graduated from UCLA, MIT, Cornell, NTHU, NTU and so on, and worked in technology

Applications

01

02

03

Low cost 3D face recognition(with regular RGB + NIR camera)

3D sensing & 3D recognition(with 3rd party 3D camera module)

Image & voice recognition

04 AI companion chip

Page 20: Kneron & AI SoC (KDP520 NPU) · After graduation, he worked in world class ... World-class R&D team, who graduated from UCLA, MIT, Cornell, NTHU, NTU and so on, and worked in technology

1-1. 3D sensing(with 3rd party 3D camera module)

KDP520 NPU

Cortex M4

Cortex M4

OTPSRAM

Active dual cam module

To display

Stereo camera module

FeaturesBody / object detection

Distance / depth detection

ScenariosRobot, drone, AR, Door lock

Page 21: Kneron & AI SoC (KDP520 NPU) · After graduation, he worked in world class ... World-class R&D team, who graduated from UCLA, MIT, Cornell, NTHU, NTU and so on, and worked in technology

1-2. 3D recognition(with 3rd party 3D camera module)

KDP520 NPU

Cortex M4

Cortex M4

OTPSRAM

Dot projector, NIR LED

Depth decoder

Structured light module

NIR camera

FeaturesFace detection

3D face recognition

Liveness detection

ScenariosSmartphone, door lock, IoT devices

Page 22: Kneron & AI SoC (KDP520 NPU) · After graduation, he worked in world class ... World-class R&D team, who graduated from UCLA, MIT, Cornell, NTHU, NTU and so on, and worked in technology

2. Low cost 3D face recognition(with regular camera module)

KDP520 NPU

Cortex M4

Cortex M4

OTPSRAM

NIR camera & LED

RGB camera(with ISP)

FeaturesFace detection

3D face recognition

Liveness detection

ScenariosDoor lock

Door bell

Smartphone

To display

KDP520 NPU

Cortex M4

Cortex M4

OTPSRAM

NIR camera & LED

2nd NIR camera

Case 2

Case 1

Page 23: Kneron & AI SoC (KDP520 NPU) · After graduation, he worked in world class ... World-class R&D team, who graduated from UCLA, MIT, Cornell, NTHU, NTU and so on, and worked in technology

3. Image & voice recognition

KDP520 NPU

Cortex M4

Cortex M4

OTPSRAM

Features2D Face detection / recognition

Landmark / body / gesture / object detection

Voice wakeup

Keyword recognition

Voiceprint recognition

ScenariosWeb camera / Smart speaker / Home appliance

RGB camera(with ISP)

Microphone array

Page 24: Kneron & AI SoC (KDP520 NPU) · After graduation, he worked in world class ... World-class R&D team, who graduated from UCLA, MIT, Cornell, NTHU, NTU and so on, and worked in technology

Caffe Keras Tensorflow

Frontend

Converter

ONNX

Graph Optimizer

Operation Optimizer

Kneron IP Control Generator

weight.bin Cmd.bin Setup.bin

⚫ Translate models of various framework to Kneron IP’s

corresponding instructions, weight file, and data flow controls

⚫ Front-end converter

⚫ Compiler

• Input

• ONNX model

• Output

• command.bin - command for input model

• setup.bin - firmware management variable

• weight.bin - weight from input model

• Model inference in hardware based on compiler

outputs

4. Compiler flow for AI companion chip

Page 25: Kneron & AI SoC (KDP520 NPU) · After graduation, he worked in world class ... World-class R&D team, who graduated from UCLA, MIT, Cornell, NTHU, NTU and so on, and worked in technology

SoC Roadmap

Page 26: Kneron & AI SoC (KDP520 NPU) · After graduation, he worked in world class ... World-class R&D team, who graduated from UCLA, MIT, Cornell, NTHU, NTU and so on, and worked in technology

2019 2020

KDP 520

• Image & voice recog

• Low power

• Engineer sample in Q2 2019

Image & Voice AI sensing

KDP 530

• Video processing & recog

(ISP, video encoder)

• Audio processing & recog

(mic array processing)

Integrated AI sensing hub

KDP 720

• Heterogeneous structure

• HW de-compression

• Engineer sample in Q3 2019

AI surveillance on edge

KDP 730

• Cascade structure

• Ultimate computing power

Edge AI server hub

KDP 330

AI sensing for wearable

• Small die size

• Ultra low power (5~10mW)

Page 27: Kneron & AI SoC (KDP520 NPU) · After graduation, he worked in world class ... World-class R&D team, who graduated from UCLA, MIT, Cornell, NTHU, NTU and so on, and worked in technology

Thank

You AI Everywhere

www.kneron.com