Tensor what? An introduction to AI on mobile

Luke Sleeman - Freelance Android developer

http://lukesleeman.com [email protected]

@LukeSleeman

Good morning, my name is Luke Sleeman, I’m a freelance Android developer. Today, I will be giving my presentation titled “Tensor what? An introduction to AI on mobile”

Agenda

• Intro - History of AI, recent breakthroughs, why AI on mobile is important

• Part 1 - Key AI tech - Neural networks, image recognition, voice recognition, and how Siri works

• Part 2 - TensorFlow - What TensorFlow does, Server side use, embedding TensorFlow in a mobile app, models built with TensorFlow

• Part 3 - Demo - We recognise a banana with my phone!

• Closing thoughts - Strong AI, Why all this is important.

• Questions

AI can be an extremely dense subject - very hard to get a grip on. Hell - we don't even know what to call it. AI? Machine Learning? Deep Learning? Machine Intelligence?

My background is Android dev - I am definitely no expert in the field. Really just an interested observer. We are going to cover a lot of broad ground today, but not in a lot of detail. Coming out of this presentation today, I want you to understand:

* What it is
* When you would want to use it
* … and if you do want to use it, come away with enough information so you know what to start googling for.

My presentation will have 3 main parts

Intro - A bit of history

We are going to start today by actually looking back at the past. I hope you can forgive me for the digression, but I wanted to talk about my own personal AI journey and why I am interested in the field.

This is a picture of my very first computer - the Panasonic JB3000! It runs DOS version 1 and uses 8 1/4” floppies!

I got the computer when I was 8 years old and it came with two programs: WordStar for word processing, and BASIC for programming. I wasn’t that excited by word processing, but 8-year-old me spent a lot of time learning how to program.

The other thing I spent a lot of time doing was reading sci-fi - one of my favourites was this book! 2001: A Space Odyssey.

I’m not sure how many of you have read the book, or seen the film, but one of the coolest things in it is they have this computer that controls the spaceship, called HAL 9000.

Hal is what most people think about when they hear ‘AI’

Hal 9000 is a great example of “Strong AI”. Hal can:

• Reason (use strategy, solve puzzles, and make judgments under uncertainty)

• Represent knowledge, including commonsense knowledge

• Plan

• Learn

• Communicate in natural language

• Integrate all these skills towards common goals

• Kill All Humans

He is a really good example of what we call a “general purpose” or “Strong” AI. Hal can ….


https://en.wikipedia.org/wiki/Artificial_general_intelligence

Reason


Represent knowledge, including commonsense knowledge


Plan


Learn


Communicate in natural language


Tie all these things together to achieve goals


And also of course, kill all humans!

As a kid, I wasn’t too concerned about the ‘kill all humans’ bit, because I decided to use my computer to program up HAL!

10 Print “Good morning Dave, I am HAL 9000”

… In the BASIC programming language of course …

10 Print “Good morning Dave, I am HAL 9000”

20 Input “What can I do for you?”, R$

It's easy enough to get the user's input.

10 Print “Good morning Dave, I am HAL 9000”

20 Input “What can I do for you?”, R$

???

Unfortunately, it turns out that once we have the input, deciding what to do so the computer can react in an ‘intelligent’ manner is a bit trickier!

10 Print “Good morning Dave, I am HAL 9000”

20 Input “What can I do for you?”, R$

30 Print “I'm sorry, Dave. I'm afraid I can't do that.”

10 Print “Good morning Dave, I am HAL 9000”

20 Input “What can I do for you?”, R$

30 Print “I'm sorry, Dave. I'm afraid I can't do that.”

40 Goto 20

I failed miserably

You might be thinking “Hah! What an idiot! An 8 year old thinks he can write a general purpose AI in BASIC”. But it turns out I’m in good company!

When electronic computers were first developed 60 years ago, there was a great deal of optimism and research into artificial intelligence.

The First AI Winter

"within ten years a digital computer will be the world's chess champion" and "within ten years a digital computer will discover and prove an important new mathematical theorem.” - 1958, H. A. Simon and Allen Newell

"Within a generation ... the problem of creating 'artificial intelligence' will substantially be solved.” - 1967, Marvin Minsky

Read quotes and point out the years that they were written

Early AI pioneers thought that AI was going to be something relatively straightforward - something they could solve with a few grad students and a year or two of research funding. All the early optimism led to a backlash when people started to figure out just how staggeringly difficult creating true AI was. In the 70s funding for the field collapsed and the entire idea of ‘AI’ became tainted - we don't even like to use the term AI, we call it machine learning or machine intelligence.

Present Day

• IBM’s Deep Blue beats Kasparov - 1997

• DARPA self-driving car challenge - 2005, 2007

• IBM’s Watson wins Jeopardy! - 2011

• Google’s AlphaGo beats Go champion - 2015

Fortunately the AI winter is now long past. In the past 10 or 15 years the field of AI has shown some really great breakthroughs.

IBM’s chess playing computer, self driving cars, AlphaGo beating a world champion in Go.

Part of this is that we have made some really important breakthroughs in terms of designing and training neural nets, along with other AI techniques.

But another very important part of this, is so much of machine learning is really a data game. You need absolutely huge data sets, and the computer power to be able to process them. We are just now reaching the stage where we have enough information and CPU power to be able to do interesting things.

AI + Mobile = A great combo

All these breakthroughs are really important, but I believe it's in mobile where AI can have the greatest impact. The things on the previous slide are really HARD things - winning Jeopardy!, self driving cars. But on mobile, adding just a little bit of smarts can go a long way! Example of a smartwatch: it can convert pounds to grams while cooking. It's not enormously difficult AI, but it can make a huge difference to people's lives.

Why AI on mobile is important

Constrained input + limited interaction means intelligence is important

The first reason why AI on mobile is important is that on mobile you have very limited input compared to desktop. Wide variety of sensors - motion, audio, compass, GPS, camera. Very short interaction times - users spend minutes with phones, seconds with smartwatches.

Constrained input + limited interaction means intelligence is important. Just a little bit can go a long way!

Part 1 - Key AI Tech

Now let's have a look at some key AI technologies.

Key AI Tech - Neural Networks

By key AI tech, I really mean neural networks. There are lots of other interesting things going on, but NNs are responsible for a lot of the important technology we are looking at.

Neural Networks - Classification problem

• 2 dimensions to our data - x1, x2

• Can draw it helpfully on a graph!

if(x1 + x2 > 0){

… we are in!

}

else {

… we are out!

}

To explain neural networks we are going to start with a classification problem! We have a data set which has some dimensions in it - x1 and x2. Perhaps what we are looking at here are customers - x1 is how often they open the app and x2 is how long they spend using it. We can clearly separate them out into two groups.


Super easy to write some code to classify our data. Just an if statement!


And we have divided our data set in two! We have a classifier! However classifying things is a common problem!

Let's make our code a little bit more reusable. Let's do some refactoring! Hard-coded ‘0’ is bad.

Neural Networks - Classification problem

• 2 dimensions to our data - x1, x2

• Can draw it helpfully on a graph!

if(x1 + x2 > b){

… we are in!

}

else {

… we are out!

}

Let's introduce another variable b - our bias! With b we can move our line around a bit.

Neural Networks - Classification problem

• 2 dimensions to our data - x1, x2

• Can draw it helpfully on a graph!

if((w1 * x1) + (w2 * x2) > b){

… we are in!

}

else {

… we are out!

}

And to make things more reusable, let's introduce some weights on our x1 and x2 parameters - this allows us to adjust the angle of the line! So if we have a different data set, we can fit it a bit better.

This ‘we are in’ ‘we are out’ stuff is a bit hand wavy …

Neural Networks - Classification problem

• 2 dimensions to our data - x1, x2

• Can draw it helpfully on a graph!

if((w1 * x1) + (w2 * x2) > b){

return 1.0;

}

else {

return 0.0;

}

Let's make the function return the probability that we are in the top data set.

Neural Networks - Classification problem

What we have actually done is gone and created a perceptron, or neuron! These are the basis of neural networks. There are inputs (in this case 2 of them). The inputs have a weight. There is some sort of activation function which works off the sum of the inputs (in our case we just return 1 or 0 - a type of ‘unit step’ activation function; there are other ones like tanh, or something called ‘ReLU’).

Neural Networks - Classification problem

Let's do a demo!

These perceptrons are powerful things! Particularly because the computer can automatically adjust all the weights to learn to fit a data set by itself! We don't need to program in the w1, w2, b - it can figure them out by itself! So let's do a demo!

Show how the computer learns. Show how we can add more perceptrons to deal with complex data sets. Show the spiral data set - hidden layers, transforming input.

Neural networks - Summary

• They learn to recognise patterns from training data!

• Work for complex patterns

• Can take a long time to train

• You have to choose the right hyper-parameters (number of nodes, how they are laid out, activation function, input transformation, etc)

What have we learnt about neural networks?

Key AI Tech - Image search and Deep belief networks

Now lets look at some of the more advanced things that have been built using neural networks.

The idea of a neural network is simple; it turns out that if you structure them in different ways, they are good for many different things.

We are going to start by looking at image search

Image search

Last year Google rebranded their photo product to ‘Google Photos’. One thing that consistently amazed me was the search. It's able to search and organise photos without any kind of tagging or associated data. It understands what is in the images!

* My boat Bumblebee. It can read the writing and recognise the logo
* It understands photos of my kids and what they are doing - rock climbing
* Even has emoji search

This stuff is particularly important on mobile. On my desktop I have my photos processed, organised into folders and backed up. On mobile I find myself wanting to quickly get random photos of a particular thing, e.g. talking about my boat.

Let's have a look at how image search works.

Deep belief networks

• Stack a number of layers together to make a DBN

• Early layers learn to recognise simple features

• Later use features to recognise larger objects

• A type of ‘auto-encoder’ - learns from input

• Early unsupervised training to recognise features

• Later supervised training to learn what they ‘mean’

One approach that has had a lot of success in image recognition is deep belief networks.

We have already seen a very basic deep belief network, doing image recognition with the swirl pattern

A deep belief network consists of a number of layers, stacked one after another. Early layers recognise simple features, later ones recognise more complex structures.

Deep belief network layers

Unsupervised Learning of Hierarchical Representations with Convolutional Deep Belief Networks Honglak Lee, Roger Grosse, Rajesh Ranganath, and Andrew Y. Ng

This is a visualisation of what each layer in a face recognition DBN is doing

Early layers recognise basic features - gradients, edges. Deeper layers combine these features into more complex objects.

What is interesting is none of the faces in the last image are actual real people, they are just what the network has learned a face is


DBNs are a type of auto-encoder! That means they can learn by themselves.

E.g. when recognising faces, it doesn't need to know the names of the people it is recognising. You can train it on a bunch of pictures of your friends and it will learn to tell them apart. Only at the final stage do we do some ‘supervised learning’ and teach it the names of each of the people it has learned to recognise.

Key AI Tech - Speech recognition and Recurrent Neural Networks

Speech recognition is another technology that is super important on mobile …

Speech Recognition

… not only just on phones, but particularly on smartwatches. Having good speech recognition enables whole new categories of IoT devices - e.g. the Amazon Echo wouldn't exist without solid speech recognition. I make really heavy use of the voice recognition on my watch. About a year ago it suddenly went from useless to really, really good! The reason it improved so much, so quickly, is Google started using something called recurrent neural networks to do their speech recognition.

Speech recognition and Recurrent Neural Networks (RNNs)

• Everything up to now is a feed forward network

• In RNNs, output is fed back in as input

• Feedback serves as a type of short term memory!

• Takes a sequence of inputs

• Supplies results over time

In feed forward networks one bit of input moves from the front to the back. In recurrent neural networks, output is fed back in as input - like a loop! This provides a type of short term memory. It also allows RNNs to work with a stream of input and provide a sequence of results over time. Not only good for speech recognition, but anything over time: forecasting, predicting, control inputs.

Key AI Tech - Ok Google, Siri and Recursive Neural Tensor Networks

Lastly we are going to look at Ok Google, Siri and recursive neural tensor networks.

Ok Google, Hey Siri

Ok Google and Siri are completely changing the way we interact with devices. We are moving away from an app-based model of tapping buttons and icons, to an interaction model where we are having a conversation with our phone. This works well for those quick, short interactions people have with their phones.

Obviously Siri is very complex - Lots of different bits - e.g., the speech recognition, search engine, data mashups.

One of the most important parts to conversational agents is figuring out what the user is actually asking - natural language processing and sentence parsing

Sentence parsing

We need to be able to extract out nouns, verbs, etc and figure out how they all relate to each other. We need to be able to build a parse tree of our sentence.

It turns out, in English there are many ways that this sentence can be understood …

Alice drove down the street in her car

For example, is Alice driving down a street inside her car? What is in the car? Alice? Or the street? Well, we know it probably isn't a street. But how would a computer figure that out?

Recursive Neural Tensor Networks (RNTN’s)

• Good for recognising hierarchical structures

• Tree-like structure - root node + left and right

• Just like RNNs, the complexity is in how they are invoked

Root

Leaf Leaf

RNTNs are very good for sentence parsing and anything that involves hierarchical data. They have a tree-like structure …

Recursive Neural Tensor Networks

The car

is

fast

RNTNs are invoked recursively, feeding the output of each previous step, back into the network

Recursive Neural Tensor Networks

The car

At step one we feed the first parts of the sentence into the first two leaf nodes

Recursive Neural Tensor Networks

The car

Class, Score

The root node outputs two values - class and score. Class is a representation of the current parse tree; score is how likely that parse tree is.

Recursive Neural Tensor Networks

The car

is

At the next step, we feed the output back into the left node. The next word gets fed into the right

Recursive Neural Tensor Networks

The car

is

fast

This continues on until all values are included.

Recursive Neural Tensor Networks

The car

is

fast

… and we have our final output - a parse tree and how likely it is.

In something like Google Now, it would actually be more complex. To parse English an RNTN needs to evaluate all possible sentence structures, and determine which one is most likely.

This is a simplified example, to show the main idea behind an RNTN - the recursion to represent a tree structure. RNTN’s can be used to understand anything with a hierarchy

Part 2 - TensorFlow

We are going to step aside from the different kinds of neural networks and have a look at … TensorFlow!

Tensor What?

TensorFlow is a library for implementing various AI techniques - actually, it turns out there is nothing AI specific about it. It's just a bunch of things for doing tricky maths! It's all based on data flow graphs (play video) - you build a model, which is a network of nodes. Each node takes multidimensional arrays (called tensors) as input and output. Tensors ‘flow’ between nodes. It turns out this is a really great way to build computations which are portable, parallelizable, can run on GPU, etc. There are lots of existing models built using it - we will look at some soon.

TensorFlow - some code

$ python

...

>>> import tensorflow as tf

>>> hello = tf.constant('Hello, TensorFlow!')

>>> sess = tf.Session()

>>> print(sess.run(hello))

Hello, TensorFlow!

>>> a = tf.constant(10)

>>> b = tf.constant(32)

>>> print(sess.run(a + b))

42

>>>

Very quickly, here is what some code looks like. It's Python - there are other language bindings, including C++. We define a node (a constant), build a session, then tell the session to run. The session evaluates our nodes, passes data between them, and we get output!


Building an app with TensorFlow

If you want to build an app with AI, it turns out the simplest approach …

TensorNo - Use existing APIs

• Cloud Vision - https://cloud.google.com/vision/

• Cloud Natural language - https://cloud.google.com/natural-language/

• Cloud speech - https://cloud.google.com/speech/

• Google Cloud Machine Learning - run TensorFlow models in the cloud

• Amazon Machine Learning - https://aws.amazon.com/machine-learning/

• Algorithmia - algorithm marketplace - https://algorithmia.com/algorithms

… is not to use TensorFlow at all! Actually it turns out this stuff is really hard!

Fortunately there are loads of great cloud services available. Super easy to use - just send in some JSON, get results. Don't need to worry about how they work.

TensorFlow on the Server side - DIY

echo 'Bob brought the pizza to Alice.' | syntaxnet/demo.sh

Input: Bob brought the pizza to Alice .
Parse:
brought VBD ROOT
 +-- Bob NNP nsubj
 +-- pizza NN dobj
 |   +-- the DT det
 +-- to IN prep
 |   +-- Alice NNP pobj
 +-- . . punct

If you want to do something a little bit more complex than the cloud services provide, you can run TensorFlow on the server side.

There are a few options for how you can serve up TensorFlow models. A hacky way is to just run a shell script - parse the input and output, and you can build an API around it.

TensorFlow on the Server side - gRPC

Another approach to talking to TensorFlow on the server is something called gRPC. gRPC is Google's remote procedure call library. You define the methods your server has in a .proto file, then you can generate language bindings to call the methods over HTTP, or TCP, etc. It's kind of complicated. TensorFlow has built-in support for exporting models via gRPC.

Roll your own - Client side

• Use the TensorFlow C++ library with the Android NDK, or for iOS in Xcode

• Load up a prebuilt model, and feed data into it

• Need to use the Bazel build system for Android!

• Currently nasty mix of C, C++, Python, Java, SWIG, etc, various build systems, etc

• Changes in progress - java api, common C api.

One of the great things about TensorFlow is it's portable. We could also run it on the mobile phone! We can use the TensorFlow C++ library.

Again, it turns out to be kind of complicated …

But what about the model?

• Build your own? You need:

• A LARGE data set - MNIST has 50k images. ImageNet has 500GB of thumbnails

• A HUGE computer - 8 Tesla K40 GPUs ($4,216 each), 40GB memory, etc

• A lot of time - Weeks! Months!

• Some PhDs to run it all for you.

It's all well and good to be able to run TensorFlow, but what about our model? Where do we get our neural net from? One option is to use TensorFlow to build your own!

You are going to need …

I don't want to put you off playing with AI, because individual people have achieved really cool things. But Google-level image recognition isn't something you're going to hack out in a weekend. It's something you will have to take seriously and really dedicate resources to. Fortunately we are at the stage where this stuff is at least feasible. A $32,000 computer may not be hobbyist level, but a lot of my customers would spend at least that much getting a mobile app built.

But what about the model?

• Use an existing one!

• SyntaxNet and Parsey McParseface - Parse English text to extract structure

• TextSum - Text Summarisation

• Show and Tell - Image caption generator

• Inception - Image recognition

If building a new model is too hard, what about using somebody else's? One already built and trained by a team of PhDs.

There are a load of really interesting models Google has published to GitHub!

There is also a really interesting image recognition model called Inception - trained with the ImageNet academic data set. There is a list of 1000 objects it can recognise. 21.2% top-1 and 5.6% top-5 error - not bad! It represents a load of research and it's just sitting there on GitHub!

Part 3 - Demo

Recognising a banana!

This leads us very nicely into our demo - we have our model, Inception, which is already trained to recognise 1000 objects, including all sorts of fruit!

We have TensorFlow to allow us to run models on various devices, including mobile phones.

Let's tie them together and use our phone to recognise a banana!

TensorFlow example app

Firstly, I need to be clear - this isn't something I developed - it's a pre-existing demo app in the TensorFlow GitHub repo! Here is an overview of what the app is doing. This is what it looks like on the left! We have some Android code, which uses the camera to take an image. Those images get sent into C++ code, which uses the TensorFlow C++ library to load and evaluate the Inception model. The C++ code returns our classification and confidence scores, which are shown in the Android app UI.

The DEMO

Yay! We recognise a banana

Closing thoughts

We have seen some really amazing breakthroughs in AI. The real time image recognition on the phone, was something I never thought I would see happen in my lifetime. It floored me when I first saw it run.

I wanted to bring things full circle and see how the technology today, stacks up against our definition of a strong, or hard AI.

• Reason (use strategy, solve puzzles, and make judgments under uncertainty)

• Represent knowledge, including commonsense knowledge

• Plan

• Learn

• Communicate in natural language

• Integrate all these skills towards common goals

• Kill All Humans

https://en.wikipedia.org/wiki/Artificial_general_intelligence

Are we any closer to strong AI?

Are we any closer to strong ai?

Can we reason? Solve puzzles, make judgment under uncertainty?


Yes! Lots of puzzle solving (chess, AlphaGo), and NNs are great at uncertainty.

What about representing knowledge?


No! It's not something we really know how to do, particularly commonsense knowledge. Computers are still awful at common sense.

Can we plan?


… well … sort of. There are custom algorithms for things like route planning, and stuff like forecasting

But it doesn’t seem to be a really generally solved problem.

Can we learn?


Yes! It's what NNs are all about! Learning from input data!

Can we communicate in natural language?


Actually, yes! We are kind of getting pretty good at that. Siri, Ok Google, conversational agents and bots are improving all the time.

Lastly, can we integrate all these skills towards common goals?


Well, no. We still have no idea how to do that. Which is not surprising, since we have a bunch of things earlier on the list not ticked off.

Lastly, what about this “Kill all humans” stuff?


Well, it makes for good books, but I don't think that's going to happen!

While we will not have an AI that has the goal ‘kill all humans’, that doesn't mean we will have AI which works for the betterment of all mankind.


It's entirely possible that 15 years down the line there will be a few more breakthroughs and Google will have an AI with the goal “Maximise the value of Google shares”. Great for Google shareholders - not so much for the rest of us. Thinking we are going to be wiped out by AI is a bit far fetched - but thinking that it is going to benefit some of us, not all of us, is entirely plausible!

AI needs to be for everyone!

We can't just leave it in the hands of the big 4. Google has taken great steps with releasing code and research. We need to build on that, to understand it, to use it in our own software, to democratize AI. AI needs to be for everyone.

… especially for mobile!

… and it needs to be especially for mobile

We have seen how AI is particularly useful for mobile. We have seen how there are useful things you can do today, with very little effort. I want you all to start thinking about how you can use these techniques to bring value to your users! I want you to use what you have learnt and go out and build great things!

Questions