What is Machine Learning and why should you care?

Antons Mislēvičs
Head of Machine Learning & Cognitive Computing Center of Competence
C.T.Co

Agenda

1. What can machines do today?

2. How does Machine Learning work?

3. How to implement Machine Learning projects?

What can machines do today?

Go: Google AlphaGo 4 – Lee Sedol 1 (2016)

AlphaGo: https://deepmind.com/alpha-go

Poker: Libratus and DeepStack beat top pros (2017)

Carnegie Mellon Artificial Intelligence Beats Top Poker Pros: https://www.cmu.edu/news/stories/archives/2017/january/AI-beats-poker-pros.html

Brains Vs. AI Rematch: Why Poker?: https://www.youtube.com/watch?v=JtyA2aUj4WI

Tough poker player: Brains Vs. AI update: https://www.youtube.com/watch?v=CRiH8yCskAE

Safe and Nested Endgame Solving for Imperfect-Information Games, N. Brown, T. Sandholm, 2017: http://www.cs.cmu.edu/~sandholm/safeAndNested.aaa17WS.pdf

DeepStack: Expert-Level Artificial Intelligence in No-Limit Poker, 2017: https://arxiv.org/abs/1701.01724


AlphaZero learns chess

Google's AlphaZero Destroys Stockfish In 100-Game Match, 2017: https://www.chess.com/news/view/google-s-alphazero-destroys-stockfish-in-100-game-match


Large Scale Visual Recognition Challenge (ILSVRC)

2015 challenge:

– Object detection – 200 categories

– Object recognition – 1000 categories

– Object detection from video – 30 categories

– Scene classification – 401 categories

Large Scale Visual Recognition Challenge 2015 (ILSVRC2015): http://image-net.org/challenges/LSVRC/2015/index#maincomp

Large Scale Visual Recognition Challenge 2015 – Results: http://image-net.org/challenges/LSVRC/2015/results

Microsoft Researchers’ Algorithm Sets ImageNet Challenge Milestone, 2015: https://www.microsoft.com/en-us/research/microsoft-researchers-algorithm-sets-imagenet-challenge-milestone/

Microsoft Research Team:

“To our knowledge, our result is the first to surpass human-level performance…on this visual recognition challenge”


Machines can understand the meaning…

Show and Tell: A Neural Image Caption Generator, O. Vinyals, A. Toshev, S. Bengio, D. Erhan, 2015: http://arxiv.org/abs/1411.4555v2


Unifying Visual-Semantic Embeddings with Multimodal Neural Language Models, R. Kiros, R. Salakhutdinov, R. S. Zemel, 2014: http://arxiv.org/abs/1411.2539


Pixel Level Segmentation

Mask R-CNN: https://arxiv.org/abs/1703.06870

A Brief History of CNNs in Image Segmentation: From R-CNN to Mask R-CNN: https://blog.athelas.com/a-brief-history-of-cnns-in-image-segmentation-from-r-cnn-to-mask-r-cnn-34ea83205de4

Machines get creative…

1. Reproduce artistic style [1, 2, 3]
2. The Next Rembrandt [4]
3. Complete images [5]
4. Enhance images [6]

1. A Neural Algorithm of Artistic Style, 2015: https://arxiv.org/abs/1508.06576

2. Supercharging Style Transfer, 2016: https://research.googleblog.com/2016/10/supercharging-style-transfer.html

3. Neural Doodle, 2016: https://github.com/alexjc/neural-doodle

4. The Next Rembrandt: https://www.nextrembrandt.com/

5. Image Completion with Deep Learning in TensorFlow, 2016: https://bamos.github.io/2016/08/09/deep-completion/

6. Neural Enhance, 2016: https://github.com/alexjc/neural-enhance

Text to speech and voice recognition…

1. Talk - text to speech (WaveNet) [1, 2]
2. Recognize voice [3, 4]

1. WaveNet: A Generative Model for Raw Audio: https://deepmind.com/blog/wavenet-generative-model-raw-audio/

2. WaveNet: A Generative Model for Raw Audio, 2016: https://arxiv.org/abs/1609.03499

3. Historic Achievement: Microsoft researchers reach human parity in conversational speech recognition: https://blogs.microsoft.com/next/2016/10/18/historic-achievement-microsoft-researchers-reach-human-parity-conversational-speech-recognition/

4. Achieving Human Parity in Conversational Speech Recognition, 2016: http://arxiv.org/abs/1610.05256

What else can machines do?

1. Compose music [1]
2. Generate handwriting [2, 3]
3. Translate texts [4, 5]

1. Composing Music With Recurrent Neural Networks: http://www.hexahedria.com/2015/08/03/composing-music-with-recurrent-neural-networks/

2. Generating Sequences With Recurrent Neural Networks, A. Graves, 2014: http://arxiv.org/abs/1308.0850

3. Alex Graves’s RNN handwriting generation demo: http://www.cs.toronto.edu/~graves/handwriting.html

4. University of Montreal, Lisa Lab, Neural Machine Translation demo: http://lisa.iro.umontreal.ca/mt-demo

5. Fully Character-Level Neural Machine Translation without Explicit Segmentation, J. Lee, K. Cho, T. Hofmann, 2016: http://arxiv.org/abs/1610.03017


Google Duplex: A.I. Assistant Calls Local Businesses To Make Appointments: https://www.youtube.com/watch?v=D5VN56jQMWM


How does Machine Learning work?

Machine Learning process

Introducing Azure Machine Learning, D. Chappell, 2015: http://www.davidchappell.com/writing/white_papers/Introducing-Azure-ML-v1.0--Chappell.pdf


Machine Learning questions

1. How much / how many? – Regression

2. Which category? – Classification

3. Which groups? – Clustering

4. Is it weird? – Anomaly Detection

5. Which action? – Reinforcement Learning

Data Science for Rest of Us, B. Rohrer, 2015: https://channel9.msdn.com/blogs/Cloud-and-Enterprise-Premium/Data-Science-for-Rest-of-Us


Regression: how much / how many?

Housing prices by square feet

Price        Square Feet
125,999      950
207,190      1125
227,555      1400
319,010      1750
345,846      1525
350,000      1690
437,301      2120
450,999      2500
605,000      3010
641,370      3250
824,280      3600
1,092,640    3700
1,187,550    4500

Input variable / feature: Square Feet. Output variable: Price.

[Chart: Price vs. Square Feet]

Housing prices hypothesis

[Chart: Price vs. Square Feet for the same data as above, with the Hypothesis drawn]

Using model to predict house price

Price        Square Feet
125,999      950
207,190      1125
227,555      1400
319,010      1750
345,846      1525
350,000      1690
437,301      2120
450,999      2500
???          2700
605,000      3010
641,370      3250
824,280      3600
1,092,640    3700
1,187,550    4500

[Chart: Price vs. Square Feet]
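As a rough illustration of how a model could fill in the ??? row, the sketch below fits a straight line to the housing data with ordinary least squares and evaluates it at 2700 square feet. The slides do not say which form the hypothesis takes, so the straight line and the names `fit_line` and `predict` are assumptions made for this example.

```python
# Housing data from the table above: (square feet, price).
HOUSES = [
    (950, 125_999), (1125, 207_190), (1400, 227_555), (1750, 319_010),
    (1525, 345_846), (1690, 350_000), (2120, 437_301), (2500, 450_999),
    (3010, 605_000), (3250, 641_370), (3600, 824_280), (3700, 1_092_640),
    (4500, 1_187_550),
]


def fit_line(points):
    """Return (slope, intercept) of the least-squares line through points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Slope = covariance(x, y) / variance(x); the 1/n factors cancel out.
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    s_xx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, square_feet):
    """Hypothesis (assumed linear): price as a function of square feet."""
    return slope * square_feet + intercept


slope, intercept = fit_line(HOUSES)
predicted_price = predict(slope, intercept, 2700)
print(f"price = {slope:.1f} * sqft + {intercept:.0f}")
print(f"predicted price for 2700 sq ft: {predicted_price:,.0f}")
```

The fitted line is only as good as the straight-line assumption; the following slides look at how to measure and reduce its errors.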

Prediction “errors” & improving models

[Chart: Price vs. Square Feet for the same data as above, including the unknown price at 2700 square feet]

Cost function (sq. error function)
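The slide names a squared error cost function without giving its formula. A minimal sketch, assuming the plain mean of squared prediction errors (some textbooks add a factor of 1/2), compares two candidate lines on the housing data. Both candidates are hypothetical values picked for illustration, not results from the slides.

```python
# Housing data from the table above: (square feet, price).
HOUSES = [
    (950, 125_999), (1125, 207_190), (1400, 227_555), (1750, 319_010),
    (1525, 345_846), (1690, 350_000), (2120, 437_301), (2500, 450_999),
    (3010, 605_000), (3250, 641_370), (3600, 824_280), (3700, 1_092_640),
    (4500, 1_187_550),
]


def squared_error_cost(slope, intercept, points):
    """Mean of squared differences between predicted and actual prices."""
    errors = [(slope * x + intercept) - y for x, y in points]
    return sum(e * e for e in errors) / len(points)


# Two hypothetical candidate hypotheses: (slope, intercept).
candidate_a = (287.0, -163_000.0)
candidate_b = (200.0, 0.0)

cost_a = squared_error_cost(*candidate_a, HOUSES)
cost_b = squared_error_cost(*candidate_b, HOUSES)

# The candidate with the lower cost fits the training data better.
better = "A" if cost_a < cost_b else "B"
print(f"cost A = {cost_a:.3e}, cost B = {cost_b:.3e}, better fit: {better}")
```

Training a model then amounts to searching for the parameters that make this cost as small as possible.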

Try a different algorithm

[Chart: Price vs. Square Feet for the same data as above]
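The slides do not say which alternative algorithm was tried. As one possible example, the sketch below swaps the fitted line for k-nearest-neighbours regression, which predicts a price by averaging the prices of the houses closest in size. The function name `knn_predict` and the choice of k = 3 are assumptions for illustration.

```python
# Housing data from the table above: (square feet, price).
HOUSES = [
    (950, 125_999), (1125, 207_190), (1400, 227_555), (1750, 319_010),
    (1525, 345_846), (1690, 350_000), (2120, 437_301), (2500, 450_999),
    (3010, 605_000), (3250, 641_370), (3600, 824_280), (3700, 1_092_640),
    (4500, 1_187_550),
]


def knn_predict(points, square_feet, k=3):
    """Average the prices of the k houses closest in size to square_feet."""
    nearest = sorted(points, key=lambda p: abs(p[0] - square_feet))[:k]
    return sum(price for _, price in nearest) / k


estimate = knn_predict(HOUSES, 2700)
print(f"3-NN estimate for 2700 sq ft: {estimate:,.0f}")
```

Unlike a line, this model makes no assumption about the overall shape of the relationship, but it cannot extrapolate beyond the sizes it has seen.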

Get more data

[Chart: Price vs. Square Feet over a larger data set; axes extend to 6000 square feet and 2,100,000]

Use more data variables (features)

Price        Square Feet  # Bedrooms  # Bathrooms  Fireplaces  Garage Size  Floors
125,999      950          1           1            0           0            1
207,190      1125         1           1            0           1            1
227,555      1400         2           1.5          1           2            1
319,010      1750         2           1.5          0           2            2
345,846      1525         3           2            1           2            1
350,000      1690         3           1.5          1           2            1.5
437,301      2120         3           2.5          2           3            2
450,999      2500         3           2.5          1           2            1.5
605,000      3010         4           2.5          2           3            2
641,370      3250         3           3            1           3            2
824,280      3600         3           3            2           3            2
1,092,640    3700         5           4.5          2           3            2
1,187,550    4500         6           6            4           5            2
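To show what extra features buy, the sketch below fits two linear models to the table above, one using square feet only and one using all six features, and compares their squared error on the training rows. It is an illustration under assumptions: the linear form, the normal-equation solver, and the helper names `solve`, `fit` and `sum_squared_error` are not from the slides.

```python
# Rows from the table above:
# (price, square feet, bedrooms, bathrooms, fireplaces, garage size, floors)
HOUSES = [
    (125_999, 950, 1, 1, 0, 0, 1),
    (207_190, 1125, 1, 1, 0, 1, 1),
    (227_555, 1400, 2, 1.5, 1, 2, 1),
    (319_010, 1750, 2, 1.5, 0, 2, 2),
    (345_846, 1525, 3, 2, 1, 2, 1),
    (350_000, 1690, 3, 1.5, 1, 2, 1.5),
    (437_301, 2120, 3, 2.5, 2, 3, 2),
    (450_999, 2500, 3, 2.5, 1, 2, 1.5),
    (605_000, 3010, 4, 2.5, 2, 3, 2),
    (641_370, 3250, 3, 3, 1, 3, 2),
    (824_280, 3600, 3, 3, 2, 3, 2),
    (1_092_640, 3700, 5, 4.5, 2, 3, 2),
    (1_187_550, 4500, 6, 6, 4, 5, 2),
]


def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        # Swap in the row with the largest pivot to limit rounding error.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        known = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - known) / aug[i][i]
    return x


def fit(features, prices):
    """Least-squares weights (intercept first) via the normal equations."""
    design = [[1.0] + list(row) for row in features]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)]
           for i in range(k)]
    xty = [sum(d[i] * p for d, p in zip(design, prices)) for i in range(k)]
    return solve(xtx, xty)


def sum_squared_error(weights, features, prices):
    """Total squared difference between predicted and actual prices."""
    total = 0.0
    for row, price in zip(features, prices):
        predicted = weights[0] + sum(w * v for w, v in zip(weights[1:], row))
        total += (predicted - price) ** 2
    return total


prices = [h[0] for h in HOUSES]
# Square feet is rescaled to thousands so all columns have similar magnitude.
all_features = [(h[1] / 1000, *h[2:]) for h in HOUSES]
size_only = [(h[1] / 1000,) for h in HOUSES]

error_all = sum_squared_error(fit(all_features, prices), all_features, prices)
error_size = sum_squared_error(fit(size_only, prices), size_only, prices)
print(f"squared error, size only:    {error_size:.3e}")
print(f"squared error, all features: {error_all:.3e}")
```

A lower training error with more features is expected for any least-squares fit; with only 13 rows it says little about how well the model would predict new houses, which is one reason the previous slide asks for more data.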

Classification: which category?

Classification – 2 classes

Years driving   Age   Class
5               65    1
7               70    1
2               68    1
25              45    2
25              55    2
20              50    2
5               25    1
3               22    1
8               30    1
15              35    2
…               …     …
12              38    ???

Input variables / features: Years driving, Age. Output variable: Class.

[Chart: Years driving and Age axes, with the Hypothesis / Classifier drawn]
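One simple way to obtain such a classifier is a perceptron, which learns a straight decision boundary in the (years driving, age) plane. The slides do not name the algorithm they used, so the perceptron and the names `train_perceptron` and `classify` are assumptions for this sketch. Only the ten rows shown in the table are used for training.

```python
# Training rows from the table above: (years driving, age, class).
DRIVERS = [
    (5, 65, 1), (7, 70, 1), (2, 68, 1), (25, 45, 2), (25, 55, 2),
    (20, 50, 2), (5, 25, 1), (3, 22, 1), (8, 30, 1), (15, 35, 2),
]


def train_perceptron(rows, max_epochs=100_000):
    """Learn weights [w_years, w_age, bias] for a linear decision boundary."""
    weights = [0.0, 0.0, 0.0]
    for _ in range(max_epochs):
        mistakes = 0
        for years, age, label in rows:
            target = 1 if label == 2 else -1  # class 2 -> +1, class 1 -> -1
            score = weights[0] * years + weights[1] * age + weights[2]
            if target * score <= 0:  # misclassified: nudge the boundary
                weights[0] += target * years
                weights[1] += target * age
                weights[2] += target
                mistakes += 1
        if mistakes == 0:  # every training row is on the correct side
            break
    return weights


def classify(weights, years, age):
    """Return class 2 on the positive side of the boundary, else class 1."""
    score = weights[0] * years + weights[1] * age + weights[2]
    return 2 if score > 0 else 1


weights = train_perceptron(DRIVERS)
print("class for 12 years driving, age 38:", classify(weights, 12, 38))
```

The perceptron stops once every training row is classified correctly, which is possible here because the two classes in the table can be separated by a straight line. The answer for the ??? row depends on where that line ends up, since the point lies between the two groups.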

WALL·E (2008): http://www.imdb.com/title/tt0910970/

Classification – more than 2 classes

1. Input variables / Features:
– Years driving
– Age

2. Output variable:
– Class: Yellow, Green, Blue

3. One-vs-rest approach:
– Train a classifier for each class
– Select the class whose classifier returned the highest confidence score

[Chart: Years driving and Age axes, with Classifier 1, Classifier 2 and Classifier 3 drawn]
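The one-vs-rest steps above can be sketched as follows. The training points are hypothetical toy data (the slides give no rows for the three-class case), and the centroid-based scorer is a deliberately simple stand-in for whatever binary classifier would be used in practice.

```python
import math

# Hypothetical toy data (not from the slides): (years driving, age) -> class.
TRAINING = [
    ((2, 20), "Yellow"), ((3, 22), "Yellow"), ((4, 25), "Yellow"),
    ((10, 38), "Green"), ((12, 40), "Green"), ((14, 42), "Green"),
    ((24, 58), "Blue"), ((25, 60), "Blue"), ((27, 63), "Blue"),
]


def centroid(points):
    """Coordinate-wise mean of a list of points."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))


def train_binary(positives, negatives):
    """Stand-in binary classifier: scores how much closer a point lies to
    the positive centroid than to the negative one (higher = more confident).
    """
    pos_center, neg_center = centroid(positives), centroid(negatives)

    def score(point):
        return math.dist(point, neg_center) - math.dist(point, pos_center)

    return score


def train_one_vs_rest(samples):
    """Train one 'this class vs. everything else' scorer per class."""
    classes = sorted({label for _, label in samples})
    scorers = {}
    for cls in classes:
        positives = [p for p, label in samples if label == cls]
        negatives = [p for p, label in samples if label != cls]
        scorers[cls] = train_binary(positives, negatives)
    return scorers


def predict(scorers, point):
    """Pick the class whose classifier returns the highest score."""
    return max(scorers, key=lambda cls: scorers[cls](point))


scorers = train_one_vs_rest(TRAINING)
for point in [(3, 23), (12, 41), (26, 61)]:
    print(point, "->", predict(scorers, point))
```

The same wrapper works with any binary classifier that returns a comparable confidence score; only `train_binary` would need to change.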

https://quickdraw.withgoogle.com

https://clarifai.com/demo

Why Deep Learning and Neural Networks?

http://playground.tensorflow.org/

Predicting Azure Customer Churn

Predicting Azure Churn with Deep Learning and Explaining Predictions with LIME, 2017: https://www.slideshare.net/FengZhu18/predicting-azure-churn-with-deep-learning-and-explaining-predictions-with-lime


But wait…

Is it that simple?

Aren’t machines as smart as we are?

Deep neural networks can be fooled…

Deep Neural Networks are Easily Fooled: High Confidence Predictions for Unrecognizable Images, 2015: https://arxiv.org/abs/1412.1897


Machines do not always “understand” images…

Accelerating innovation and powering new experiences with AI: https://code.facebook.com/posts/310100219388873/accelerating-innovation-and-powering-new-experiences-with-ai/

CS231n: Convolutional Neural Networks for Visual Recognition, Lecture 10: Recurrent Neural Networks: http://cs231n.stanford.edu/slides/winter1516_lecture10.pdf


Introducing GeForce GTX TITAN Z: Ultimate Power, May 2014: http://www.geforce.com/whats-new/articles/introducing-nvidia-geforce-gtx-titan-z

Chinese Room Experiment

Stanford Encyclopedia of Philosophy – The Chinese Room Argument: https://plato.stanford.edu/entries/chinese-room/

The “Chinese room” argument: http://cse3521.artifice.cc/chinese-room.html


How to implement Machine Learning projects?

Implementing Machine Learning project

1. Define business value and how it can be measured
- How much / how many?
- Which category?
- Which groups?
- Is it weird?
- Which action?

2. Prepare data:
- Internal data sources
- External data sources

3. Build models:
- “Black-box” – pure statistical analysis of large amounts of data
- “Soft-box” – heuristic insights from the knowledge of experts

4. Integrate into production systems
- Adjust business processes
- Redesign existing systems

5. Drive adoption!

Typical Machine Learning scenarios

Customers
– Recommendations
– Customer Churn
– Customer Segmentation

Operations
– Predictive Maintenance
– Anomaly Detection
– Optimization

Security & Risk
– Credit Risk
– Fraud Detection
– Predict Security Threat

HR
– Employee Retention
– Talent Management
– Candidate Evaluation

Suggestions

1. Work on multiple scenarios / hypotheses in parallel:
– Evaluate the risk level of every scenario (chance to succeed)
– Pick scenarios to balance overall risk

2. Hypothesis-based approach:
– Define a hypothesis
– Implement a time-boxed POC (2-4 weeks)
– Analyze results
– Decision point: Go to pilot implementation | Continue with the next hypothesis | Abandon scenario

3. Focus on bringing ML projects into pilot/production:
– Helps validate the model on real data
– Lets you test the end-to-end scenario to see whether the solution can bring business value
– Helps uncover additional requirements

4. Try Deep Learning methods to replace existing ML models:
– Can reduce the effort required for pre-processing data (feature engineering)
– Makes it possible to add new data points to the model

Slides: https://1drv.ms/f/s!AsXSX3Q3cMlAjrcTdvNFSpIgHfqAcA
