Antons Mislēvičs
Head of Machine Learning & Cognitive Computing Center of Competence
C.T.Co
What is Machine Learning and why should you care?
Agenda
1. What can machines do today?
2. How does Machine Learning work?
3. How to implement Machine Learning projects?
What can machines do today?
Go: Google AlphaGo 4 – Lee Sedol 1 (2016)
AlphaGo
https://deepmind.com/alpha-go
Poker: Libratus and DeepStack beat top pros (2017)
Carnegie Mellon Artificial Intelligence Beats Top Poker Pros: https://www.cmu.edu/news/stories/archives/2017/january/AI-beats-poker-pros.html
Brains Vs. AI Rematch: Why Poker?: https://www.youtube.com/watch?v=JtyA2aUj4WI
Tough poker player: Brains Vs. AI update: https://www.youtube.com/watch?v=CRiH8yCskAE
Safe and Nested Endgame Solving for Imperfect-Information Games, N. Brown, T. Sandholm, 2017: http://www.cs.cmu.edu/~sandholm/safeAndNested.aaa17WS.pdf
DeepStack: Expert-Level Artificial Intelligence in No-Limit Poker, 2017: https://arxiv.org/abs/1701.01724
AlphaZero learns chess
Google's AlphaZero Destroys Stockfish In 100-Game Match, 2017: https://www.chess.com/news/view/google-s-alphazero-destroys-stockfish-in-100-game-match
Large Scale Visual Recognition Challenge (ILSVRC)
2015 challenge:
– Object detection – 200 categories
– Object recognition – 1000 categories
– Object detection from video – 30 categories
– Scene classification – 401 categories
Large Scale Visual Recognition Challenge 2015 (ILSVRC2015)
http://image-net.org/challenges/LSVRC/2015/index#maincomp
Large Scale Visual Recognition Challenge 2015 – Results: http://image-net.org/challenges/LSVRC/2015/results
Microsoft Researchers’ Algorithm Sets ImageNet Challenge Milestone, 2015: https://www.microsoft.com/en-us/research/microsoft-researchers-algorithm-sets-imagenet-challenge-milestone/
Microsoft Research Team:
“To our knowledge, our result is the first to surpass human-level performance…on this visual recognition challenge”
Machines can understand the meaning…
Show and Tell: A Neural Image Caption Generator, O. Vinyals, A. Toshev, S. Bengio, D. Erhan, 2015: http://arxiv.org/abs/1411.4555v2
Unifying Visual-Semantic Embeddings with Multimodal Neural Language Models, R. Kiros, R. Salakhutdinov, R. S. Zemel, 2014: http://arxiv.org/abs/1411.2539
Pixel Level Segmentation
Mask R-CNN: https://arxiv.org/abs/1703.06870
A Brief History of CNNs in Image Segmentation: From R-CNN to Mask R-CNN:
https://blog.athelas.com/a-brief-history-of-cnns-in-image-segmentation-from-r-cnn-to-mask-r-cnn-34ea83205de4
Machines get creative…
1. Reproduce artistic style [1, 2, 3]
2. The Next Rembrandt [4]
3. Complete images [5]
4. Enhance images [6]
1. A Neural Algorithm of Artistic Style, 2015: https://arxiv.org/abs/1508.06576
2. Supercharging Style Transfer, 2016: https://research.googleblog.com/2016/10/supercharging-style-transfer.html
3. Neural Doodle, 2016: https://github.com/alexjc/neural-doodle
4. The Next Rembrandt: https://www.nextrembrandt.com/
5. Image Completion with Deep Learning in TensorFlow, 2016: https://bamos.github.io/2016/08/09/deep-completion/
6. Neural Enhance, 2016: https://github.com/alexjc/neural-enhance
Text to speech and voice recognition…
1. Talk – text to speech (WaveNet) [1, 2]
2. Recognize voice [3, 4]
1. WaveNet: A Generative Model for Raw Audio: https://deepmind.com/blog/wavenet-generative-model-raw-audio/
2. WaveNet: A Generative Model for Raw Audio, 2016: https://arxiv.org/abs/1609.03499
3. Historic Achievement: Microsoft researchers reach human parity in conversational speech recognition: https://blogs.microsoft.com/next/2016/10/18/historic-achievement-microsoft-researchers-reach-human-parity-conversational-speech-recognition/
4. Achieving Human Parity in Conversational Speech Recognition, 2016: http://arxiv.org/abs/1610.05256
What else can machines do?
1. Compose music [1]
2. Generate handwriting [2, 3]
3. Translate texts [4, 5]
1. Composing Music With Recurrent Neural Networks: http://www.hexahedria.com/2015/08/03/composing-music-with-recurrent-neural-networks/
2. Generating Sequences With Recurrent Neural Networks, A. Graves, 2014: http://arxiv.org/abs/1308.0850
3. Alex Graves’s RNN handwriting generation demo: http://www.cs.toronto.edu/~graves/handwriting.html
4. University of Montreal, Lisa Lab, Neural Machine Translation demo: http://lisa.iro.umontreal.ca/mt-demo
5. Fully Character-Level Neural Machine Translation without Explicit Segmentation, J. Lee, K. Cho, T. Hofmann, 2016: http://arxiv.org/abs/1610.03017
Google Duplex: A.I. Assistant Calls Local Businesses To Make Appointments: https://www.youtube.com/watch?v=D5VN56jQMWM
How does Machine Learning work?
Machine Learning process
Introducing Azure Machine Learning, D. Chappell, 2015:
http://www.davidchappell.com/writing/white_papers/Introducing-Azure-ML-v1.0--Chappell.pdf
Machine Learning questions
1. How much / how many? Regression
2. Which category? Classification
3. Which groups? Clustering
4. Is it weird? Anomaly Detection
5. Which action? Reinforcement Learning
Data Science for Rest of Us, B. Rohrer, 2015:
https://channel9.msdn.com/blogs/Cloud-and-Enterprise-Premium/Data-Science-for-Rest-of-Us
Regression: how much / how many?
$???
Housing prices by square feet
Price       Square Feet
125,999     950
207,190     1125
227,555     1400
319,010     1750
345,846     1525
350,000     1690
437,301     2120
450,999     2500
605,000     3010
641,370     3250
824,280     3600
1,092,640   3700
1,187,550   4500
[Chart: Price (0 to 1,200,000) vs. Square Feet (0 to 5000)]
Input variable / feature: Square Feet
Output variable: Price
Housing prices hypothesis
[Chart: Price vs. Square Feet for the housing data, with the hypothesis overlaid]
Using model to predict house price
Price       Square Feet
???         2700
[Chart: Price vs. Square Feet for the housing data plus the new 2700 Square Feet row]
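As a rough illustration of the hypothesis and prediction steps, the sketch below fits a straight line to the thirteen rows of the housing table and evaluates it at 2700 square feet. The slides do not name the fitting method, so the straight-line form, the ordinary least-squares fit, and the function names `fit_line` and `predict` are assumptions made for this example.

```python
# Housing table from the slides: (square feet, price).
HOUSES = [
    (950, 125_999), (1125, 207_190), (1400, 227_555), (1750, 319_010),
    (1525, 345_846), (1690, 350_000), (2120, 437_301), (2500, 450_999),
    (3010, 605_000), (3250, 641_370), (3600, 824_280), (3700, 1_092_640),
    (4500, 1_187_550),
]


def fit_line(points):
    """Ordinary least squares for price = intercept + slope * square_feet."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    s_xx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(model, square_feet):
    """Apply the fitted hypothesis (assumed to be a line) to a new input."""
    intercept, slope = model
    return intercept + slope * square_feet


model = fit_line(HOUSES)
print(f"hypothesis: price = {model[0]:.0f} + {model[1]:.1f} * square_feet")
print(f"predicted price for 2700 sq ft: {predict(model, 2700):,.0f}")
```

The number printed for the 2700 sq ft house is only as good as the straight-line assumption; a different model family would give a different estimate.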
Prediction “errors” & improving models
[Chart: Price vs. Square Feet for the housing data plus the ??? row at 2700 Square Feet]
Cost function (squared error function)
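The squared error cost can be written down in a few lines. The sketch below uses the common convention of halving the mean squared error, J = 1/(2m) * sum((h(x) - y)^2); that convention is an assumption, since the slide only names the function. The two candidate lines (100 and 200 per square foot, no intercept) are made-up values used only to show that a lower cost means a better fit to the table.

```python
# Housing table from the slides: (square feet, price).
HOUSES = [
    (950, 125_999), (1125, 207_190), (1400, 227_555), (1750, 319_010),
    (1525, 345_846), (1690, 350_000), (2120, 437_301), (2500, 450_999),
    (3010, 605_000), (3250, 641_370), (3600, 824_280), (3700, 1_092_640),
    (4500, 1_187_550),
]


def squared_error_cost(intercept, slope, points):
    """J = 1/(2m) * sum((h(x) - y)^2) for h(x) = intercept + slope * x."""
    m = len(points)
    return sum((intercept + slope * x - y) ** 2 for x, y in points) / (2 * m)


# Two hypothetical candidate hypotheses (not from the slides).
cost_a = squared_error_cost(0.0, 200.0, HOUSES)
cost_b = squared_error_cost(0.0, 100.0, HOUSES)
print(f"cost at 200 per sq ft: {cost_a:.3e}")
print(f"cost at 100 per sq ft: {cost_b:.3e}")
print("better candidate:", "200 per sq ft" if cost_a < cost_b else "100 per sq ft")
```

Improving the model then amounts to searching for the intercept and slope that make this number as small as possible.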
Try a different algorithm
[Chart: Price vs. Square Feet for the housing data plus the ??? row at 2700 Square Feet]
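The slide does not say which alternative algorithm is meant. Purely as an illustration of swapping the model while keeping the data, the sketch below uses k-nearest-neighbours regression: the prediction is the average price of the k houses closest in size to the query. The choice of algorithm, the value `k=3`, and the function name `knn_predict` are all assumptions.

```python
# Housing table from the slides: (square feet, price).
HOUSES = [
    (950, 125_999), (1125, 207_190), (1400, 227_555), (1750, 319_010),
    (1525, 345_846), (1690, 350_000), (2120, 437_301), (2500, 450_999),
    (3010, 605_000), (3250, 641_370), (3600, 824_280), (3700, 1_092_640),
    (4500, 1_187_550),
]


def knn_predict(points, square_feet, k=3):
    """Average the prices of the k houses whose size is closest to the query."""
    nearest = sorted(points, key=lambda p: abs(p[0] - square_feet))[:k]
    return sum(price for _, price in nearest) / len(nearest)


print(f"k-NN estimate for 2700 sq ft: {knn_predict(HOUSES, 2700):,.0f}")
```

Which algorithm is preferable would normally be decided by comparing prediction errors on data the model has not seen, not by looking at a single estimate.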
Get more data
[Chart: Price (0 to 2,100,000) vs. Square Feet (0 to 6000)]
Use more data variables (features)
Price       Square Feet  # Bedrooms  # Bathrooms  Fireplaces  Garage Size  Floors
125,999     950          1           1            0           0            1
207,190     1125         1           1            0           1            1
227,555     1400         2           1.5          1           2            1
319,010     1750         2           1.5          0           2            2
345,846     1525         3           2            1           2            1
350,000     1690         3           1.5          1           2            1.5
437,301     2120         3           2.5          2           3            2
450,999     2500         3           2.5          1           2            1.5
605,000     3010         4           2.5          2           3            2
641,370     3250         3           3            1           3            2
824,280     3600         3           3            2           3            2
1,092,640   3700         5           4.5          2           3            2
1,187,550   4500         6           6            4           5            2
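With several features, the hypothesis becomes a weighted sum of all of them. The sketch below is one possible way to train such a model on the table above: it standardizes each feature and runs plain gradient descent on the halved mean squared error. The slides do not prescribe a training method, so gradient descent, the learning rate of 0.1, the 500 steps, and all function names are assumptions.

```python
# Rows from the slide: price, then square feet, bedrooms, bathrooms,
# fireplaces, garage size, floors.
ROWS = [
    (125_999, 950, 1, 1, 0, 0, 1),
    (207_190, 1125, 1, 1, 0, 1, 1),
    (227_555, 1400, 2, 1.5, 1, 2, 1),
    (319_010, 1750, 2, 1.5, 0, 2, 2),
    (345_846, 1525, 3, 2, 1, 2, 1),
    (350_000, 1690, 3, 1.5, 1, 2, 1.5),
    (437_301, 2120, 3, 2.5, 2, 3, 2),
    (450_999, 2500, 3, 2.5, 1, 2, 1.5),
    (605_000, 3010, 4, 2.5, 2, 3, 2),
    (641_370, 3250, 3, 3, 1, 3, 2),
    (824_280, 3600, 3, 3, 2, 3, 2),
    (1_092_640, 3700, 5, 4.5, 2, 3, 2),
    (1_187_550, 4500, 6, 6, 4, 5, 2),
]


def standardize(columns):
    """Rescale each feature column to zero mean and unit variance."""
    scaled = []
    for col in columns:
        mean = sum(col) / len(col)
        std = (sum((v - mean) ** 2 for v in col) / len(col)) ** 0.5
        scaled.append([(v - mean) / std for v in col])
    return scaled


def cost(weights, xs, ys):
    """Halved mean squared error of the linear hypothesis."""
    errors = [sum(w * v for w, v in zip(weights, x)) - y for x, y in zip(xs, ys)]
    return sum(e * e for e in errors) / (2 * len(ys))


def gradient_descent(xs, ys, rate=0.1, steps=500):
    """Minimise the cost by repeatedly stepping against its gradient."""
    weights = [0.0] * len(xs[0])
    for _ in range(steps):
        errors = [sum(w * v for w, v in zip(weights, x)) - y for x, y in zip(xs, ys)]
        for j in range(len(weights)):
            grad = sum(e * x[j] for e, x in zip(errors, xs)) / len(ys)
            weights[j] -= rate * grad
    return weights


prices = [row[0] for row in ROWS]
feature_columns = standardize([list(col) for col in zip(*(row[1:] for row in ROWS))])
# Each input vector is [1 (intercept), scaled feature 1, ..., scaled feature 6].
inputs = [[1.0] + list(vals) for vals in zip(*feature_columns)]

start_cost = cost([0.0] * 7, inputs, prices)
weights = gradient_descent(inputs, prices)
print(f"cost before training: {start_cost:.3e}")
print(f"cost after training:  {cost(weights, inputs, prices):.3e}")
```

Standardizing first keeps the features on comparable scales, so one learning rate works for all of them; with only thirteen rows and seven weights, such a model can easily overfit, which is one reason more data helps.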
Classification: which category?
Years driving  Age  Class
5              65   1
7              70   1
2              68   1
25             45   2
25             55   2
20             50   2
5              25   1
3              22   1
8              30   1
15             35   2
…              …    …
12             38   ???
Classification – 2 classes
[Chart: Years driving and Age axes, with the Hypothesis / Classifier marked]
Input variables / features: Years driving, Age
Output variable: Class
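To make the two-class case concrete, the sketch below trains about the simplest classifier possible on the ten labelled rows of the table: a single threshold on one feature (a "decision stump"). The slides do not say which classifier is used, so the stump, the midpoint thresholds, and the function names are assumptions. On these rows the search finds a split on years driving that makes no training errors.

```python
# Labelled rows from the slide: (years driving, age, class).
DRIVERS = [
    (5, 65, 1), (7, 70, 1), (2, 68, 1), (25, 45, 2), (25, 55, 2),
    (20, 50, 2), (5, 25, 1), (3, 22, 1), (8, 30, 1), (15, 35, 2),
]
FEATURE_NAMES = ("years driving", "age")


def train_stump(rows):
    """Find the single feature/threshold split with the fewest training errors."""
    best = None  # (errors, feature index, threshold, class if <=, class if >)
    for feature in range(2):
        values = sorted({row[feature] for row in rows})
        for low, high in zip(values, values[1:]):
            threshold = (low + high) / 2
            for below, above in ((1, 2), (2, 1)):
                errors = sum(
                    (below if row[feature] <= threshold else above) != row[2]
                    for row in rows
                )
                if best is None or errors < best[0]:
                    best = (errors, feature, threshold, below, above)
    return best


def classify(stump, years_driving, age):
    """Apply the learned split to a new driver."""
    _, feature, threshold, below, above = stump
    value = (years_driving, age)[feature]
    return below if value <= threshold else above


stump = train_stump(DRIVERS)
print(f"split on {FEATURE_NAMES[stump[1]]} at {stump[2]} with {stump[0]} training errors")
print("class for 12 years driving, age 38:", classify(stump, 12, 38))
```

The label this sketch assigns to the ??? row depends entirely on the chosen model and the ten visible rows; a different classifier, or the rows hidden behind the ellipsis, could change it.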
WALL·E (2008): http://www.imdb.com/title/tt0910970/
Classification – more than 2 classes
1. Input variables / Features:
– Years driving
– Age
2. Output variable:
– Class: Yellow, Green, Blue
3. One-vs-rest approach:
– Train classifier for each class
– Select class that returned highest confidence score
[Chart: Years driving and Age axes, with Classifier 1, Classifier 2 and Classifier 3]
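The one-vs-rest steps can be sketched as follows. One binary scorer is trained per class ("this class" against "everything else"), and the class whose scorer is most confident wins. The training points are invented for the example, and the centroid-distance scorer is a deliberately simple stand-in for whatever binary classifier would be used in practice; neither comes from the slides.

```python
from math import dist

# Hypothetical training data (not from the slides): (years driving, age) per class.
TRAINING = {
    "Yellow": [(2, 20), (3, 24), (4, 22)],
    "Green": [(10, 40), (12, 38), (11, 45)],
    "Blue": [(25, 60), (28, 65), (30, 62)],
}


def centroid(points):
    """Mean point of a list of (years driving, age) pairs."""
    return tuple(sum(axis) / len(points) for axis in zip(*points))


def train_one_vs_rest(training):
    """Train one binary 'this class vs. everything else' scorer per class."""
    classifiers = {}
    for label, members in training.items():
        rest = [p for other, pts in training.items() if other != label for p in pts]
        classifiers[label] = (centroid(members), centroid(rest))
    return classifiers


def confidence(classifier, point):
    """Positive when the point is closer to the class than to the rest."""
    own, rest = classifier
    return dist(point, rest) - dist(point, own)


def classify(classifiers, point):
    """Select the class whose classifier returned the highest confidence score."""
    return max(classifiers, key=lambda label: confidence(classifiers[label], point))


classifiers = train_one_vs_rest(TRAINING)
print(classify(classifiers, (11, 41)))
```

The same wrapper works with any binary classifier that returns a comparable confidence score, which is what makes one-vs-rest a convenient way to reuse two-class methods.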
https://quickdraw.withgoogle.com
Why Deep Learning and Neural Networks?
http://playground.tensorflow.org/
Predicting Azure Customer Churn
Predicting Azure Churn with Deep Learning and Explaining Predictions with LIME, 2017: https://www.slideshare.net/FengZhu18/predicting-azure-churn-with-deep-learning-and-explaining-predictions-with-lime
But wait…
Is it that simple?
Aren’t machines as smart as we are?
Deep neural networks can be fooled…
Deep Neural Networks are Easily Fooled: High Confidence Predictions for Unrecognizable Images, 2015: https://arxiv.org/abs/1412.1897
Machines do not always “understand” images…
Accelerating innovation and powering new experiences with AI: https://code.facebook.com/posts/310100219388873/accelerating-innovation-and-powering-new-experiences-with-ai/
CS231n: Convolutional Neural Networks for Visual Recognition, Lecture 10: Recurrent Neural Networks: http://cs231n.stanford.edu/slides/winter1516_lecture10.pdf
Introducing GeForce GTX TITAN Z: Ultimate Power, May 2014.
http://www.geforce.com/whats-new/articles/introducing-nvidia-geforce-gtx-titan-z
Chinese Room Experiment
Stanford Encyclopedia of Philosophy – The Chinese Room Argument: https://plato.stanford.edu/entries/chinese-room/
The “Chinese room” argument: http://cse3521.artifice.cc/chinese-room.html
How to implement Machine Learning projects?
Implementing a Machine Learning project
1. Define business value and how it can be measured:
- How much / how many?
- Which category?
- Which groups?
- Is it weird?
- Which action?
2. Prepare data:
- Internal data sources
- External data sources
3. Build models:
- “Black-box” – pure statistical analysis of large amounts of data
- “Soft-box” – heuristic insights from the knowledge of experts
4. Integrate into production systems:
- Adjust business processes
- Redesign existing systems
5. Drive adoption!
Typical Machine Learning scenarios
Customers
– Recommendations
– Customer Churn
– Customer Segmentation
Operations
– Predictive Maintenance
– Anomaly Detection
– Optimization
Security & Risk
– Credit Risk
– Fraud Detection
– Predict Security Threat
HR
– Employee Retention
– Talent Management
– Candidate Evaluation
Suggestions
1. Work on multiple scenarios / hypotheses in parallel:
– Evaluate risk level for every scenario (chance to succeed)
– Pick scenarios to balance overall risk
2. Hypothesis-based approach:
– Define hypothesis
– Implement time-boxed POC (2-4 weeks)
– Analyze results
– Decision point: Go to pilot implementation | Continue with the next hypothesis | Abandon scenario
3. Focus on bringing ML projects into pilot/production:
– Helps to validate the model on real data
– Allows testing the end-to-end scenario to see if the solution can bring business value
– Helps to understand additional requirements
4. Try Deep Learning methods to replace existing ML models:
– Reduces the effort required for pre-processing data (feature engineering)
– Allows adding new data points to the model
Slides: https://1drv.ms/f/s!AsXSX3Q3cMlAjrcTdvNFSpIgHfqAcA