Download - QCon London · r bingo ! I 2 1 1 0 1 1 0 0 1 1 •SVD •Word2vec word f1 f2 f3 f4 f5 dog 0.8 0.2 0.1-0.1-0.4 this -0.2 0 0.1 0.8 0.3 love 0.1 0.2 0.7-0.50.1 cute 0.2 0.2 0.8-0.4

QCon London H2O Driverless AI

An AI that creates AI!

Marios Michailidis

Background• Competitive data scientist at H2O.ai• PhD in ensemble methods at UCL• Former kaggle #1 – over 120+ competitions

Features Target

Data Quality andTransformation

ModelingTable

ModelBuilding

ModelData Integration

+

Challenges in the Machine Learning Workflow

Weeks or even Months per Model Optimization

Highly Iterative Process

• Insight – Visualization• Cross Validation• Feature Engineering • Model Selection• Hyper Parameter Optimization• Feature Selection• Ensemble• Understanding/Interpreting the results• Deploy/Productionize

Confidential4

Visualizations

outlier detection Correlation graph Heat map

Confidential5

AnimalDogDogCatFishCatDogFishCatCatFish

L.Encoding2213123113

Frequency3343433443

Is_Dog? Is_Cat? Is_Fish?1 0 01 0 00 1 00 0 10 1 01 0 00 0 10 1 00 1 00 0 1

Cost10015030050250752540022540

Animal Avg.costCat 293.75Dog 108.333Fish 38.3333

108.33108.33293.7538.33293.75108.3338.33293.75293.7538.33

Feature engineering: Categorical features

Confidential6

Age404545454951566162717374777878

Age40-5152-5657-7778+

Age40454545

missing51566162717374777878

Age4045454561.1451566162717374777878

Age40-5152-5657-7778+

missing

Age40-5140-5140-5140-5140-5140-5152-5657-7757-7757-7757-7757-7757-7778+78+

Age40-5140-5140-5140-51missing40-5152-5657-7757-7757-7757-7757-7757-7778+78+

Mean=61.14

Feature engineering: Numerical features

Age

Income

Confidential7

Ageincom

e40 4045 1445 10045 3349 2051 4456 6561 8062 5571 1573 1874 3377 3278 4878 41

* +1600 80630 594500 1451485 78980 692244 953640 1214880 1413410 1171065 861314 912442 1072464 1093744 1263198 119

Animal colorDog brownDog whiteCat blackFish goldCat mixedDog blackFish whiteCat whiteCat blackFish brown

Animal_colorDog_brownDog_whiteCat_blackFish_goldCat_mixedDog_blackFish_whiteCat_whiteCat_blackFish_brown

Age Animal13 Dog12 Dog15 Cat3 Fish16 Cat10 Dog6 Fish20 Cat13 Cat2 Fish

derived

11.6611.66163.661611.663.6616163.66

Animal AgeDog 11.66Cat 16Fish 3.66

Feature engineering: Interactions

Confidential8

This dog is cute! I love dogs

dog this is jedi love cuteempero

r bingo ! I2 1 1 0 1 1 0 0 1 1

• SVD• Word2vec

word f1 f2 f3 f4 f5dog 0.8 0.2 0.1 -0.1 -0.4this -0.2 0 0.1 0.8 0.3love 0.1 0.2 0.7 -0.5 0.1cute 0.2 0.2 0.8 -0.4 0

King – Man =Queen

Feature engineering: Text

Confidential9

Date1/1/20182/1/20183/1/20184/1/20185/1/20186/1/20187/1/20188/1/20189/1/201810/1/2018

Day Month Yearweekda

y weeknum1 1 2018 2 12 1 2018 3 13 1 2018 4 14 1 2018 5 15 1 2018 6 16 1 2018 7 17 1 2018 1 28 1 2018 2 29 1 2018 3 210 1 2018 4 2

Date Sales1/1/2018 1002/1/2018 1503/1/2018 1604/1/2018 2005/1/2018 2106/1/2018 1507/1/2018 1608/1/2018 1209/1/2018 8010/1/2018 70

Lag1 Lag2- -100 -150 100160 150200 160210 200150 210160 150120 16080 120

MA2-100125155180205180155140100

Feature engineering: Time series

Confidential10

Common open source packages used in ML

LIBLINEARLIBFFMLIBSVM

Confidential11

• Optimizing maximum depth• Tree expansion (loss vs depth)• Learning rate• Number of trees

1

2

3 3

2

Hyper parameter tuning

Copyright 2018 H2O.ai Inc. All Rights Reserved.

Validation approach

Train dataPast Future

x0 x1 x2 x3 y0.94 0.27 0.80 0.34 10.02 0.22 0.17 0.84 00.83 0.11 0.23 0.42 10.74 0.26 0.03 0.41 00.08 0.29 0.76 0.37 00.71 0.76 0.43 0.95 10.08 0.72 0.97 0.04 00.84 0.79 0.89 0.05 1

0.94 0.27 0.80 0.34 10.02 0.22 0.17 0.84 00.83 0.11 0.23 0.42 10.74 0.26 0.03 0.41 00.08 0.29 0.76 0.37 00.71 0.76 0.43 0.95 10.08 0.72 0.97 0.04 00.84 0.79 0.89 0.05 1

K=4

pred0.000.000.000.000.000.000.000.00

Fold : 1

pred0.960.030.000.000.000.000.000.00

Train

Predict pred0.960.030.900.120.000.000.000.00

Fold : 2Fold : 3

pred0.960.030.900.120.030.770.000.00

Fold : 4

pred0.960.030.900.120.030.770.180.91

Confidential13

Genetic algorithm approach

x1 x2 x3 x4 y0.14 0.69 0.01 0.71 10.22 0.44 0.45 0.69 10.12 0.35 0.51 0.23 00.22 0.42 0.79 0.60 10.93 0.82 0.72 0.50 10.32 0.58 0.28 0.22 00.95 0.59 0.68 0.09 10.34 0.58 0.35 0.81 00.05 0.80 0.28 0.86 10.23 0.49 0.63 0.03 00.05 0.34 0.53 0.73 10.74 0.02 0.33 0.56 0

Iteration1/10

X% accuracyFeature Importancex4 1x2 0.5x3 0.2x1 0.01

Tuned

Iteration2/10

+ Random Transformation

sqrt(x2)

Z% accuracyFeature Importancex2+x4 1x4 0.4

sqrt(x2) 0.3x3 0.2x2 0.05

Confidential14

Feature ranking based on permutations

0.5 0.52 0.69 0.2 10.18 0.94 0.57 0.68 00.67 0.6 0.76 0.91 10.25 0.68 0.57 0.07 10.83 0.07 0.76 0.56 00.49 0.92 0.64 0.02 0

0.17 0.7 0.57 10.96 0.75 0.59 10.96 0.14 0.09 00.61 0.38 0.07 00.07 0.13 0.87 10.66 0.27 0.48 1

0.410.560.210.610.260.02

0.020.560.210.410.610.26

Train

Validation

fit

score

80% accuracy 70% accuracy (drop)

Shuffle The 10% difference is how important that feature is

Bring back to normalRepeat for all features

Confidential15

StackingX0 x1 x2 xn y0.17 0.25 0.93 0.79 10.35 0.61 0.93 0.57 00.44 0.59 0.56 0.46 00.37 0.43 0.74 0.28 10.96 0.07 0.57 0.01 1

AX0 x1 x2 xn y0.89 0.72 0.50 0.66 00.58 0.71 0.92 0.27 10.10 0.35 0.27 0.37 00.47 0.68 0.30 0.98 00.39 0.53 0.59 0.18 1

BX0 x1 x2 xn y0.29 0.77 0.05 0.09 ?0.38 0.66 0.42 0.91 ?0.72 0.66 0.92 0.11 ?0.70 0.37 0.91 0.17 ?0.59 0.98 0.93 0.65 ?

C

pred0 pred1 pred2 y0.24 0.72 0.70 00.95 0.25 0.22 10.64 0.80 0.96 00.89 0.58 0.52 00.11 0.20 0.93 1

B1pred0 pred1 pred2 y0.50 0.50 0.39 ?0.62 0.59 0.46 ?0.22 0.31 0.54 ?0.90 0.47 0.09 ?0.20 0.09 0.61 ?

C1

Train algorithm 0 on A and make predictions for B and C and save to B1, C1 Train algorithm 1 on A and make predictions for B and C and save to B1, C1 Train algorithm 2 on A and make predictions for B and C and save to B1, C1

Train algorithm 3 on B1 and make predictions for C1

Preds30.450.230.990.340.05

Consider datasets A,B,C. Target variable (y) is known for A,B

Confidential16

“Theabilitytoexplainortopresentinunderstandabletermstoahuman.”– Finale Doshi-Velez and Been Kim. “Towards a rigorous science of interpretable

machine learning.” arXiv preprint. 2017. https://arxiv.org/pdf/1702.08608.pdf

FAT*: https://www.fatml.org/resources/principles-for-accountable-algorithms

XAI: https://www.darpa.mil/program/explainable-artificial-intelligence

Machine Learning Interpretability?

https://arxiv.org/pdf/1702.08608.pdf

https://www.fatml.org/resources/principles-for-accountable-algorithms

https://www.darpa.mil/program/explainable-artificial-intelligence

Confidential17

Interpretability and accuracy trade-off

Linear Models

Exact explanations forapproximate models.

Machine Learning

Approximate explanations for exact models.

Confidential18

DecisionTreeSurrogateModelsPartialDependenceandIndividualConditionalExpectation

https://github.com/jphall663/interpretable_machine_learning_with_python

Machine Learning Interpretability examples

https://github.com/jphall663/interpretable_machine_learning_with_python

H2O.ai Product Suite

Automatic feature engineering,machine learning and interpretability

• 100% open source – Apache V2 licensed• Built for data scientists – interface using R, Python on H2O Flow

(interactive notebook interface)• Enterprise support subscriptions

• Enterprise software• Built for domain users, analysts and

data scientists – GUI-based interface for end-to-end data science

• Fully automated machine learning from ingest to deployment

H2O AI open source engine integration with Spark

Lightning fast machine learning on GPUs

In-memory, distributed machine learning algorithms

with H2O Flow GUI

Open Source

Confidential20

H2O is the open source leader in AI.

Democratize AI for Everyone.

AI for good.

Confidential21

Will there be a default?

AUC, Accuracy, Precision…Time, hardware

Insight-visualization

Feature Engineering

Predictions

Model Interpretability

• Some input data• A target variable• An objective (or a success metric)• Some allocated resources

H2O Driverless AI process

x1 x2 x3 x4 y0.14 0.69 0.01 0.71 10.22 0.44 0.45 0.69 10.12 0.35 0.51 0.23 00.22 0.42 0.79 0.60 10.93 0.82 0.72 0.50 10.32 0.58 0.28 0.22 00.95 0.59 0.68 0.09 10.34 0.58 0.35 0.81 00.05 0.80 0.28 0.86 10.23 0.49 0.63 0.03 00.05 0.34 0.53 0.73 10.74 0.02 0.33 0.56 0

Confidential22

Scoring Pipeline• Python

• MOJO (Java)

Confidential23

Demo

Confidential24

Thank you

[email protected]/mariosmichailidis/

@StackNet_https://www.facebook.com/StackNet/

mailto:[email protected]

https://www.facebook.com/StackNet/