
WiseNet: Performing Highly Accurate Predictions

Through Convolutional Networks for Actual Telecommunication Challenges

Jaime Zaratiegui, Ana Montoro & Federico Castanedo
Data Science Team

IJCAI 2016 Workshop on

Deep Learning for Artificial Intelligence (DLAI)


Outline

1. Introduction
2. Data Sets
3. Data Sources & Representation
4. Data Normalization
5. Network Architecture
6. Experimental Results
7. Generalization & Feature Model Results
8. t-SNE Dimensionality Reduction
9. Summary

Introduction

At Wise Athena we develop predictive models for vertical industries: Telecom & Consumer Packaged Goods (CPGs).

We present the application of ConvNets trained on GPUs to the non-trivial problem of predicting customer churn in prepaid telecom. In this business scenario, we define churn as an inactivity period for balance replenishment events (i.e., 28 days).
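As a concrete illustration of this churn definition, the label can be derived from each user's top-up history. A minimal sketch in Python, assuming a pandas DataFrame with hypothetical user_id and timestamp columns:

```python
import pandas as pd

CHURN_WINDOW_DAYS = 28  # inactivity period used as the churn definition on the slide

def label_churn(topups: pd.DataFrame, observation_end: pd.Timestamp) -> pd.Series:
    """Flag a user as churned if no balance replenishment (top-up) occurred
    in the last CHURN_WINDOW_DAYS before observation_end."""
    last_topup = topups.groupby("user_id")["timestamp"].max()
    days_inactive = (observation_end - last_topup).dt.days
    return (days_inactive >= CHURN_WINDOW_DAYS).rename("churned")

# Toy example: user 1 topped up recently, user 2 has been inactive for ~8 weeks
topups = pd.DataFrame({
    "user_id": [1, 1, 2],
    "timestamp": pd.to_datetime(["2016-01-01", "2016-02-20", "2016-01-05"]),
})
print(label_churn(topups, pd.Timestamp("2016-03-01")))
```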

This is a long-standing problem that has traditionally been approached by training ML classifiers with hand-crafted features (which is expensive in human effort).

Data Sets

Our model learns from structured data commonly found in the telecom industry. From real data (1.2M users) with a 22% base churn ratio, we generated the following sets:

● 130k users (supersampled with different time offsets)
● 102k train
● 18k validation
● 37k test

In order to validate the generalization performance of WiseNet, we prepared a second test set drawn from a different country.
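The slide gives the example counts but not the exact procedure, so the helpers below are only one plausible, hypothetical reading: each user is observed at several shifted time windows ("supersampled with different time offsets") and examples are split into disjoint train/validation/test sets of the stated sizes. Function names and offsets are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def supersample(user_ids, offsets_days=(0, 7, 14)):
    """One example per (user, time offset); the offset values here are illustrative."""
    return [(u, off) for u in user_ids for off in offsets_days]

def split_examples(n_examples, n_train=102_000, n_val=18_000, n_test=37_000):
    """Disjoint split into the train/validation/test sizes quoted on the slide."""
    idx = rng.permutation(n_examples)
    return (idx[:n_train],
            idx[n_train:n_train + n_val],
            idx[n_train + n_val:n_train + n_val + n_test])
```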

Data Sources & Representation

● Mobile Originating Calls (MOC)
● Mobile Terminating Calls (MTC)
● Top-ups (TU)

The x-dimension of each pixel corresponds to a 2-hour period, with a mark at the start of each week.
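The exact image layout (rows per data source, number of weeks per user) is not spelled out on this slide, so the sketch below only illustrates the 2-hour binning idea for a single week of one data source; dimensions and helper names are assumptions.

```python
import numpy as np
import pandas as pd

BIN_HOURS = 2                         # each pixel column covers a 2-hour period
BINS_PER_WEEK = 7 * 24 // BIN_HOURS   # 84 columns per week

def encode_week(timestamps: pd.Series, week_start: pd.Timestamp) -> np.ndarray:
    """Bin one user's events (one source: MOC, MTC or TU) into 84 two-hour slots."""
    hours = (timestamps - week_start).dt.total_seconds() / 3600.0
    cols = (hours // BIN_HOURS).astype(int)
    cols = cols[(cols >= 0) & (cols < BINS_PER_WEEK)]
    return np.bincount(cols, minlength=BINS_PER_WEEK).astype(np.float32)

# Example: three calls during the week starting Monday 2016-01-04
calls = pd.Series(pd.to_datetime(["2016-01-04 09:10", "2016-01-04 09:45",
                                  "2016-01-06 20:00"]))
week_row = encode_week(calls, pd.Timestamp("2016-01-04"))
```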

Data Normalization

[MOC and MTC example panels]

Power law: Intensity = (Fill fraction)^(1/7)

Because most of the variance lies at low fill fractions, the power law expands that range; intensity saturates above a certain cutoff.

Top-ups
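A minimal sketch of the power-law normalization above; the 1/7 exponent comes from the slide, while the cutoff value of 1.0 is an assumption.

```python
import numpy as np

def normalize_intensity(fill_fraction: np.ndarray, cutoff: float = 1.0) -> np.ndarray:
    """Map fill fractions to pixel intensities via intensity = fill_fraction ** (1/7).

    Values above `cutoff` saturate; the 1/7 exponent expands the low-fill-fraction
    region, where most of the variance lives.
    """
    clipped = np.minimum(fill_fraction, cutoff)
    return np.power(clipped / cutoff, 1.0 / 7.0)

# e.g. normalize_intensity(np.array([0.0, 0.01, 0.1, 1.0])) -> [0.00, 0.52, 0.72, 1.00]
```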

Network Architecture

Two convolutional layers followed by three dense layers, trained on GPUs.
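Only the layer counts (and the 1024-unit dense layer mentioned on the t-SNE slide) come from the presentation; filter counts, kernel sizes, input shape and the choice of PyTorch in this sketch are assumptions.

```python
import torch
import torch.nn as nn

class WiseNetSketch(nn.Module):
    """Two convolutional layers followed by three dense layers, ending in a churn probability."""
    def __init__(self, in_channels=3, height=4, width=84):  # assumed input shape
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
        )
        self.classifier = nn.Sequential(
            nn.Linear(64 * height * width, 1024), nn.ReLU(),   # 1024 units, per the t-SNE slide
            nn.Linear(1024, 256), nn.ReLU(),                   # assumed size
            nn.Linear(256, 1),
        )

    def forward(self, x):
        x = self.features(x)
        x = torch.flatten(x, start_dim=1)
        return torch.sigmoid(self.classifier(x)).squeeze(-1)

device = "cuda" if torch.cuda.is_available() else "cpu"  # slides: trained with GPUs
model = WiseNetSketch().to(device)
```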

Experimental Results

WiseNet outperforms all the other ML algorithms studied.

Model          AUC      Log-loss   TP5      Brier
WiseNet        0.8787   0.4274     0.8929   0.1383
xgboost        0.8561   0.4722     0.8908   0.1594
GBM            0.8512   0.4995     0.8750   0.1662
GLM            0.8228   0.6782     0.7592   0.2433
randomForest   0.8169   1.2482     0.8636   0.2018
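For reference, the AUC, log-loss and Brier columns can be computed with scikit-learn as below; TP5 is not defined on this slide, so it is omitted from the sketch.

```python
import numpy as np
from sklearn.metrics import roc_auc_score, log_loss, brier_score_loss

def churn_metrics(y_true: np.ndarray, y_prob: np.ndarray) -> dict:
    """AUC, log-loss and Brier score for predicted churn probabilities."""
    return {
        "AUC": roc_auc_score(y_true, y_prob),
        "Log-loss": log_loss(y_true, y_prob),
        "Brier": brier_score_loss(y_true, y_prob),
    }

# Toy example with made-up predictions
print(churn_metrics(np.array([0, 1, 1, 0, 1]), np.array([0.1, 0.8, 0.6, 0.3, 0.9])))
```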

Experimental Results / Comparison

[Comparison plots: WiseNet vs. xgboost]

Generalization & Feature Model Results

Generalization:

WiseNet      AUC      Log-loss   TP5      Brier
Market-1     0.8787   0.4274     0.8929   0.1383
Market-2     0.8788   0.4449     0.9163   0.1428

Feature Model comparison:

Model           AUC      Log-loss   TP5      Brier
WiseNet         0.8787   0.4274     0.8929   0.1383
Feature Model   0.8552   0.4602     0.7184   0.1528
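The Market-2 row corresponds to scoring the model trained on Market-1 directly on users from another market, with no retraining. A short continuation of the architecture sketch above (it reuses the hypothetical model and device from that sketch; the random tensor stands in for real encoded Market-2 users):

```python
import torch

# Stand-in for encoded Market-2 users: (n_users, channels, height, width), built with
# the same image encoding and normalization pipeline as Market-1.
market2_images = torch.rand(8, 3, 4, 84)

model.eval()                      # weights come from Market-1 training only
with torch.no_grad():
    market2_scores = model(market2_images.to(device))   # one churn probability per user
```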

t-SNE Dimensionality Reduction

Using the states of the 1024-neuron dense layer, we performed a t-SNE projection over a random selection of 26k users.
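A minimal sketch of this step with scikit-learn, assuming the 1024-dimensional dense-layer states have already been extracted into a NumPy array (random data stands in here, and a smaller sample than the 26k users keeps the example fast):

```python
import numpy as np
from sklearn.manifold import TSNE

# Stand-in for the 1024-neuron dense-layer states of a random sample of users
activations = np.random.rand(2_000, 1024).astype(np.float32)

tsne = TSNE(n_components=2, perplexity=30, init="pca", random_state=0)
embedding = tsne.fit_transform(activations)   # 2-D coordinates, one row per user
```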

Summary

● A novel method to encode customer behavior into images that allows using ConvNets on structured data.
● An experimental evaluation with different ML models, showing the capability of WiseNet to learn features.
● A comparison with a production model developed using hand-crafted features, showing the advantage of WiseNet.
● A demonstration of the generalization and transfer-learning properties obtained by applying the same model to a different market (without retraining).

Thanks!!!

www.wiseathena.com