Machine Learning Explained and how to apply lean startup to develop an MVP tool

Post on 25-Jan-2017



Imagine you wish to predict the quality of any banana at will. With Machine Learning this is possible. The first step is to acquire a large sample of bananas, assess their characteristics, and use them to create a large dataset. From this dataset, you determine which features (e.g. colour, size, weight, shape, area grown) are the most important ones for predicting the quality of each banana. This process, called Feature Engineering, provides a set of input variables. Secondly, you may decide that the measures of quality you wish to predict are sweetness, softness, and storage life. These are called output variables. The task of the machine learning algorithm is to predict the output variables based on the input variables.

To develop the machine learning model, we split the dataset into two groups: a Learning set (around 90% containing both input and output variables) and a smaller Validation dataset (around 10% also containing both input and output variables).
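The 90/10 split described above can be sketched in a few lines of Python. This is only an illustration: the function name `split_dataset`, the fixed random seed, and the banana records are assumptions made for the example, not part of the original material.

```python
import random

def split_dataset(rows, validation_fraction=0.1, seed=0):
    """Shuffle the rows and split them into (learning, validation) lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # shuffle so the split is not biased by row order
    n_validation = max(1, round(len(shuffled) * validation_fraction))
    return shuffled[n_validation:], shuffled[:n_validation]

# Hypothetical banana records: (weight in grams, colour score 0-1, sweetness 0-9)
bananas = [(100 + i, i / 50, i % 10) for i in range(50)]

learning_set, validation_set = split_dataset(bananas)
print(len(learning_set), len(validation_set))  # 45 5
```

Every record keeps both its input and output variables in either set; the split only decides which records the algorithm is allowed to learn from.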

Using the larger Learning dataset only, we start to “train” the machine learning algorithm by feeding it both the input and output variables. The algorithm uses internal rules (or parameters) to predict the output based on the input, and adjusts them each time it makes a mistake (predicts the wrong output value). This allows the algorithm to start to experience the data and learn how the input variables impact the output variables. It begins to create its own framework of how it views bananas. This framework models the link between a typical banana's physical characteristics (input), and its quality (output).
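The source does not say which algorithm is used, so the sketch below assumes one of the simplest options: a straight-line model whose two parameters are nudged after every prediction error (stochastic gradient descent). The ripeness scores and sweetness values are hypothetical.

```python
def train_linear_model(examples, epochs=2000, learning_rate=0.01):
    """Fit output ~ w * input + b by adjusting w and b after each prediction error."""
    w, b = 0.0, 0.0  # the model's parameters, before it has seen any data
    for _ in range(epochs):
        for x, y in examples:
            error = (w * x + b) - y         # how far off the current prediction is
            w -= learning_rate * error * x  # adjust the parameters to shrink that error
            b -= learning_rate * error
    return w, b

# Hypothetical learning set: (ripeness score, sweetness), built so that
# sweetness = 2 * ripeness + 1 exactly
learning_set = [(x, 2 * x + 1) for x in [0, 1, 2, 3, 4, 5]]

w, b = train_linear_model(learning_set)
print(round(w, 3), round(b, 3))
```

After enough passes over the learning set, the parameters settle close to the relationship hidden in the data (here a slope near 2 and an offset near 1).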

After training, we must test the model's accuracy. To do this, we use the remaining Validation dataset and hide the answers (output) from the algorithm. This way we can assess the algorithm’s accuracy on data for which we know the answers but the algorithm does not. We ask the model to predict the output and compare its answers with the true ones.
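As a rough illustration of this validation step, the sketch below measures accuracy as the fraction of validation examples the model gets right. The `predict_sweetness` rule is a hypothetical stand-in for a trained model, and the four validation records are invented for the example.

```python
def accuracy(predict, validation_set):
    """Fraction of validation examples whose predicted output matches the true one."""
    correct = 0
    for inputs, true_output in validation_set:
        predicted = predict(inputs)  # the model only ever sees the inputs
        if predicted == true_output:
            correct += 1
    return correct / len(validation_set)

# Hypothetical stand-in for a trained model: a banana is "sweet" if its colour score is high
def predict_sweetness(colour_score):
    return "sweet" if colour_score >= 0.5 else "not sweet"

# Hypothetical validation records: (colour score, true answer)
validation_set = [(0.9, "sweet"), (0.2, "not sweet"), (0.6, "not sweet"), (0.7, "sweet")]

print(accuracy(predict_sweetness, validation_set))  # 0.75
```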

What's more, the algorithm’s prediction accuracy typically improves as more data becomes available; it continues to adjust its parameters and gets better. The machine learns!

Case study on Machine Learning. Let's talk bananas…

Got questions or want to learn more? Contact franki@hivery.com Page 1

STEP

WHAT

Dataset

The CCA/CCSP promotional effectiveness “historical” dataset is received

Data is split into “Training set” (90%) & “Validation set” (10%)

Learnt Model

The model's “parameters” (or demand signals) are adjusted so that it gets progressively better at predicting.

We also apply “feature engineering” to help the model understand the most important “features” of the “promotional effectiveness” data it needs to learn.

Training set

Using the training data set, we show the model the ‘answers’ within the data so it learns

E.g. When we ran a promotion Y, during time Z, the result was X

Validation set

We now test the model using the validation set, but hide the “answers” and ask the model to predict them. We compare the model’s predictions with the hidden answers to determine accuracy

E.g. If we ran a promotion Y, during time Z, what will be the result?

Idea Model

Once the model is predicting with a high degree of accuracy, we are ready for live ‘market’ data
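The promotion example in the steps above can be sketched with a deliberately simple baseline: predict the result of a promotion as the average result seen for the same promotion and period in the training set, then measure the average gap on the validation set. This is an assumed stand-in, not the model actually used, and the promotion names, periods, and uplift figures are hypothetical.

```python
from collections import defaultdict
from statistics import mean

def train(history):
    """Average the observed result for each (promotion, period) pair."""
    results = defaultdict(list)
    for promotion, period, result in history:
        results[(promotion, period)].append(result)
    return {key: mean(values) for key, values in results.items()}

def mean_absolute_error(model, validation_set, default=0.0):
    """Average gap between the predicted result and the hidden true result."""
    errors = [
        abs(model.get((promotion, period), default) - result)
        for promotion, period, result in validation_set
    ]
    return mean(errors)

# Hypothetical records: (promotion Y, period Z, result X as uplift in units sold)
training_set = [
    ("2-for-1", "summer", 120), ("2-for-1", "summer", 100),
    ("10% off", "winter", 40), ("10% off", "winter", 60),
]
validation_set = [("2-for-1", "summer", 115), ("10% off", "winter", 45)]

model = train(training_set)
print(model[("2-for-1", "summer")])                 # predicted result for promotion Y in period Z
print(mean_absolute_error(model, validation_set))   # average error against the hidden answers
```

A smaller error on the validation set is what "predicting with a high degree of accuracy" would look like for a numeric result such as this one.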



Machine Learning, a subset of Artificial Intelligence, is the science of developing self-learning algorithms. The “learning” part of machine learning is an algorithm that optimizes predictive accuracy through “training” and “validation”.

Step by step flow of machine learning

[Flow diagram. Labels: Complete dataset; Split into two datasets to train model; High degree predicting model; Experiment; Deployment]

STEP

MVP

Once the experiment has validated ROI, we proceed to develop an MVP tool (i.e. Idea Model + UX interface) to allow end-users to interact with the model.

Experiment

We apply our freshly-developed model to the real-world data and assess the results/business impact. We continue to refine the parameters.

Product

The product, often called a “Beta Product” because it is the first version, is constantly refined and improved based on user and business (e.g. security) feedback and needs

WHAT

Dataset

Source dataset

Data is split into “Training set” (90%) & “Validation set” (10%)

Learnt Model

The model's “parameters” (or demand signals) are adjusted so that it gets progressively better at predicting.

We also apply “feature engineering” to help the model understand the most important “features” of the “emailing” data it needs to learn.

Training set

Using the training data set, we show the model the ‘answers’ within the data so it learns

E.g. This is what a spam email looks like; this is not a spam email

Validation set

We now test the model using the validation set, but hide the “answers” and ask the model to predict them. We compare the model’s predictions with the hidden answers to determine accuracy

E.g. Is this email spam or not?

Idea Model

Once the model is predicting with a high degree of accuracy, we are ready for live ‘market’ data
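The spam example in the steps above can be sketched end to end with a toy word-count classifier. This is an assumed illustration rather than the method any real spam filter uses: the model simply counts how often each word appears in spam and non-spam training emails, and the emails themselves are invented.

```python
from collections import Counter

def train(labelled_emails):
    """Count how often each word appears in spam and in non-spam emails."""
    counts = {"spam": Counter(), "not spam": Counter()}
    for text, label in labelled_emails:
        counts[label].update(text.lower().split())
    return counts

def predict(counts, text):
    """Label an email by which class has seen its words more often."""
    words = text.lower().split()
    spam_score = sum(counts["spam"][w] for w in words)
    ham_score = sum(counts["not spam"][w] for w in words)
    return "spam" if spam_score > ham_score else "not spam"

# Training set: the model is shown the answers
training_set = [
    ("win a free prize now", "spam"),
    ("free money click now", "spam"),
    ("meeting agenda for monday", "not spam"),
    ("lunch on monday with the team", "not spam"),
]
# Validation set: the answers are hidden from the model and used only for scoring
validation_set = [
    ("claim your free prize", "spam"),
    ("agenda for the team meeting", "not spam"),
]

model = train(training_set)
correct = sum(predict(model, text) == label for text, label in validation_set)
print(f"{correct} of {len(validation_set)} validation emails classified correctly")
```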

HIVERY

Using HIVERY’s Discovery, Experiment and Deployment methodology, a product development cycle is added once the model has been validated (step 5): we experiment (test the model) and build an MVP that allows business users ongoing use of the model through a simple yet powerful interface.

From Machine Learning to custom Product Solutions

Discovery

