ADVANCED ANALYTICS, THE MODERN MARKETER’S BEST FRIEND Model-Based Segmentation for Better Marketing WHITE PAPER www.dataiku.com


©2016 Dataiku, Inc. | www.dataiku.com | [email protected] | @dataiku



SUMMARY

INTRODUCTION

TRADITIONAL SEGMENTATION: HOW? WHAT? WHY?

How is Traditional Marketing Segmentation Done Today?

Drawbacks of Traditional Segmentation: Fixed Categories Means Limited Scope

MODEL-BASED SEGMENTATION: HOW? WHAT? WHY?

User Data: A Plethora Of Data Points To Paint a Complete Picture

Sonra Testimonial: «Going Beyond Google Analytics»

Machine Learning: Let’s Make Some Predictions

Capgemini Consulting Testimonial: «A Story of Segmentation»

LET’S GET TECHNICAL: Algorithms are your Friends

LET’S GET TECHNICAL: Ready, Set... Target!

Saegus Testimonials: «5 Guidelines for Data-driven Marketing»

MODEL-BASED SEGMENTATION IN THE REAL WORLD

How Can the IoT Industry Ensure Subscriber Retention and Loyalty?

How Can an Online Retailer Reduce its Churn?

How Can Segmentation Allow for Targeted Recommendation in the Travel Industry?

How Can the Banking Industry Improve its Customers’ Segmentation?

LET’S GET TECHNICAL: Churn, a Story About Love, or Lack Thereof

WebbMason Testimonial: «Churn Case Study»

CONCLUSION

ABOUT DATAIKU


INTRODUCTION

With the rise of the digital ecosystem, marketing analytics seems to have a bright future. The intersection of marketing and analytics has enabled teams to adopt a more customer-centric approach. Examples include using specific offers to retain existing customers, delivering highly targeted offers, serving targeted content to prospects, and using payment network partnerships to facilitate the delivery of time- and location-sensitive offers.

Realizing all of these goals hinges on customer knowledge. Without inputs on who customers are and how they behave, organizations have no insight on how to leverage them. This is, after all, the Age of the Customer, where consumers are the driving force behind business decisions. Customers no longer blindly accept what’s offered to them — self-education now precedes purchasing decisions. This has forced marketers to re-think how they reach potential customers at all phases of the buyer journey.

Knowing customers is not a new idea, but the concept has evolved in our modern data-driven environment. Customer insights are now the province of Big Data, where consumer behavior, actions, and trends lie hidden in vast quantities of heterogeneous data. Knowing and segmenting your customers is truly a data problem: how do you get marketers to drive their campaigns based on data rather than on gut feeling?

In this whitepaper, we will discuss how advanced analytics have the potential to transform the ways in which segmentation for marketing purposes is accomplished. We’ll start with a look at traditional segmentation methods and then move on to exploring how advanced analytics (model-based segmentation) can change the game. Then we’ll explore a few marketing & analytics use cases in various industries. Lastly, we’ll examine the methodologies needed to implement model-based segmentation in the real world.

Advanced data analytics is a game-changer for marketing organizations. A McKinsey DataMatic study showed that firms in the top quartile of analytics performance were 20 times better at attracting new customers and more than 5 times better at retaining existing customers.¹


TRADITIONAL SEGMENTATION: HOW? WHAT? WHY?

A- How is Traditional Marketing Segmentation Done Today?

B- Drawbacks of Traditional Segmentation: Fixed Categories Means Limited Scope


A - HOW IS TRADITIONAL MARKETING SEGMENTATION DONE TODAY?

Segmentation is a marketing process that consists of dividing a broad target market into categories (consumers, businesses, etc.) that typically would have common interests, traits, or needs. Marketers segment user or customer databases to link a product and/or service to a specific target with the right messaging and / or dedicated campaigns. Typically, traditional segmentation falls into four categories: geographic segmentation; behavioral segmentation; demographic segmentation; psychographic segmentation.

Ex: What do we know about customer Andrew Smith, and how can a marketing department use this information to increase the probability of purchase or interaction?

FIRST AND LAST NAME: Andrew Smith

GEOGRAPHIC: From Andrew’s IP address (70.197.3.33), we can infer that Andrew lives in Palo Alto, California, ZIP code 94306.

DEMOGRAPHIC: Based on Andrew’s date of birth and self-declared origin and gender, we know that, as of today, Andrew is a 29-year-old Caucasian male.

PSYCHOGRAPHIC: Andrew is single and enjoys golf on weekends.

BEHAVIORAL: Andrew has a high click rate on emails he receives on his personal account.

Based on this information, a marketing department could send Andrew an email on his personal account advertising a special golf bag sale in Palo Alto. This tactic for targeting Andrew is based on information from traditional segmentation.

1- Geographic Segmentation

Geographic segmentation is the process of categorizing customer groups based on physical location. These groups can be increasingly refined from large geographic areas (country, region) to narrower geographic areas (states, neighborhoods). To do this, most companies track IP addresses or rely on self-reported geographic location of customers.

2- Behavioral Segmentation

Behavioral segmentation divides markets into groups based on the responses, attitudes, knowledge, and use of specific products or services. This approach factors in the actual product or service in an attempt to understand and anticipate customer decision-making. This segmentation takes into account how often the product is used (usage rate or frequency) and the usage situation (daily use, holiday use, etc.). To do this, most companies track consumers’ online behavior using “cookies” (i.e., small files automatically stored on a user’s computer that send information about the user back to the cookie’s owner).

3- Demographic Segmentation

Demographic segmentation creates sub-groups based on age, gender, family size, income, education, religion, race, nationality, and so on. The information that marketing teams use for it is usually self-declared by customers when signing up for a product, or inferred from user location.

4- Psychographic Segmentation

Psychographic segmentation is based on different personality traits, interests, lifestyles, values, and attitudes. This segmentation essentially uses human psychology as a filter mechanism for dividing customers into more refined segments. Just like demographic segmentation, companies mostly infer these details from consumer self-reported statements during sign-up.


B - DRAWBACKS OF TRADITIONAL SEGMENTATION: FIXED CATEGORIES MEANS LIMITED SCOPE

The effectiveness of traditional segmentation is limited by its reliance on fixed methodologies. With traditional segmentation, customer knowledge is molded upon rigid, and often outdated, categories, an approach that often leads marketers astray when attempting to optimize campaigns to specific targets. A few key limitations include:

Sub-category Branching: An Incomplete Picture

Segmentation methodologies usually have a hierarchical structure with sub-categories branching downwards. For example, let’s segment a fictional database based on gender, age, and purchase:

- ¾ of a customer database is female.
- Out of this group of females, 75% are between the ages of 45 and 60.
- Out of this sub-group of females between the ages of 45 and 60, 2% have bought pink t-shirts in the past 12 months.

But, out of these 2%, some have also bought a blue t-shirt in the past 12 months. With this top-down approach, we’ve ignored a factor (whether the customer bought blue t-shirts, pink t-shirts, or both) that may have an important impact on how the customer reacts to a specific campaign.
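To make the branching concrete, here is a minimal sketch of the arithmetic behind the example: each level of the hierarchy multiplies the previous share, so the final segment is a very small slice of the database. The database size of 100,000 customers is a hypothetical figure used only for illustration; it does not appear in the example above.

```python
from fractions import Fraction

# Shares taken from the fictional example in the text.
female_share = Fraction(3, 4)        # 3/4 of the database is female
aged_45_60_share = Fraction(75, 100) # 75% of those are aged 45 to 60
pink_buyer_share = Fraction(2, 100)  # 2% of those bought pink t-shirts

# Each branch narrows the previous one, so the shares multiply.
segment_share = female_share * aged_45_60_share * pink_buyer_share

# Hypothetical database size, assumed for illustration only.
database_size = 100_000
segment_size = int(segment_share * database_size)

print(f"Segment share of database: {float(segment_share):.3%}")
print(f"Customers in segment: {segment_size}")
```

Three levels down, the segment covers only about 1.1% of the database, and any attribute that was not chosen as a branch (such as blue t-shirt purchases) is invisible to the campaign.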

Lack of Behavioral Scope: Poor Data Means Poor Insight

Traditional segmentation often relies on self-reported consumer data that is gathered in a limited, prescribed format. This approach makes it difficult to understand and group customers on a deeper level, possibly leaving important data obscured.

This lack of scope results in little true data about the behavioral tendencies and affinities within a consumer group. Plus, this approach leaves no room to learn more from the customer as his or her relationship with the brand / product / service evolves over time.

Small Sample Sizes

Data used for traditional segmentation methodologies typically involves surveys, focus groups, and sales data. The problem with this approach is size. Indeed, data from such sources is limited in size when compared to the potential of Big Data (i.e. data that is automatically retrieved from online behavior, GPS data, etc).

Though Big Data may seem to contain more useless information than actionable insights, its value comes not from tracking any one individual but from the trends discovered by analyzing this plethora of data together as one complete picture.


Smaller sample sizes often return results that do not reflect the finer nuances of customer behavior; any trends discovered in small samples, if there are any to discover, are less meaningful.

Siloed Data Sources

Traditional customer segmentation lacks the flexibility to gather and evaluate data from multiple sources, such as an organization’s CRM, e-mail, and social media data. Indeed, in this day and age, customers no longer interact with only one single source or channel. Therefore, mixing and merging these different sources is essential to create a global picture of who your brand is interacting with and how these individuals are interacting with your brand.

Data Lifespan... And its Limits

These traditional segments use fixed data collection rules, such as “What is the customer’s gender?,” “What is the customer’s income?,” and “Has the customer replied to the last e-mail campaign?” Granted, this is better than nothing. But in today’s data-centric world you need access to data that may change quickly.

Segmentation-derived data may be updated only on an annual basis, which is far too restrictive when trying to understand the complexities of your customer base. A customer may have bought a blue t-shirt in January 2015 but begun to buy pink t-shirts in June 2016. If my segmentation is based on old data (i.e., the blue t-shirt she bought in January 2015), I’ll continue pushing “blue t-shirt” campaigns her way and therefore miss out on selling her the pink t-shirts or related products.

Marketers relying on traditional segmentation typically limit the effective lifespan of data because they are unable to act upon data after performing segmentation. Traditional methodology cannot leverage dynamic data, unlike analytics which is capable of running and re-running models repeatedly in order to explore different permutations for marketing purposes.
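The blue and pink t-shirt example can be sketched in a few lines. This is only an illustration, assuming a hypothetical purchase history and a hypothetical helper named `preferred_item`; the exact days and the 180-day window are invented, since the text gives only months.

```python
from collections import Counter
from datetime import date, timedelta

# Hypothetical purchase history; days of the month are invented for illustration.
purchases = [
    (date(2015, 1, 15), "blue t-shirt"),
    (date(2016, 6, 2), "pink t-shirt"),
    (date(2016, 6, 18), "pink t-shirt"),
]

def preferred_item(history, as_of, window_days=None):
    """Return the most frequently bought item up to `as_of`.

    If `window_days` is given, only purchases from that many days
    before `as_of` are counted, so older behavior ages out.
    """
    start = as_of - timedelta(days=window_days) if window_days is not None else date.min
    recent = [item for day, item in history if start <= day <= as_of]
    if not recent:
        return None
    return Counter(recent).most_common(1)[0][0]

# Segment frozen at the end of 2015 versus one refreshed in July 2016.
stale_segment = preferred_item(purchases, as_of=date(2015, 12, 31))
fresh_segment = preferred_item(purchases, as_of=date(2016, 7, 1), window_days=180)

print("Stale segment targets:", stale_segment)
print("Refreshed segment targets:", fresh_segment)
```

The frozen segment keeps pushing blue t-shirts, while the refreshed one picks up the switch to pink. Re-running the same function on new data is the simplest form of the "run and re-run" idea described above.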

Figure: Firms Use Only a Fraction of Their Enterprise Data. Survey question: « Please estimate the percentage of the total volume of your company data that you use for business intelligence. »

Source: Forrester’s Global Business Technographics® Data and Analytics Survey, 2015


OUT WITH THE OLD

The old adage, “Time and tide wait for no man,” can be applied to technology as well.

Our ability to harness and use technology is often outpaced by the technology itself, which, more often than not, renders our modern inventions useless before they can even become popular. Traditional marketing segmentation is a relic of the past, yet many marketers cling to methods that are familiar.

The Age of the Customer, however, involves large datasets capable of painting highly detailed customer profiles. It is an age that has left traditional segmentation in its wake and has its eyes on a new approach: model-based segmentation.


MODEL-BASED SEGMENTATION: HOW? WHAT? WHY?

A- User Data: A Plethora Of Data Points To Paint a Complete Picture

B- Machine Learning: Let’s Make Some Predictions

☑ LET’S GET TECHNICAL: Algorithms are your Friends

☑ LET’S GET TECHNICAL: Ready, Set… Target!


Basically, model-based segmentation is the art of creating dynamic segments of users or customers based on interactions between a huge diversity of data points. The data world is entering a new age: both the quantity and the quality of customer data are increasing, and the data itself is growing in variety, complexity, velocity, and interdependence. Data has never been more complex. Fifteen years ago, no one would have dreamed of keeping track of customer web logs or tracking a buyer’s online purchasing frequency to optimize marketing campaigns. These new sources of highly specific information provide a more detailed and ever-evolving opportunity to intelligently and dynamically segment customer databases. And guess what: marketing teams themselves can now do it!

In this section we will have a closer look at the “What?” and “How?” behind model-based segmentation:

What Kind of Data is Now Available?

Broadly speaking, there are three types of data (transaction, interaction, and external) that, when combined, provide a holistic view of “User Data.” Understanding user data, and its representative elements, will help us define not only where user information is coming from and how its complexity has increased over the last decade, but also how it can be used to dynamically segment user databases.

How is User Data Analyzed?

Making sense of user data and predicting future trends is made possible by what is commonly referred to as advanced analytics. Advanced analytics usually includes machine learning, a subfield of computer science that gives computers the ability to detect patterns in data and learn from them. With this information, computers can compute predictions that reflect future trends. Compared with traditional segmentation, advanced analytics empowers companies to explore data on a much more granular and dynamic level. Advanced modeling, testing, and visualization techniques combine to provide detailed predictive insights into customer behavior.
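As a rough illustration of what "detecting patterns in data" can mean for segmentation, the sketch below groups customers with k-means clustering, one common unsupervised technique (not necessarily the one the authors have in mind). The customer figures (store visits per month, average basket in dollars) and the function name `kmeans` are assumptions made up for this example, and the initial centroids are simply the first k points so that the result is reproducible.

```python
import math

def kmeans(points, k, iterations=20):
    """Very small k-means: returns (centroids, labels).

    Initial centroids are the first k points (a simplification;
    real implementations usually pick them at random).
    """
    centroids = [tuple(p) for p in points[:k]]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for i, members in enumerate(clusters):
            if members:
                new_centroids.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centroids.append(centroids[i])
        if new_centroids == centroids:
            break  # assignments are stable
        centroids = new_centroids
    labels = [min(range(k), key=lambda i: math.dist(p, centroids[i])) for p in points]
    return centroids, labels

# Hypothetical customers: (store visits per month, average basket in USD).
customers = [(1, 20), (12, 90), (2, 25), (11, 100), (1, 30), (13, 95)]
centroids, labels = kmeans(customers, k=2)

for customer, label in zip(customers, labels):
    print(customer, "-> segment", label)
```

No one told the algorithm that there are occasional low spenders and frequent high spenders; the two segments emerge from the data. In practice the features would be scaled first, and re-running the clustering on refreshed data would let the segments move over time.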

To illustrate the advantages brought about by model-based segmentation for marketing purposes, we will use a fictional company called “Fashion Clothing Store.”

Fashion Clothing Store is a clothing retailer founded by sisters Marie and Marge in 1983. By 1995, Fashion Clothing Store had over 150 boutiques throughout the United States. In 1999, the sisters created Fashionclothingworld.com which enabled customers to buy the clothing from their brand online. In 2014, Fashion Clothing Store hired their first data scientist to help them make sense of and get value from their online and offline accumulated data.


A - USER DATA: A PLETHORA OF DATA POINTS TO PAINT A COMPLETE PICTURE

1- Transaction Data: A Peek Into Your Customer’s Buying Habits

Transactional data is one of the oldest data types and reflects a wide variety of customer-centric data, such as time, location, price, payment methods, discount values, quantity purchased, etc. All of this data can be combined to convey a precise picture of customer shopping habits and interests.

In the past, Fashion Clothing Store experienced a disconnect between online credit card transactions and real-world in-store transactions. There was no way to follow a customer’s transactional data, particularly if they made purchases in different retail locations. Implementing advanced analytics into their strategy, however, enabled Fashion Clothing Store to follow their customers online and link them to real-world behaviors. Website logs and in-store payment data can be linked and analyzed, enabling Fashion Clothing Store to realize a more complete view of their customers’ transactions.

2- Interaction Data: From Snapshot to Screenplay

In the 1980s, consumer-to-business communication channels were limited to a handful of options, such as customers’ written feedback, complaint letters, etc. Today, the number and variety of consumer channels have grown significantly. The digital era now enables companies to follow customers and prospects on all channels, whether it’s website interactions, social media, email, phone conversations, or text messages. An almost one-to-one relationship can be created and, when combined with other points of interaction, a global customer view can be formulated.

Keeping a close eye on social networks, tracking weblogs, and recording online surveys, Fashion Clothing Store marketing teams can now infer rich information about their customers, such as:

• How do customers feel about the brand?
• How do they usually express their opinions? Via what channels?
• Where do users click on the website the most?
• How much time do they spend on the website before actually making a purchase?
• Are customer satisfaction rates the same in-store and online?

3- External Data: From Tunnel Vision to Global Perspective

External data is defined as all data outside of an organization’s internal operating systems. Historically, this type of data was hampered by the traditional segmentation approach: external data was limited and, when available, only data that fit within the confines of segmentation rules was considered (e.g., average age group, interests filtered by location, etc.). The overall approach was not broad and the results were limited. Thanks to advances in analytical processing, coupled with expanded data availability (e.g., Open Data Initiatives), organizations can now tap into a variety of dimensional data in order to add layers of meaning to customer behavior. For example, geographic and socio-demographic datasets can be used to provide deep customer insights: how will traffic congestion in a specific area affect retail outlet visits? How will the weather affect foot traffic for outdoor locations?

Fashion Clothing Store can now access a mine of useful information from publicly available sources. Geographic, demographic, and sociological data can be woven together, as needed, to paint a picture of expected customer behavior. With this plethora of rich information at hand, Fashion Clothing Store can now gain insight and make better business decisions by answering questions such as:

• Where should they open the newest locations of their physical stores?
• How can they optimize staffing based on customer exposure during specific times?
• How can marketing campaigns reflect current social media trends?
• What fashions should be advertised in what cities depending on sociodemographics and market trends?

Note that the answers to these questions become further data points, thus continuously expanding the possibilities for database segmentation.
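The linking of website logs and in-store payment data described for Fashion Clothing Store can be sketched as a simple join on a shared customer identifier. This is a minimal illustration under a strong assumption: that both sources already carry the same `customer_id` (for instance through a loyalty card or an e-mail address). The records, field names, and the helper `unified_view` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical online orders extracted from website logs.
web_orders = [
    {"customer_id": "C001", "channel": "online", "amount": 40.0},
    {"customer_id": "C002", "channel": "online", "amount": 75.0},
]

# Hypothetical in-store payments from the point-of-sale system.
store_payments = [
    {"customer_id": "C001", "channel": "in_store", "amount": 120.0},
    {"customer_id": "C001", "channel": "in_store", "amount": 130.0},
]

def unified_view(*sources):
    """Total spend per customer and per channel across all sources."""
    totals = defaultdict(lambda: defaultdict(float))
    for source in sources:
        for row in source:
            totals[row["customer_id"]][row["channel"]] += row["amount"]
    # Convert nested defaultdicts to plain dicts for easier inspection.
    return {customer: dict(channels) for customer, channels in totals.items()}

view = unified_view(web_orders, store_payments)
for customer, channels in sorted(view.items()):
    print(customer, channels)
```

Once both channels sit in one record per customer, a customer who browses online but buys in-store is no longer counted as two unrelated people.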


CREATING A GLOBAL CUSTOMER IMAGE

These three data types, when combined, create a global image of your customer. So what does that mean? It means that companies now have the capability to truly understand their customers: how do they pay for goods? What do they like on social media? What is the traffic like when they typically visit your store? Model-based segmentation breaks the mold of older techniques by collecting data directly from the customer instead of relying on marketers to frame data questions ahead of time. The verdict is in: traditional segmentation is antiquated, and the new black is granular model-based segmentation made possible by implementing advanced analytics methodologies and technologies.

User Data: the aggregation of transaction, interaction, and external data

MEANWHILE, AT FASHION CLOTHING STORE

Interaction Data: Margaret is most active on Twitter between 2 and 4pm. Her IP address corresponds to the neighborhood of the home address she self-declared when entering her delivery information. Margaret opens all Fashion Clothing Store emails she receives but only clicks on ads for children’s clothing. Though Margaret clicks on the ads, opens the emails, and spends an average of 14 minutes on the website every two weeks, she rarely makes any purchases on fashionclothingworld.com.

Transaction Data: Margaret spends an average of $250/month on in-store purchases at Fashion Clothing Store. She mostly buys children’s clothing, ages 5-8. All of Margaret’s in-store purchases from Fashion Clothing Store happen in the boutique on 286 Lexington Street in Miami on weekdays between 9 and 11am.

External Data: 286 Lexington Street is located in a very wealthy neighborhood in Miami. The neighborhood’s population is predominantly upper-class and white. The weather in this area of Miami is usually very warm, with approximately 12 days of rain per year and average temperatures of 85 degrees Fahrenheit. In the next week, chances of rain are at an unusual 90%.

By merging these 3 types of data, here is an example of what Fashion Clothing Store can infer about Margaret:

- Margaret most likely has children between the ages of 5 and 8;
- Margaret spends a significant amount of money on her children’s clothing;
- Margaret is upper-middle class compared to the average Miami population;
- Margaret is probably a stay-at-home mom (based on her shopping and social media activity hours).

Fashion Clothing Store can use these inferences to facilitate their one-to-one relationship with Margaret. For example, Fashion Clothing Store could make sure that in the next week, Margaret sees an ad for an in-store sale on children’s raincoats in preparation for the upcoming storm.
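The reasoning about Margaret can be written down as a handful of explicit rules over a merged record. The sketch below is purely illustrative: the record layout, the function names `infer_profile` and `next_ad`, and the thresholds ($200 per month, 50% chance of rain) are assumptions, and in a model-based approach such rules would be learned from data rather than hand-written.

```python
# Hypothetical merged record, built from the figures quoted in the text.
margaret = {
    "transaction": {
        "monthly_in_store_spend_usd": 250,
        "main_category": "children's clothing",
        "child_age_range": (5, 8),
        "shopping_hours": (9, 11),        # weekdays, in-store
    },
    "interaction": {
        "twitter_active_hours": (14, 16),  # 2pm to 4pm
        "clicks_only_on": "children's clothing",
    },
    "external": {
        "rain_probability_next_week": 0.90,
    },
}

def infer_profile(user, spend_threshold_usd=200):
    """Apply simple hand-written rules; thresholds are illustrative assumptions."""
    tx, ix = user["transaction"], user["interaction"]
    inferences = []
    if tx["main_category"] == "children's clothing":
        low, high = tx["child_age_range"]
        inferences.append(f"likely has children aged {low} to {high}")
    if tx["monthly_in_store_spend_usd"] >= spend_threshold_usd:
        inferences.append("spends significantly on children's clothing")
    # Both shopping and social media activity fall within 9am-5pm.
    windows = (tx["shopping_hours"], ix["twitter_active_hours"])
    if all(9 <= start and end <= 17 for start, end in windows):
        inferences.append("probably at home during working hours")
    return inferences

def next_ad(user, rain_threshold=0.5):
    """Pick an ad by combining external (weather) and interaction data."""
    rainy = user["external"]["rain_probability_next_week"] >= rain_threshold
    kids = user["interaction"]["clicks_only_on"] == "children's clothing"
    if rainy and kids:
        return "in-store sale on children's raincoats"
    return "generic newsletter"

profile = infer_profile(margaret)
ad = next_ad(margaret)
print(profile)
print(ad)
```

None of the three data types alone would trigger the raincoat ad: it takes the weather forecast, the click behavior, and the in-store purchase pattern together.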

Figure: User Data combines Transaction Data (payment methods, purchase order, time frequency...), Interaction Data (weblogs, social media and activities...), and External Data (socio-demographic intel, weather...).


«Going Beyond Google Analytics» by Uli Bethke, Sonra Co-founder - Dataiku Partner

“In a recent study, CEB (https://www.cebglobal.com) interviewed nearly 800 marketers at Fortune 1000 companies with some interesting results. Only 11% of their decisions are based on data. In a similar survey performed by eMarketer magazine more than 54% of marketers in retail admitted that they are either not aware of the concept of Big Data or struggle to apply it in practical terms.

These results are astonishing as marketing is at the forefront of the digital revolution. Some of the very first Big Data use cases such as clickstream analytics have been in marketing. What are the reasons that even the most advanced users of data analytics are struggling?

Like users in other fields, marketers put the cart before the horse when it comes to Big Data. Contrary to common belief amongst marketing execs, insights are not created by simply hiring a bunch of data scientists and throwing a heap of data at them. Data science is actually hard work. It requires a precise definition of the business problem at hand, the type of analysis to be performed, and the actions that will be taken based on the analysis. Paraphrasing Picasso: Big Data is useless; it can only give you answers. So asking the right questions is key. From our experience this is the #1 reason why marketers fail to benefit from the Big Data revolution.

Marketers have created lots and lots of data siloes for themselves. They have Google Analytics or Omniture for web analytics, MailChimp or Bronto for e-mail campaigns, all sorts of tools and channels for running ad campaigns, typically multiple CRM systems, tools for marketing automation. The list just goes on and on. This makes it difficult and costly to get a 360 degree view on what is going on with the customer. The free version of Google Analytics does not even allow you to extract the data at the lowest level of granularity, which makes it very difficult to build meaningful predictive models (tip: stay away from Google Analytics, at least the free version). At Sonra we have recently helped Hostelworld, one of our customers, to overcome this obstacle by building a data lake that integrates data from web analytics, ad campaigns, and various other channels. Hostelworld is now able to build predictive models using Dataiku DSS on top of the integrated data.

While marketers are often familiar with standard business intelligence and reporting use cases, they struggle to understand advanced analytics. Traditional aggregate analytics helps us to measure and compare performance of segments in our data, e.g. the PPC channel outperformed SEO conversions. This is useful to identify that a problem exists. However, it does not tell us why a problem exists or how it can be resolved. Having said this, some of this simple type of analysis is still actionable. You can take an action without knowing why a problem exists. In the following example, which course of action would you take? As a decision maker I let (1) all of my under-performing employees go or (2) try and understand why certain segments of my workforce under-perform and then address the underlying reasons. The first addresses the symptom, the second gets to the root cause of the problem.”


B - MACHINE LEARNING: LET’S MAKE SOME PREDICTIONS

Machine learning is a combination of mathematics, statistics, and computer science that aims to make predictions based on patterns discovered in data. Machine learning can help marketing teams predict what customers are likely to do. This is possible due to a granular segmentation approach — using detailed datasets to determine what specific actions customers will likely perform. We’re no longer reaching conclusions based on broad standardized queries (e.g., customer income and age); we are using machine learning to reach predictive conclusions based on specific behavioral queries (e.g., when will a customer visit a store given her current social media engagement on her cellphone?).

What does this mean for Customer Segmentation?

1 - A/B Testing: Less Room for Rule of Thumb Decision Making

As opposed to rule-based decision systems, which follow an explicit set of instructions, machine learning methodologies enable people to question the unknown. Machine learning involves algorithms that computers apply to volumes of data too large for humans to analyze on their own, in order to discover patterns. This same logic can be applied to A/B testing (a method that compares two versions of a single variable by testing customers’ responses to version A against version B). Thanks to machine learning techniques, hundreds of variables can be tested in parallel, making empirical experimentation a key strategy for marketing teams. Observed features then automatically lead to dynamic and granular segments of user and/or customer profiles that could never have been computed by humans alone.

Before, the marketing teams at Fashion Clothing Store could only test a few hand-picked variables against each other. Today, their new data scientist is able to test hundreds of variables against each other for best results.

As the tests reveal significant advantages of some variables over others, the models are optimized and automatically calibrate segments, which the marketing team can easily use to reorient campaigns accordingly.
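For a single A/B comparison, "significant advantage" is usually checked with a statistical test. The sketch below uses a standard two-proportion z-test, which is one common choice and not necessarily what Fashion Clothing Store's data scientist would use. The campaign figures (5,000 recipients per version, 200 versus 260 conversions) and the function name are hypothetical.

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference between two conversion rates.

    Returns (z, p_value). Uses the pooled rate for the standard error,
    which is the usual form when the null hypothesis is "no difference".
    """
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical email campaign: version A vs. version B, 5,000 recipients each.
z, p_value = two_proportion_z_test(conv_a=200, n_a=5000, conv_b=260, n_b=5000)

print(f"z = {z:.2f}, p-value = {p_value:.4f}")
if p_value < 0.05:
    print("Version B's advantage is unlikely to be due to chance alone.")
else:
    print("No clear winner yet; keep collecting data.")
```

When hundreds of variables are tested in parallel, some will look significant by chance alone, so the threshold generally needs to be tightened (for example with a Bonferroni correction) before segments are recalibrated on the results.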


2 - Benefits of Dynamic Data Teams

3 - Aligning Segmentation with Business Objectives

M O D E L - B A S E D S E G M E N TAT I O N : P U T T I N G M A R K E T E R S I N CO N T R O L

Historically, segmented data that marketing team could access was updated at a glacial speed: customer data was updated and made available on a monthly or yearly basis, resulting in outdated information. In today’s world, real-time data allows companies to run and re-run predictive models to fresh data in order to fine-tune the results. As customer data is being refreshed more frequently, it makes sense for companies to increase frequency of segmentation recalibrations as well. The increase in volume and accessibility of Big Data has transformed the machine learning landscape. Traditional segmentation methodologies will increasingly struggle to keep up, only to be replaced by model-based segmentation techniques capable of handling the larger volumes of data and including a much wider range and scope of information in real or near real-time.

Traditional consumer segmentation is a long, arduous, and time-consuming process. Large general market groups must be separated into smaller groupings based on shared variables. Customer segmentation eventually produces a true customer profile, but what version of the truth does it represent? The fact is that business issues change and evolve rapidly, but the process for understanding them takes too long with a traditional approach to segmentation.

With advanced analytics made possible by technological advances, the scene is dramatically different. Here’s a simplified example of what that could look like:

Marketing in today’s competitive landscape is a much different environment than it was a mere 10 years ago. In this Age of the Customer, technology has finally advanced to the point where customer input actually makes a difference… whether it’s requested or not! Traditionally, marketing decisions are influenced by customer feedback. Companies cross their fingers and hope that customers will let them know about their experience and that this feedback will be useful in improving operations. By leveraging the huge amount of available data with advanced analytics, marketers do not need to wait solely on customer feedback anymore. If they want to make informed decisions to optimize strategy, they need to collect their sources of User Data (i.e., transaction, interaction, and external) and harness the power of machine learning.

Traditionally, Fashion Clothing Store’s attempt at segmentation was a long, drawn-out process that their marketing analysts didn’t expect to repeat more often than every few months, or only for a specific campaign or specific products to be pushed.

Today, data is refreshed on an hourly basis, helping models evolve and adapt to new information. Segmentation stays optimized at all times, enabling the marketing team to scale their marketing efforts up & out.

Fashion Clothing Store has a huge stock of blue t-shirts it would like to sell. With advanced analytics and machine learning, Fashion Clothing Store is going to learn information about all the people who have purchased these blue t-shirts in the past (or a similar category of clothing such as pink t-shirts or blue coats). From this analysis, it turns out that in this cluster of people who have purchased blue t-shirts (or similar clothing), 3 common traits are over-represented:

• Over-representation of an age group according to gender: there are a lot of females who are around 37 years old.
• Over-representation of another product purchase: there are a lot of people who have also bought blue shoes and red headbands.
• Over-representation of first name: there are a lot of “Maggys” and “Angelinas” in this cluster.

Therefore, based on this information, Fashion Clothing Store can compute a segment of customers that share the above traits. By pushing a campaign that advertises the blue t-shirts to this model-based segment of their customer database, they can statistically expect better than average results. For example, a woman named Maggy who is around 37 years old, or a person who has purchased blue shoes and red headbands, has a high probability of also buying the blue t-shirt. This is an example of how predictive analytics empowers marketing teams to easily optimize their segmentation with a business objective in a matter of minutes instead of weeks.
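
One way to make "over-represented" concrete is to compare how often a trait appears among past buyers with how often it appears in the whole customer base. The sketch below does this on a tiny, made-up customer list; the field names, the sample records, and the `over_representation` helper are illustrative assumptions, not Fashion Clothing Store's actual system.

```python
def over_representation(customers, buyers, trait):
    """Ratio of a trait's share among buyers to its share among all customers.

    `trait` is a function returning True when a customer has the trait.
    A ratio above 1 means the trait is over-represented among buyers.
    """
    share_all = sum(trait(c) for c in customers) / len(customers)
    share_buyers = sum(trait(c) for c in buyers) / len(buyers)
    return share_buyers / share_all


# Hypothetical customer records, made up for illustration only.
customers = [
    {"name": "Maggy", "age": 37, "gender": "F", "bought_blue_tshirt": True},
    {"name": "Angelina", "age": 36, "gender": "F", "bought_blue_tshirt": True},
    {"name": "Maggy", "age": 38, "gender": "F", "bought_blue_tshirt": True},
    {"name": "Jack", "age": 52, "gender": "M", "bought_blue_tshirt": False},
    {"name": "Oliver", "age": 37, "gender": "M", "bought_blue_tshirt": True},
    {"name": "Jane", "age": 24, "gender": "F", "bought_blue_tshirt": False},
    {"name": "Charlotte", "age": 61, "gender": "F", "bought_blue_tshirt": False},
    {"name": "Paul", "age": 45, "gender": "M", "bought_blue_tshirt": False},
]
buyers = [c for c in customers if c["bought_blue_tshirt"]]


def is_woman_around_37(c):
    """Assumed definition of 'around 37': between 35 and 39 years old."""
    return c["gender"] == "F" and 35 <= c["age"] <= 39


ratio = over_representation(customers, buyers, is_woman_around_37)
print(f"Women around 37 are {ratio:.1f}x as common among buyers")
```

Customers who share the traits with the highest ratios would form the model-based segment to target.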


«A Story of Segmentation» by Charlotte Pierron-Perlès, Lead of Big Data & Analytics - Dataiku Partner

PRESENTATION OF OUR EXPERTISE

Customer segmentations have always been a highly strategic matter for all marketers. At Capgemini Consulting Data Farm, we have been helping many clients from all industries on such projects for several years. We help them rethink their segmentation strategy and deliver actionable segmentations in a dynamic fashion, from Proof of Concept to industrialization. Our mixed team of 30 experts, half Data Scientists and half Data consultants, aims at both structuring and delivering such projects.

CONTEXT & BUSINESS STAKES

One of our first data science projects was around a customer segmentation. The client, a large insurance company, had realized it needed to go beyond its current CRM data to understand its individual customers. This is a common issue for insurers as customers have infrequent interactions with their insurance providers, compared to banks or retailers for instance. In particular, our client wanted to build a new segmentation based on the probability that its customers would retire in the coming years. This would help it better support these customers through this important life event, which has significant financial consequences.

DATA SCIENCE APPROACH

Thus we started with a simple goal: predict if a customer will retire in the coming 1-2 years. Our first finding was that CRM data alone could not solve this problem, as occupational data on retirement was not 100% accurate. We solved this challenge by cross-referencing several internal databases with Open Data, in particular INSEE census data. This enabled us to greatly enrich the internal datasets to get a higher predictive score. We worked in close relationship with the client’s business owners to interpret data and identify relevant predictive clues.

BUSINESS RESULTS

After several iterations to improve our predictive score, we managed to identify clear customer segments ranked by their probability of retiring. This allowed us to go from a total scope of ~2 million customers to only ~100 000 with at least a 50% chance of retiring in the next year. For our client this was a game changer to manage its marketing campaigns: instead of randomly contacting a broad customer segment solely based on age, it could now focus on a much smaller qualified target. In addition to higher customer satisfaction, the estimated savings in terms of telemarketing costs alone ranged from 650k€ to 1m€.


In machine learning and statistics, classification refers to assigning a new observation to one of a set of categories (or populations) based on a training dataset for which we already know the outcome. For example, within our customer database, we'll assign two categories: "has bought blue t-shirt" and "has not bought blue t-shirt". Our model will "learn" the common characteristics of the "has bought blue t-shirt" and "has not bought blue t-shirt" groups. From this information, the model will be able to give a probability score of "will buy blue t-shirt" and "will not buy blue t-shirt" to all other customers for whom we do not know the outcome, based on their similarity to either group. To test the model's accuracy, we will run it on the test dataset but pretend we don't know the "has bought blue t-shirt" and "has not bought blue t-shirt" outcome. If our model is good, we'll be able to accurately match our population with the real outcome of "has bought blue t-shirt" and "has not bought blue t-shirt". Make sense? Great!
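
The learn, score, and test loop described above can be sketched in a few lines. This is deliberately a toy "model" (a lookup of the share of buyers per customer profile), and the profiles, records, and function names are made up for illustration; a real classifier would weigh many characteristics at once.

```python
from collections import defaultdict


def train(rows):
    """Learn the share of buyers for each customer profile in the training data."""
    counts = defaultdict(lambda: [0, 0])  # profile -> [buyers, total]
    for profile, bought in rows:
        counts[profile][0] += int(bought)
        counts[profile][1] += 1
    return {p: buyers / total for p, (buyers, total) in counts.items()}


def score(model, profile, default=0.5):
    """Probability of "will buy blue t-shirt" for a customer with this profile."""
    return model.get(profile, default)


# Hypothetical history: (profile, has bought blue t-shirt).
train_rows = [
    ("bought blue shoes", True), ("bought blue shoes", True),
    ("bought blue shoes", True), ("bought blue shoes", False),
    ("no related purchase", False), ("no related purchase", False),
    ("no related purchase", False), ("no related purchase", True),
]
test_rows = [
    ("bought blue shoes", True), ("bought blue shoes", True),
    ("no related purchase", False), ("no related purchase", True),
]

model = train(train_rows)

# Pretend we do not know the outcome on the test rows, predict, then compare.
predictions = [score(model, profile) >= 0.5 for profile, _ in test_rows]
actual = [bought for _, bought in test_rows]
accuracy = sum(p == a for p, a in zip(predictions, actual)) / len(actual)
print(model, accuracy)
```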

An algorithm is simply a procedure or formula that a computer uses to solve a problem. There’s nothing necessarily mystifying about algorithms, though saying “algorithm” may command more acclaim from your peers than a lowly “formula.” So let’s take a moment to discuss the algorithms that power Machine Learning and how marketing can use them.

As with pizza toppings, there is a wide variety of algorithms available. Each has its own pros & cons depending on what you are trying to accomplish. A key question when it comes to the marketing spectrum of algorithm use is how customer features will relate to each other. For example, if customers with the highest income are more likely to purchase expensive clothes, can we conclude that buying decisions are linked to purchasing power? Yes. But how do you detect when a combination of several variables has an important impact on the outcome? Indeed, the correlation, or lack thereof, between behavioral features can be quite complex. Certain algorithms, however, can handle these data connections better than others.

Broadly speaking, algorithms are categorized into three different groupings: Linear Models, Tree-Based Models, and Deep Learning.

Linear Model Approach

A linear model uses simple formulas to compute a score for each customer based on individual attributes (for example: age, gender, location, etc.). If you want to predict a binary outcome (will churn? yes or no), you will typically use a variant called logistic regression.

LET'S GET TECHNICAL: ALGORITHMS ARE YOUR FRIENDS

Model Segmentation in The Real Word_

Logistic Regression

Logistic regression is used to find the probability of an event being a success or a failure. To put it simply, these are binary (0 or 1) events where two choices are available, such as “Will Buy” vs. “Will Not Buy.” This type of linear algorithm is widely used for classification problems.
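
As a minimal sketch of how such a score is computed: logistic regression adds up the customer's attributes, each multiplied by a weight, and squashes the total into a probability between 0 and 1. The attribute names and coefficient values below are hypothetical; in practice the weights are learned from historical data.

```python
import math


def logistic_score(features, weights, intercept):
    """Turn a weighted sum of customer attributes into a probability between 0 and 1."""
    linear = intercept + sum(weights[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-linear))


# Hypothetical coefficients, chosen by hand for illustration only.
weights = {"monthly_spend_hundreds": 0.8, "bought_blue_shoes": 1.2}
intercept = -2.0

maggy = {"monthly_spend_hundreds": 2.5, "bought_blue_shoes": 1}
jack = {"monthly_spend_hundreds": 0.5, "bought_blue_shoes": 0}

p_maggy = logistic_score(maggy, weights, intercept)
p_jack = logistic_score(jack, weights, intercept)
print(f"Maggy: {p_maggy:.2f}  Jack: {p_jack:.2f}")
```

A customer is then labeled “Will Buy” when the probability passes a chosen threshold (0.5 is a common default).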


Tree-Based Model Approach

When you hear tree-based, think decision tree, i.e., a sequence of branching operations. For example, say you want to have wider insights on who opened/didn’t open your latest newsletter: tree-based models will, step-by-step, isolate and determine features of people not opening your newsletters (sex, then age, then purchasing power, then…).

Decision Tree

Here, the combination of attributes is based on a concept called “information gain,” which is a measure that tells us how much a given attribute can differentiate between different outcomes. For example: at Fashion Clothing Store, splitting the customer database between those who spend more than $200/month and those who spend less, gives us a “high” information gain. A good decision tree combines multiple conditions to create different branches, with each branch representing a segment of the population (e.g., a branch could consist of people who spend more than $200 a month + are less than 45 years of age + bought a blue t-shirt). Each branch is then labeled as either "leads that will convert" or "leads that are unlikely to convert" — this process is called training the decision tree.
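
To illustrate, information gain can be computed as the drop in entropy (a measure of how mixed the outcomes are) obtained by a split. The sketch below uses a handful of made-up customers; the spend and age thresholds and all records are assumptions for illustration.

```python
import math


def entropy(labels):
    """Shannon entropy (in bits) of a list of True/False outcomes."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def information_gain(rows, condition):
    """Entropy reduction obtained by splitting rows on a yes/no condition.

    Each row is (customer, converted), where converted is True or False.
    """
    labels = [converted for _, converted in rows]
    yes = [converted for customer, converted in rows if condition(customer)]
    no = [converted for customer, converted in rows if not condition(customer)]
    weighted = (len(yes) * entropy(yes) + len(no) * entropy(no)) / len(rows)
    return entropy(labels) - weighted


# Hypothetical customers: monthly spend in dollars, age, and whether they converted.
rows = [
    ({"spend": 250, "age": 30}, True),
    ({"spend": 300, "age": 50}, True),
    ({"spend": 220, "age": 41}, True),
    ({"spend": 80, "age": 33}, False),
    ({"spend": 120, "age": 52}, False),
    ({"spend": 60, "age": 28}, False),
]

gain_spend = information_gain(rows, lambda c: c["spend"] > 200)
gain_age = information_gain(rows, lambda c: c["age"] < 45)
print(gain_spend, gain_age)
```

On this toy data the spend split separates converters from non-converters perfectly, while the age split tells us nothing, so a decision tree would split on spend first.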

Random Forest

A random forest classifier is basically a set of decision trees, each trained on a random sample of the customers and of their attributes, whose votes are combined to harvest their collective intelligence. In layman’s terms, a random forest weighs all attributes and surfaces the most important ones for the problem at hand. This technique provides a great lift in performance, as each decision tree can specialize on a specific segment of factors and the combined vote outweighs the errors of any single tree.
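
A heavily simplified sketch of the idea follows. Each "tree" here is only a one-split stump of the form "attribute >= threshold means buy", trained on a random resample of the customers and one randomly chosen attribute, and the forest's score is the share of trees voting "buy". The data, attribute names, and function names are made up, and real random forests grow much deeper trees.

```python
import random


def fit_stump(sample, feature):
    """Find the threshold t for the rule "feature >= t means buy" with the fewest errors."""
    best_t, best_errors = None, None
    for t in sorted({customer[feature] for customer, _ in sample}):
        errors = sum((customer[feature] >= t) != bought for customer, bought in sample)
        if best_errors is None or errors < best_errors:
            best_t, best_errors = t, errors
    return feature, best_t


def fit_forest(rows, features, n_trees=25, seed=0):
    """Train each stump on a bootstrap sample and one randomly chosen feature."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        sample = [rng.choice(rows) for _ in rows]  # sampling with replacement
        forest.append(fit_stump(sample, rng.choice(features)))
    return forest


def buy_probability(forest, customer):
    """Share of trees voting "will buy" for this customer."""
    votes = sum(customer[feature] >= t for feature, t in forest)
    return votes / len(forest)


# Hypothetical customers: monthly spend in dollars, site visits per month, bought or not.
rows = [
    ({"spend": 250, "visits": 9}, True),
    ({"spend": 300, "visits": 12}, True),
    ({"spend": 220, "visits": 7}, True),
    ({"spend": 270, "visits": 10}, True),
    ({"spend": 80, "visits": 2}, False),
    ({"spend": 120, "visits": 3}, False),
    ({"spend": 60, "visits": 1}, False),
    ({"spend": 100, "visits": 4}, False),
]

forest = fit_forest(rows, ["spend", "visits"])
p_frequent = buy_probability(forest, {"spend": 400, "visits": 15})
p_rare = buy_probability(forest, {"spend": 10, "visits": 0})
print(p_frequent, p_rare)
```

Customers whose attributes fall between the two groups would receive intermediate scores, because the individual trees disagree about them.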

Gradient Boosting

A gradient boosting method also combines many decision trees but, unlike a random forest, builds them one after another from 'weak' learners (i.e., trees that perform relatively poorly on their own), with each new tree focusing on the errors left by the previous ones. In terms of decision trees, weak learners are shallow trees, sometimes even as small as decision stumps (trees with two leaves). Boosting reduces error mainly by reducing bias, aggregating the output from many models.
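
The following sketch shows the boosting loop in its simplest form, assuming a squared-error objective, for which the quantity each new stump is fitted to is simply the remaining error (the residual). The data and function names are hypothetical, and production libraries use more refined objectives for yes/no outcomes.

```python
def fit_stump(xs, residuals):
    """Weak learner: one split on x, predicting the mean residual on each side."""
    best = None
    for t in sorted(set(xs))[1:]:  # skip the smallest value so both sides are non-empty
        left = [r for x, r in zip(xs, residuals) if x < t]
        right = [r for x, r in zip(xs, residuals) if x >= t]
        left_mean, right_mean = sum(left) / len(left), sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, t, left_mean, right_mean)
    _, t, left_mean, right_mean = best
    return lambda x: left_mean if x < t else right_mean


def fit_boosting(xs, ys, n_rounds=20, learning_rate=0.5):
    """Add stumps one at a time, each fitted to the errors left by the previous ones."""
    base = sum(ys) / len(ys)
    stumps = []

    def predict(x):
        return base + learning_rate * sum(stump(x) for stump in stumps)

    for _ in range(n_rounds):
        residuals = [y - predict(x) for x, y in zip(xs, ys)]
        stumps.append(fit_stump(xs, residuals))
    return predict


def mean_squared_error(predict, xs, ys):
    return sum((y - predict(x)) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Hypothetical data: monthly spend in dollars and whether the customer bought (1) or not (0).
spend = [50, 80, 120, 150, 210, 260, 300, 350]
bought = [0, 0, 0, 1, 0, 1, 1, 1]

baseline = fit_boosting(spend, bought, n_rounds=0)  # just the average purchase rate
boosted = fit_boosting(spend, bought, n_rounds=20)
mse_baseline = mean_squared_error(baseline, spend, bought)
mse_boosted = mean_squared_error(boosted, spend, bought)
print(mse_baseline, mse_boosted)
```

Each individual stump is a poor predictor, yet the error on the training data shrinks as more of them are added.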

[Figure: example decision tree. Splits: spend > $200 vs. spend < $200; > 18 years vs. < 18 years; in California vs. not in California. Leaves: 10% will buy, 35% will buy, 60% will buy, 80% will buy.]


Deep Learning Approach

Deep learning consists of algorithms built from artificial neural networks with multiple hidden layers. Originally, neural networks referred to a biological phenomenon: interconnected neurons that exchange messages with each other. This idea has now been adapted to the world of machine learning, where it is called an ANN (artificial neural network).

ANNs are a family of models that are taught to adopt cognitive skills to function like the human brain. Think Artificial Intelligence! Image recognition, voice recognition, soft sensors, anomaly detection, and time series predictions are all ANN-based applications. Marketers out there, next time you hear someone talking about AI, just say “Oh ya, neural networks you mean?”. Everyone will be impressed… or they might never speak to you again.

So basically, deep learning enables people working on advanced analytics to go deeper into the network without having to write extra code, unlike ‘shallow’ algorithms (e.g., Decision Trees). Deep learning techniques are used for search engine optimization, to improve advertising efficiency, to improve filters on the posts that you see online, and so on. As you go deeper, complex features can be combined with previous layers, thus providing better results. And don’t you worry, all this may sound scary, but computers do the computations for you!


1- Evaluating the Value of an Algorithm

Precision tells us what proportion of customers whom we thought would be buyers actually bought. Ex. Our model predicted that 20% of our segment would buy the blue t-shirt. 100% of these predicted 20% did buy. Therefore, the precision of our model is perfect (this rarely happens).

Recall tells us what proportion of customers that actually bought were expected by us to be buyers. Ex. My model predicts that a particular segment of people will buy the blue t-shirt. All my predictions are correct - these people do buy the t-shirt. However, my model did not include another 1000 people that also proceeded to buy the blue t-shirt. Therefore, though my precision is perfect, my recall is poor.

So basically, recall and precision give us information about our model’s effectiveness (i.e., how many did we miss and how many did we accurately predict?). The F1-score sums up this numerical relationship. Here’s a closer look:

TN (i.e., the model output predicts that a customer will not buy. And indeed, the customer does not buy). Charlotte was a TN: the model had predicted that she would not buy the blue t-shirt and indeed, she did not.

Precision can be defined as the proportion of True Positives (TP) in the set of positive buyer predictions. In the confusion matrix, this is represented by the rightmost column. The formula to determine Precision is: (TP) / (TP+FP).

Recall can be described as the proportion of TP in the set of true buyers. In the confusion matrix, this is represented by the bottom row. The formula to determine Recall is: (TP) / (TP+FN).
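
The two formulas above, together with the F1-score (defined in the standard way, as the harmonic mean of precision and recall), can be computed directly from the counts. The campaign figures in this sketch are made up: they mirror the "perfect precision, poor recall" situation, with 200 targeted customers who all bought and another 1000 buyers who were never targeted.

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall, and F1-score from true positives, false positives, false negatives."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Hypothetical campaign: TP = 200, FP = 0, FN = 1000.
precision, recall, f1 = precision_recall_f1(tp=200, fp=0, fn=1000)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Because the F1-score is pulled toward the lower of the two values, a model cannot hide poor recall behind perfect precision.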

Correct cells: TN (true negatives), TP (true positives). Error cells: FN (false negatives), FP (false positives).

TP (i.e., customers who did buy whom we correctly determined to be buyers). Oliver was a TP: the model had predicted that Oliver would buy the blue t-shirt and indeed, he did. (The marketing department loves TPs!).

FN (i.e., customers who bought, but whom we incorrectly determined to be non-buyers). Jane was an FN: the model had predicted that she was not part of the potential buyer segment corresponding to the blue shirt. However, Jane actually did buy the blue t-shirt. Therefore, if there are many more FNs, the model could use improvement.

FP (i.e., customers who did not buy, but whom we incorrectly determined to be buyers). Jack is a FP: the model had predicted that he would buy the blue t-shirt. However, Jack did not buy it. Once again, if there are too many FPs like Jack, the model definitely needs improvement.

About Precision

About Recall

Confusion matrix (rows = REALITY, columns = EXPECTATIONS):

                 Will Not Buy (not targeted)    Will Buy (targeted)
Doesn't buy      TN                             FP
Buys             FN                             TP

Precision vs. Recall

Now that we've seen how many choices we have, let's discover how to evaluate which approach is the best. Two classical indicators used to determine efficiency are recall and precision, both of which are used to create what is called an F1-score. But before you do anything, it is essential to isolate a test dataset. Simply put, a test dataset is a database in which we already know an outcome. For example, in historical data, I already know which customers have indeed churned. I am going to use a part of this past data to train my model. I will use the other part of this data (a.k.a. the test dataset) to evaluate my model's accuracy.
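
Isolating a test dataset can be as simple as shuffling the historical records and setting a share of them aside. In this sketch the 25% share, the fixed random seed, and the made-up churn history are all assumptions chosen for illustration.

```python
import random


def train_test_split(rows, test_share=0.25, seed=42):
    """Shuffle the historical rows and hold out a share of them for testing."""
    shuffled = rows[:]  # copy, so the original order is left untouched
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_share)
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical history of 100 customers with a known churn outcome.
history = [{"customer_id": i, "churned": i % 4 == 0} for i in range(100)]

train_rows, test_rows = train_test_split(history)
print(len(train_rows), len(test_rows))
```

The model only ever sees the training rows; the test rows are kept aside so that its predictions can be compared with outcomes it has never seen.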

FP (FALSE POSITIVES)


LET'S GET TECHNICAL: READY, SET... TARGET!


This is fine & dandy in the world of mathematics, but we are using algorithms for business and marketing purposes. The point here is to not necessarily look for the best precision and recall scores but, rather, for the highest return outcomes. Therefore, to better express business objectives, we’ll use a Cost / Margin Matrix.

Cost / Margin Matrix (rows = Reality, columns = Expectations):

                 Will Not Buy (not targeted)    Will Buy (targeted)
Doesn't buy      No Benefits, No Cost           Error, Cost
Buys             No Benefits, No Cost           Gain

The above Cost / Margin Matrix is quite different from the confusion matrix we explored previously. When a targeted client makes a purchase (true positive, a.k.a. Oliver), there is benefit; if they do not buy, there is some cost (false positive, a.k.a. Jack), i.e., potentially spamming a non-buyer, the cost of emailing, push advertising or telemarketing, etc. If no targeting is done, there is no benefit and no cost.
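
To see why the highest return does not always come from the highest precision, the matrix can be turned into a small profit calculation. The margin per sale, the cost per contact, and the two sets of model results below are made-up figures, and the sketch assumes the "gain" of a true positive is the sale margin minus the contact cost.

```python
def campaign_profit(tp, fp, margin_per_sale, cost_per_contact):
    """Profit of a targeted campaign under the Cost / Margin Matrix.

    Every targeted customer (TP + FP) costs one contact; only TPs bring a margin.
    Customers who are not targeted (TN and FN) bring no benefit and no cost.
    """
    return tp * margin_per_sale - (tp + fp) * cost_per_contact


# Hypothetical figures: $15 margin per t-shirt sold, $2 per customer contacted.
# Model A is cautious (precision 0.8); model B targets widely (precision 0.4).
profit_a = campaign_profit(tp=200, fp=50, margin_per_sale=15, cost_per_contact=2)
profit_b = campaign_profit(tp=600, fp=900, margin_per_sale=15, cost_per_contact=2)
print(profit_a, profit_b)
```

With these assumed figures the less precise model B earns more, because each contact is cheap compared with the margin on a sale; with expensive telemarketing calls the conclusion could flip.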

2- The Lift Chart: Effective Customer Targeting

Lift charts (otherwise known as gain curves) are commonly used for push marketing campaigns. But what do they mean? This chart represents the ratio between results obtained with and without the classification model and therefore is a great indicator of the model's effectiveness. Unlike the matrix we saw previously, the lift chart evaluates the model's performance on a portion of the population. This makes it especially useful for "targeting" use cases, where you might not want to contact the whole user population. The lift chart enables you to measure how much more likely you are to correctly predict who "has bought a blue t-shirt" compared to a random guess. In the example chart, by contacting the 40% of the population for which my predictive model indicates the highest probability of buying, I'll reach 77% of my would-be buyers, which is 37 percentage points more than the 40% that random targeting would reach.
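
A single point of such a curve can be computed by ranking customers by predicted probability, contacting the top share, and counting how many of the actual buyers fall inside it. The ten scored customers below are made up for illustration, as is the `cumulative_gain` helper.

```python
def cumulative_gain(scored_customers, share_contacted):
    """Share of all buyers reached when contacting the top-scored share of customers.

    `scored_customers` is a list of (predicted probability, actually bought) pairs.
    """
    ranked = sorted(scored_customers, key=lambda pair: pair[0], reverse=True)
    n_contacted = round(len(ranked) * share_contacted)
    buyers_reached = sum(bought for _, bought in ranked[:n_contacted])
    total_buyers = sum(bought for _, bought in ranked)
    return buyers_reached / total_buyers


# Hypothetical model scores and real outcomes for ten customers.
scored = [
    (0.95, True), (0.90, True), (0.80, True), (0.70, False), (0.60, True),
    (0.40, False), (0.30, False), (0.20, False), (0.10, False), (0.05, False),
]

gain = cumulative_gain(scored, share_contacted=0.4)
lift = gain / 0.4  # random targeting of 40% would reach about 40% of buyers
print(gain, lift)
```

Applied to the figures quoted above, reaching 77% of buyers by contacting 40% of the population corresponds to a lift of 0.77 / 0.40, or roughly 1.9 times what random targeting would achieve.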

Voilà! By using this type of model to dynamically segment your user database, you will be able to generate the most value by pushing the right offer to the right customer.


«5 Guidelines for Data-driven Marketing» by Stéphane Marcovitch-Bruneau, Manager - Dataiku Partner

As classical segmentation is being replaced by predictive analysis and machine learning, feedback is becoming the marketer’s most precious asset. Pre-established rules, no matter how insightful they are or seem to be, can only benefit from being continuously challenged and refined by looking at data. That’s the virtuous circle of data science: the more assumptions we make, and the faster we test them, the more we learn – and again it brings more assumptions, and the circle goes on.

But in order to achieve this kind of data-driven momentum, teams and organizations need more agility. Why? Because it’s all about feedback. In every team, there are people who know one subject-matter better than others, others who understand one customer category better, others who know one data source better… Whatever the question one is asking, it’s almost certain that another knows how and where to look for the answer.

Here are our top 5 rules and guidelines to generate agility for data-driven marketing projects:

1. Test and Learn Iteratively. Don’t get trapped in long debates and huge specifications of what an algorithm or model should look like. For example, if you want to build a product recommender system, don’t try to plan ahead all the features you could add to improve its relevance. It will only make the test-and-learn process more complex. Take one step at a time.

2. Hold Daily Scrum Meetings. Organize daily 30-minute meetings where each member of your team can tell the others which assumption or question she/he is working on, and which data she/he is going to use to test it. If someone else has had findings on a related issue the day before, or has encountered a bug on the data source, it’s better to let her/him know.

3. Give your Data Scientists the Big Picture. Because they need the freedom to formulate and test new assumptions as they learn from previous ones, data scientists need to know the big picture: not only that you want to predict churn from customer online behaviour, but also why. In addition, they need a sandbox environment where they can experiment freely without being blocked by performance or security constraints.

4. Cherish your Data Sources. Data sources are often viewed as letdowns: if only that field was always filled correctly, or refreshed in real-time... But they can also be underestimated gold mines. Keep occasional dayparts for unsupervised data discovery, without any question to answer. Also, always keep raw data archives in case you missed something, and be close to your data source guy – the one who knows every field and every bias of the providing system.

5. Challenge your View of the Customers. Make your data-driven use cases a reflection of your customers’ journey. Take each step of your customer experience, and think about how each data source – whether it’s in your data lake or not – might increase knowledge about what happens at this point, and which indicator you could optimize. You can use advanced tools for use case design (e.g., Foreseeds), or the good old white board.

All of these guidelines aim for the same result: any team should put together people with business knowledge, customer knowledge and data knowledge, and make them have regular conversations about what they do. This, along with good tools and individual skills, is the third key to successful data-driven marketing.


MODEL-BASED SEGMENTATION IN THE REAL WORLD

A- How Can the IoT Industry Ensure Subscriber Retention and Loyalty?

B- How Can an Online Retailer Reduce its Churn?

C- How Can Segmentation Allow for Targeted Recommendation in the Travel Industry?

D- How Can the Banking Industry Improve its Customers’ Segmentation?

☑ LET’S GET TECHNICAL: Churn, A Story About Love, or Lack Thereof

The benefits of model segmentation make perfect sense on paper, but how is model-based segmentation applied in the real world? Let’s take a look at some companies who have used machine learning applied to their data to transform their businesses while achieving genuine customer engagement.


A - HOW CAN THE IOT INDUSTRY ENSURE SUBSCRIBER RETENTION AND LOYALTY?

Coyote is the French leader of real-time road information. Created in 2005, Coyote has 200 employees, generated a turnover of over € 100 million in 2014, and currently has over 4.8 million users in Europe. Coyote devices allow users to interconnect to the entire community in order to warn other drivers of different traffic hazards and traffic conditions (e.g., traffic obstructions, accidents, speed traps, etc.) while they are driving.

1- Challenges: Securing the Subscriber Base with an Effective Loyalty Program

The more of its own data Coyote collects, the better its service. By improving retention rates, Coyote wishes to enhance the following virtuous circle: the more users are acquired, the better the service quality, and vice versa. Coyote wants to optimize its loyalty program in order to incite their customers to increase device use. For this, the company wants to find a technical solution that will enable them to:

• Segment its customer base by user profile
• Qualify incoming data
• Quantify device use (anonymous data analysis)

2- Solution: Behavioral Analysis to Optimize Use of Connected Devices

Through its connected devices, Coyote collects extensive data on the different uses of its users, such as mileage, time spent on the road, or the number of alerts issued by the community members. In order to make sense of and clean all of this data, Coyote called upon Dataiku and Data Science Studio software (DSS).

With DSS, Coyote has built and implemented a predictive behavioral analysis application to segment customers. First, the application automatically compiles and processes heterogeneous and completely anonymized data (contractual data, customer declared data, real-time device data...). This data is then processed by a machine-learning algorithm to model user behavior. This model and its results were subsequently adjusted in order to optimize marketing campaigns. With this score, Coyote is now able to segment its user base with very high accuracy.

3 - Results: Outbound Call Campaigns +11% Efficiency

Thanks to this predictive behavioral analysis, Coyote optimizes marketing and sales campaigns based on its customer profiles. This application results in several advantages:

• Increase the performance of outbound call campaigns: +11% efficiency;
• Adapt marketing campaigns thanks to increased knowledge of the actual uses of the service;
• Significantly improve data management.


Want to know more?

Download the entire use case here


B - HOW CAN AN ONLINE RETAILER REDUCE ITS CHURN?

Founded in 2006, Showroomprive.com is a leading e-commerce player with over 20 million members in Europe. The e-commerce site has about 15 flash sales and over 2 million visitors per day. It generated €605M in business volume in 2015, a 40% growth increase compared to 2013.

1- Challenges: Refining Client Qualification to Anticipate, Prevent and Reduce Churn Rates

In order to counter churn, Showroomprive uses static rules to trigger marketing actions. These rules are common to all customers and no prior qualifications are made to determine the value of each individual client. Showroomprive wants to counter churn and improve customer loyalty. To do this, the company wishes to:

• Based on individual purchase rates, detect clients with a high potential of no longer buying from the website;
• Refine targeting of marketing campaigns for each potential churner so as to improve customer loyalty.

2 - Solution: Detecting Churners on Individual Purchase Rates

Showroomprive uses DSS to develop a solution that predicts whether or not a buyer will return to the website to make a purchase. Thanks to DSS, all of the work that revolves around this solution is internalized – from R&D to production. Indeed, Showroomprive uses DSS to:

• Automate the integration and enrichment of a variety of data sources (customer data, order and delivery data, web logs…);
• Create more than 690 features derived from this data depending on variables such as clicks on sales, orders, litigation, customers;
• Test multiple machine learning algorithms to achieve the best predictive model.

3 - Results: Detecting Potential Churners with 77% Accuracy

Since they’ve been running their DSS powered application, Showroomprive detects, amongst mono-buyers, potential churners with an AUC of 0.819!


Want to know more?

Download the entire use case here


C - HOW CAN SEGMENTATION ALLOW FOR TARGETED RECOMMENDATION IN THE TRAVEL INDUSTRY?


Voyage Privé is the #1 exclusive members only travel club, offering their customers discounts of up to 70% off when booking 4 and 5 star hotels. Initially launched in France, the company now has over 25 million members worldwide with operations in Belgium and Switzerland, and offices in France, UK, Spain, Italy, and Poland. Voyage Privé uses advanced customer data analysis to offer personalized and highly-relevant travel recommendations to its members.

1- Challenges: Boosting Transaction Value & Improving Customer Satisfaction

Creating personalized offer displays poses a significant challenge to Voyage Privé. Given the company’s brand as a boutique vacation retailer, it was critical to offer travel options that were appropriate for their members. In terms of data analysis, this meant expanding the range of customer signals that could be captured and analyzed.

Voyage Privé required a software solution that could capture and make sense of large amounts of data, develop effective customer segmentation, and implement an entirely new non-rule-based approach for analyzing incoming and historical data. From a marketing standpoint, the end goal was to increase customer satisfaction by providing users with personalized offer selections while simultaneously boosting the total transaction value by customer.

2- Solution: Machine Learning to Score Customer’s Interest in Specific Offers

Voyage Privé took the first step towards understanding their customers by implementing Dataiku Data Science Studio (DSS). The strategy started with establishing a mechanism for collecting data from customers’ online behavior, such as click paths and bookmarking. With the collected data, the focus shifted to creating a machine learning-derived score for each customer — essentially a value that reflected the likelihood of members pursuing specific travel offers. The process of using Dataiku DSS empowered the company’s teams to collaboratively work together on specific types of data before merging it. Its drag & drop interface simplified data diagnostics while facilitating the iteration process.

Ultimately, Dataiku DSS helped the company’s IT teams to develop a machine learning approach to address customer data. This coupling of online behavioral data and tailored offer selections enabled Voyage Privé to automatically present relevant buying opportunities that had the highest likelihood of customer acceptance.

3- Results: 6.5% Increase of Revenue per Member

Armed with Dataiku Data Science Studio and a machine learning analysis methodology, Voyage Privé can now optimize its marketing & sales campaigns based on a precise customer segmentation. The entire process has resulted in several competitive advantages, such as:

• A 6.5% increase in the total transaction value per member;
• The complete internalization of the company’s data workforce.


D - HOW CAN THE BANKING INDUSTRY IMPROVE ITS CUSTOMERS’ SEGMENTATION?

The banking and finance sector sits at the intersection of people and money, an ideal environment for machine learning technology to make a profound difference. This was exactly the case when Dataiku’s Data Science Studio (DSS) helped a large European banking institution improve its marketing segmentation initiatives. This particular bank is one of the largest in the world and is truly a global player in terms of providing comprehensive retail and corporate banking services.

1- Challenges: Making Sense Out of Too Much Data

Dataiku’s client wanted to use its existing datasets, such as credit card transactions and web logs, to improve its marketing segmentation efforts. Its analysts needed a tool that enabled them to work autonomously in a data lake environment while being able to visualize their efforts. While the goal was defined, the process of creating and implementing a plan to achieve it remained elusive: there was simply too much data and too many different data technologies.

2- Solution: Linking Weblog Data with Transaction Data

The goal, then, was to make sense of vast amounts of data while creating a consistent data technology environment. The introduction of Data Science Studio made a profound difference, as it enabled the company to move away from SAS (which could not handle the quantity of raw data) and implement a solution that used advanced Big Data analytics to facilitate data processing. DSS can not only handle large quantities of data but also collect it from multiple disparate sources, a feature that was invaluable considering the combinations of transaction, interaction, and external data involved.

After data collection, DSS then had to make sense of the data. This meant cleaning and parsing vast weblogs, with the end result being formatted datasets that were prepared for machine learning analysis. This analytical process then linked the weblog data with transactional data, enabling DSS to create different customer personas that shared common habits. The ingredients were now in place for model-based segmentation.

First, however, the collaboration challenge needed to be addressed. DSS did this by providing a data lake environment that enabled users with different skill sets to work together on analysis. Employees were able to improve their burgeoning data science skills while being able to visualize their work in a shared setting. So, after all the pieces were in place, what happened?

3- Result: 14% Decrease in Customer Churn

By the project’s end, the banking institution was able to deliver new model-based segmentations based on its customers’ behaviors and interactions. Thanks to Dataiku, the bank was able to create 27 different segments based on product & service usage. In addition, a new churn scoring system empowered Dataiku’s banking client to score its customers and tackle the problem of churn, with the end result being a 14% decrease in its churn rate.
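To make the idea of linking weblog data with transaction data more concrete, here is a minimal Python sketch. It is not the bank’s actual pipeline or a DSS recipe: the log format, field names (`customer_id`, `amount`), and sample records are all hypothetical, chosen only to show how parsed web activity and transactions can be combined into one profile per customer.

```python
from collections import defaultdict

# Hypothetical raw weblog lines: "<timestamp> <customer_id> <url>"
weblog_lines = [
    "2016-03-01T09:12:00 C001 /accounts/savings",
    "2016-03-01T09:13:10 C001 /loans/mortgage",
    "2016-03-02T18:40:05 C002 /cards/travel",
    "malformed line",
]

# Hypothetical card transactions
transactions = [
    {"customer_id": "C001", "amount": 120.0},
    {"customer_id": "C001", "amount": 80.0},
    {"customer_id": "C002", "amount": 40.0},
]


def parse_weblog(line):
    """Clean and parse one log line; return None when it is malformed."""
    parts = line.split()
    if len(parts) != 3:
        return None
    _timestamp, customer_id, url = parts
    # Keep only the top-level site section, e.g. "/loans/mortgage" -> "loans"
    section = url.strip("/").split("/")[0]
    return {"customer_id": customer_id, "section": section}


def build_profiles(lines, txns):
    """Link web activity and transactions on the shared customer_id."""
    profiles = defaultdict(
        lambda: {"page_views": 0, "sections": set(), "total_spend": 0.0}
    )
    for line in lines:
        record = parse_weblog(line)
        if record is None:
            continue  # skip lines that could not be parsed
        profile = profiles[record["customer_id"]]
        profile["page_views"] += 1
        profile["sections"].add(record["section"])
    for txn in txns:
        profiles[txn["customer_id"]]["total_spend"] += txn["amount"]
    return dict(profiles)


profiles = build_profiles(weblog_lines, transactions)
```

Each resulting profile mixes behavioral signals (page views, sections visited) with transactional ones (total spend), which is the kind of combined dataset a segmentation model can then be trained on.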


LET’S GET TECHNICAL: Churn, a Story About Love, or Lack Thereof

Customer churn, or attrition, measures the number of clients who discontinue a service (e.g., cellphone plan, bank account, SaaS application) or stop buying products from a company in a given time period. Churn rate is an important business metric as it reflects customer response to service, pricing, and competition. Furthermore, acquiring new customers is 6 to 7 times more costly than retaining existing ones. As such, measuring churn, understanding its underlying causes, and being able to manage the risks associated with customer churn are key areas for both short- and long-term business success.

To predict churn, we can use two complementary modeling approaches:

Machine Learning Model (short-term action)

Develop a model based on machine learning techniques that enables short-term actions. The model learns from past data and predicts which customers are likely to churn. It rates each and every one of your customers based on their probability of churning. You may therefore end up with a segment made up of all the customers who have a churn probability of 90% or above. Thanks to this segment, your marketing team can attempt to retain probable churners with specific campaigns (special offers, discounts, personalized suggestions, free in-game money, coupons, etc.).

Analytical Model (long-term study)

Develop a model to understand the reasons for churn. This deeper knowledge allows marketing teams to attack the root of the problem and to understand how to reduce churn in the long term. For example, in some cases, it could be useful to engage in a long-term study to optimize the purchasing funnel.

The best methodology is to combine short-term actions (in order to retain potential churners) with long-term approaches (in order to have an effective and sustainable impact on churn reduction).

Realistically, this means implementing the following steps:

STEP 1

Analyze all historical data on real past churners to detect trends and common features in this segment of your customer database.

STEP 2

Apply these findings to incoming data to compute a probability-of-churn score for each customer.
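The idea of learning from past churners and then scoring incoming customers can be sketched in a few lines of Python. This is an illustrative example only, not the model used in DSS or in any case study here: it swaps in a tiny logistic regression trained by gradient descent, and the two features (months since last purchase, number of support tickets), the sample records, and the customer IDs are all hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(rows, labels, lr=0.1, epochs=2000):
    """Fit a logistic regression with plain batch gradient descent."""
    n = len(rows)
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(rows, labels):
            p = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
            err = p - y  # gradient of the log-loss with respect to z
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def churn_probability(model, x):
    weights, bias = model
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


# Learn from historical data (hypothetical records).
# Features: [months_since_last_purchase, support_tickets]; label 1 = churned.
history = [[6, 3], [5, 4], [7, 2], [4, 5], [1, 0], [0, 1], [2, 0], [1, 1]]
churned = [1, 1, 1, 1, 0, 0, 0, 0]
model = train_logistic(history, churned)

# Score incoming customers (hypothetical IDs and values).
incoming = {"new_inactive": [6, 4], "new_active": [1, 0]}
scores = {cid: churn_probability(model, x) for cid, x in incoming.items()}

# Customers at or above a 90% churn probability form the segment to target.
at_risk = [cid for cid, p in scores.items() if p >= 0.90]
```

In practice a team would use a tested library and far richer features; the point of the sketch is only the shape of the workflow: fit on past churners, then attach a probability to every current customer.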


MAKING A REAL DIFFERENCE

“Listen to your customers” is an often-cited nugget of advice meant to inspire companies to take their customers’ views into consideration. While the phrase is well-intentioned, for many years large-scale implementation was only a dream: companies would have loved to integrate customer feedback into their marketing efforts, but how was that possible when they did not truly understand who their customers were? Buying habits, social media usage, product & service preferences, working hours, travel habits, and trend engagement are all data dimensions that add deep meaning to customer insight. However, getting this kind of data from a paper-based survey form was not realistic.

Dataiku’s platform brings the power of advanced analytics to marketing teams, enabling them to seize the opportunities that arise from applying machine learning to their user data. That’s right: intelligent and dynamic customer segmentation is no longer out of reach for the average marketing team that wishes to increase customer engagement with products & services. Let’s find out how you too can use model-based segmentation to optimize your marketing campaigns.

STEP 3

Select the cut-off of potential churners to target based on probability of churn (do you want to send a campaign to everyone with a churn probability of 70% and above, or only to those at 90% and above?).

STEP 4

Optimize & Take Action!

Optimize your potential churner segment! How? Let’s say that out of the calculated segment with a 90% and above probability of churning:

• 20% spend over $500 per month;
• 50% spend between $50 and $500 per month;
• and 30% spend less than $50 per month.

It is therefore probably more valuable for the company to focus on the 20% who spend over $500 per month than on those who spend less than $50. Campaigns could be adapted for each segment based on potential cost vs. return.

For example, a marketing team could sub-segment this churn segment and do the following:

• For the 20% who spend over $500 per month => a campaign offering “70% off any purchase above $100”

• For the 50% who spend between $50 and $500 per month => a campaign offering “buy one, get the second half off”

• For the 30% who spend less than $50 per month => a campaign offering “10% discount on all purchases above $30”
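The cut-off selection and spend-based sub-segmentation described in this section can be expressed as a short Python sketch. The spend bands and campaign wordings follow the example in the text; the `Customer` fields, the sample customers, and the function names are hypothetical and only serve to illustrate the logic.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    customer_id: str
    churn_probability: float  # model score between 0 and 1
    monthly_spend: float      # average monthly spend in dollars


# Campaigns per spend band, taken from the example above
CAMPAIGNS = {
    "high_spend": "70% off any purchase above $100",
    "mid_spend": "buy one, get the second half off",
    "low_spend": "10% discount on all purchases above $30",
}


def spend_band(monthly_spend):
    if monthly_spend > 500:
        return "high_spend"
    if monthly_spend >= 50:
        return "mid_spend"
    return "low_spend"


def plan_campaigns(customers, cutoff=0.90):
    """Keep customers at or above the churn cut-off and pick a campaign each."""
    return {
        c.customer_id: CAMPAIGNS[spend_band(c.monthly_spend)]
        for c in customers
        if c.churn_probability >= cutoff
    }


# Hypothetical scored customers
customers = [
    Customer("A", 0.95, 800.0),
    Customer("B", 0.92, 120.0),
    Customer("C", 0.91, 20.0),
    Customer("D", 0.75, 900.0),  # below the 90% cut-off
]

plan_90 = plan_campaigns(customers, cutoff=0.90)
plan_70 = plan_campaigns(customers, cutoff=0.70)
```

Lowering the cut-off from 0.90 to 0.70 widens the campaign audience (customer D is added), which is exactly the cost vs. return trade-off a marketing team has to weigh.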


«Churn Case Study» by Michael Lang, WebbMason Co-founder - Dataiku Partner

We recently worked with a professional accreditation company offering training and certification, who depended on an annual membership program for a large share of their revenue. They recognized, based on their annual revenue report, that their membership program was growing, but not at a desired pace. They concluded that they were doing well in member acquisition, but they had a churn problem that was impacting their bottom line.

In order to quantify the problem, we needed to integrate data from their CRM (which contained membership renewal records) with data from their data warehouse (which included financial data on memberships). For this, we leveraged our Analytics Platform. This environment allows us to develop analytic solutions supported by a technology stack that lets us quickly pull data from multiple sources, cleanse and integrate the data, perform analytic transformations, and visualize the results with self-service data visualizations.

Through our analysis, we were able to determine the following:

• New member acquisition was increasing by 15% year over year; however, customer churn was increasing by 13% year over year
• In 2014, our client lost over $15 million as a result of churn
• First-year members churned at a rate of 55%; however, members who renewed their membership just once had a churn rate in the second year of 29%

Armed with this data, we were able to establish a baseline and a cost-benefit analysis. Just a 10% reduction in churn, which is a very achievable goal, would save our client $1.5 million a year.

After this initial analysis, we moved on to a more advanced data science project to develop a model that would allow them to:

• Identify the key signals that a customer was going to churn
• Score every existing customer based on their likelihood to churn
• Leverage this insight to develop an effective marketing campaign to reduce overall customer churn
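The cost-benefit baseline quoted above is simple arithmetic, and a short Python sketch makes it easy to re-run with other assumptions. The $15 million loss, the 10% reduction, and the 55% and 29% churn rates come from the text; the cohort of 1,000 new members is a hypothetical figure used only for illustration, and the loss is treated as exactly $15 million although the text says "over".

```python
# Figures from the case study
annual_churn_loss = 15_000_000   # dollars lost to churn in 2014 (lower bound)
first_year_churn_pct = 55        # churn rate of first-year members
second_year_churn_pct = 29       # churn rate after one renewal


def annual_savings(loss, reduction_pct):
    """Dollars saved per year if churn losses fall by reduction_pct percent."""
    return loss * reduction_pct / 100


savings_10 = annual_savings(annual_churn_loss, 10)

# Hypothetical cohort: how many of 1,000 new members are still members
# after two renewal cycles, given the observed churn rates?
cohort = 1000
after_year_one = cohort * (100 - first_year_churn_pct) / 100
after_year_two = after_year_one * (100 - second_year_churn_pct) / 100
```

With these inputs, `savings_10` matches the $1.5 million a year cited in the text, and the cohort calculation shows why the first renewal matters so much: most of the loss happens in year one.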

The results of this model enabled our customer to identify existing customers that had a high, medium, and low probability of churn, and identify the signals that predict who will churn. While these insights are powerful, you cannot improve the current problem without translating these insights into action.

To develop an effective action plan, we returned to the results of our data science project, most notably:

• Getting a first-year member to renew their membership once decreases their probability of churn from 55% to 29%
• People who purchase more online have a significantly lower churn rate than those who do not purchase products online
• Certain members have a very low probability of churn (they will renew without any marketing outreach) while certain members have a very high probability of churn (they will not renew even with marketing outreach). A significant portion of members had a medium probability of churn and may therefore be influenced by marketing outreach

Armed with this information, we were able to help our client transform their renewal marketing campaign from the current process into a campaign driven by insights.
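The low / medium / high reasoning above can be illustrated with a small Python sketch that buckets members by churn score and keeps only the medium tier for outreach. The tier boundaries (0.30 and 0.70), the member IDs, and the scores are assumptions made for this example; the case study does not state which thresholds were actually used.

```python
# Hypothetical tier boundaries; the source does not give the real ones
LOW_MAX = 0.30
HIGH_MIN = 0.70


def churn_tier(probability):
    """Bucket a churn probability into low, medium, or high."""
    if probability < LOW_MAX:
        return "low"
    if probability >= HIGH_MIN:
        return "high"
    return "medium"


# Hypothetical member scores produced by a churn model
member_scores = {
    "M1": 0.05,  # will likely renew without outreach
    "M2": 0.45,
    "M3": 0.62,
    "M4": 0.93,  # unlikely to renew even with outreach
}

tiers = {member: churn_tier(p) for member, p in member_scores.items()}

# Focus the renewal campaign on members who can still be influenced
outreach_list = sorted(m for m, tier in tiers.items() if tier == "medium")
```

Concentrating spend on the medium tier avoids paying for outreach to members who would renew anyway or who are very unlikely to renew regardless.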


CONCLUSION

The ultimate dream of a marketer? The 1-to-1 relationship with each and every customer. Knowing exactly what they like, don’t like, and how much they are willing to pay, among other things. This kind of information is a gold mine. It provides vast customization options in terms of product branding, services packaging, prices, and delivery. If accurately used, your return on investment would skyrocket, as you would attain maximum benefit from your costs.

Years ago this kind of customer knowledge was the stuff of dreams. The best a marketer could hope for was to create a product, put it on the market, and hope for the best. The relationship was asynchronous: feedback was severely limited, irrelevant, ineffective, and hopelessly outdated. The lack of real feedback data resulted in the rise of intangible factors: a marketer’s “gut instinct” or sheer knowledge of limited-scope markets became the driving forces. The product of this blind marketing approach was small, tenuous markets that had little impact on economies of scale.

While traditional segmentation is doomed to be relegated to the dustbin of history, advanced analytics is rising to take its place. New analytical capabilities are fueled by machine learning technologies capable of sorting through vast amounts of raw, unfiltered data. This data is cleansed, formatted, modelled, and visualized until it produces predictive insights applicable to real-world marketing challenges… and actionable by marketers themselves. Customer data is no longer guesswork: datasets from a wide variety of dimensions can be analyzed in order to understand exactly how your customers engage with your products or services.

Advanced analytics now offers the concrete tools that you need to gain insights and leverage customer data.

The real questions are: how will you utilize these capabilities? What business cases do you need to solve? Which tools will you use?

1: “Using customer analytics to boost corporate performance”, McKinsey & Company, January 2014, 6.
2: “Lead the customer-obsessed transformation”, Forrester.com, retrieved 25 April 2016 from https://www.forrester.com/age-of-the-customer/-/E-MPL291.


ABOUT DATAIKU

Dataiku strives to be the acknowledged advanced analytics leader and preferred software solution in helping organizations succeed in the world’s rapidly evolving data-driven business ecosystem. Guided by the belief that true innovation comes from the effective combination of diversity of cultures, of mindsets, and of technologies, Dataiku’s purpose is to enable all enterprises to imagine and deliver the data innovations of tomorrow.

ABOUT DATAIKU DSS (DATA SCIENCE STUDIO)

Dataiku DSS is a collaborative data science software platform that enables teams to explore, prototype, build, and deliver their own data products more efficiently. It is an open platform designed to accommodate rapidly evolving programming languages, big data storage and management technologies, and machine learning techniques, and is conceived to accommodate the needs and preferences of both beginning analysts and expert data scientists. It also uniquely supports:

Collaboration

Collaboration features make it easy to work as a team on ambitious data projects, to share knowledge amongst team members and to onboard new users much faster. You can add documentation, information or comments on all DSS objects.

Reproducibility

Every action in the system is versioned and logged through an integrated Git repository. Follow each action from the timeline in the interface, with easy rollback to previous versions.

Production Deployment

DSS lets you package a whole workflow as a single deployable and reproducible package. Automate your deployments as part of a larger production strategy. Run all your data scenarios using our REST API.

Governance and Security

DSS helps you create clearly defined projects and make sure your data is organized. And with fine-grained access rights, your data is available only to the right persons.

Try Dataiku DSS for free by visiting www.dataiku.com/try