Chang Liu Insight 2014

Post on 09-Aug-2015

True Fit Skin Care

Chang Liu

Fellow at Insight Data Science 2014

What makes it so hard? Overwhelming information

So many products…

So many reviews…

Reviews can be so long…

So many ingredients…

Time spent

Money wasted

Happiness

Collaborative Filter using User Reviews from Sephora.com

32k Reviewers
• w/ 2+ reviews

~1200 Products
• ~80 brands
• 8 categories

184k Reviews
• Rating [1-5]
• Review text
• Quick take

[Figure: rating matrix with reviewers 1 … N as rows and products X, Y, … as columns]

Algorithm:
• Item-centric collaborative filter
• Pearson’s correlation coefficients to measure pairwise similarity
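The item-centric filter named above can be sketched roughly as follows. This is a minimal illustration, not the code behind the slides: the function names, the toy rating matrix, and the choice to ignore negative similarities when predicting are all assumptions made for the example.

```python
import numpy as np

def pearson_item_similarity(ratings):
    """Pairwise Pearson correlation between product columns.

    `ratings` is a reviewers x products array with np.nan for missing
    ratings. Each product pair is compared only on reviewers who rated both.
    """
    n_items = ratings.shape[1]
    sim = np.eye(n_items)  # off-diagonal entries start at 0
    for i in range(n_items):
        for j in range(i + 1, n_items):
            both = ~np.isnan(ratings[:, i]) & ~np.isnan(ratings[:, j])
            if both.sum() < 2:
                continue  # too few co-ratings; leave similarity at 0
            x, y = ratings[both, i], ratings[both, j]
            xc, yc = x - x.mean(), y - y.mean()
            denom = np.sqrt((xc ** 2).sum() * (yc ** 2).sum())
            if denom > 0:
                sim[i, j] = sim[j, i] = (xc * yc).sum() / denom
    return sim

def predict_rating(ratings, sim, user, item):
    """Similarity-weighted average of the reviewer's other ratings."""
    rated = ~np.isnan(ratings[user])
    rated[item] = False
    # ASSUMPTION: negatively correlated products are ignored, not inverted.
    weights = np.clip(sim[item, rated], 0, None)
    if weights.sum() == 0:
        return float("nan")
    return float(weights @ ratings[user, rated] / weights.sum())

# Hypothetical toy data: 5 reviewers x 3 products, nan = not reviewed.
nan = np.nan
toy = np.array([
    [5, 4, 1],
    [4, 5, 2],
    [1, 2, 5],
    [2, 1, nan],
    [5, nan, 1],
], dtype=float)

sim = pearson_item_similarity(toy)
guess = predict_rating(toy, sim, user=4, item=1)
```

In this toy matrix the first two products are rated alike and the third is rated oppositely, so the prediction for reviewer 4 on the second product leans entirely on their rating of the first.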

Cross Validation
• 5-fold for reviewer
• Leave-one-out for product
• Accuracy = 86.3% ± 1%
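One way to read this validation scheme is sketched below: reviewers are split into 5 folds, and for each held-out reviewer every rated product is hidden in turn and predicted from the rest. The slides do not define "accuracy", so the like/dislike threshold used here is an assumption, as are the function names and the synthetic data.

```python
import numpy as np

def item_similarity(r):
    """Pearson correlation between product columns over co-rating reviewers."""
    n = r.shape[1]
    sim = np.eye(n)
    for i in range(n):
        for j in range(i + 1, n):
            both = ~np.isnan(r[:, i]) & ~np.isnan(r[:, j])
            if both.sum() >= 2:
                x, y = r[both, i], r[both, j]
                if x.std() > 0 and y.std() > 0:
                    sim[i, j] = sim[j, i] = np.corrcoef(x, y)[0, 1]
    return sim

def cross_validate(r, n_folds=5, like=4, seed=0):
    """K-fold over reviewers, leave-one-out over each reviewer's products.

    ASSUMPTION: a prediction is "correct" when it falls on the same side of
    the `like` threshold as the true rating.
    """
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(r.shape[0]), n_folds)
    scores = []
    for test_idx in folds:
        # Similarities come from the training reviewers only.
        sim = item_similarity(np.delete(r, test_idx, axis=0))
        hits = total = 0
        for u in test_idx:
            rated = np.flatnonzero(~np.isnan(r[u]))
            for item in rated:
                others = rated[rated != item]  # hide the target product
                w = np.clip(sim[item, others], 0, None)
                if w.sum() == 0:
                    continue  # nothing positively similar to predict from
                pred = w @ r[u, others] / w.sum()
                hits += int((pred >= like) == (r[u, item] >= like))
                total += 1
        if total:
            scores.append(hits / total)
    return float(np.mean(scores)), float(np.std(scores))

# Synthetic data: 40 reviewers with one of two tastes over 6 products.
rng = np.random.default_rng(1)
taste = rng.integers(0, 2, size=40)
data = np.where(taste[:, None] == np.array([0, 0, 0, 1, 1, 1]), 5.0, 2.0)
data[rng.random(data.shape) < 0.2] = np.nan  # drop ~20% of ratings

acc, spread = cross_validate(data)
```

The function returns the mean and standard deviation of per-fold accuracy, which is the same shape as the "86.3% ± 1%" figure, though the numbers from synthetic data say nothing about the real result.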

Visualize the similarity matrix

White = high similarity
Black = low similarity

Sorted by brands alphabetically

White in a square = Users’ reviews are similar for all products in a brand = Strong customer loyalty
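The "white square" reading can also be expressed numerically. The sketch below averages the similarity inside each brand's diagonal block; the function name, brand labels, and similarity values are hypothetical, and the slides make this comparison visually rather than with a score.

```python
import numpy as np

def brand_loyalty_scores(sim, brands):
    """Mean off-diagonal similarity inside each brand's block of `sim`."""
    brands = np.asarray(brands)
    scores = {}
    for brand in np.unique(brands):
        idx = np.flatnonzero(brands == brand)
        if len(idx) < 2:
            continue  # a single product has no within-brand pairs
        block = sim[np.ix_(idx, idx)]
        off_diag = block[~np.eye(len(idx), dtype=bool)]
        scores[str(brand)] = float(off_diag.mean())
    return scores

# Hypothetical 4-product similarity matrix, products sorted by brand.
toy_sim = np.array([
    [1.0, 0.9, 0.1, 0.1],
    [0.9, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.2],
    [0.1, 0.1, 0.2, 1.0],
])
loyalty = brand_loyalty_scores(toy_sim, ["A", "A", "B", "B"])
```

A brand whose block averages close to 1 would show up as a bright square on the heat map.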

There are structures! For example…

“Organic & Natural”

Expensive!

Cost effective

Actionable Insights

For Sephora.com: Send marketing emails to new customers of brands with stronger customer loyalty!

Chang Liu
PhD in Civil Engineering @CMU
J8D8L5@gmail.com
linkedin.com/in/changliucmu
github.com/R4trtry

Is the rating a good measure of reviewers’ perspective?

• Trained a naïve Bayes classifier for sentiment analysis

• W/ 250 thousand reviews from Birchbox.com

• A website that sends out free samples from smaller brands and gathers massive user reviews
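For readers unfamiliar with the technique, a naïve Bayes sentiment classifier over bag-of-words counts can be sketched in plain Python as below. The class name, the whitespace tokenizer, the Laplace smoothing constant, and the four training sentences are all assumptions for illustration; the slides do not say which library or features were used.

```python
import math
from collections import Counter, defaultdict

class NaiveBayesSentiment:
    """Multinomial naive Bayes with add-alpha (Laplace) smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter()              # label -> number of reviews
        self.vocab = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            words = text.lower().split()
            self.word_counts[label].update(words)
            self.doc_counts[label] += 1
            self.vocab.update(words)
        return self

    def log_scores(self, text):
        """Log prior plus summed log likelihoods for each label."""
        total_docs = sum(self.doc_counts.values())
        scores = {}
        for label, n_docs in self.doc_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + self.alpha * len(self.vocab)
            score = math.log(n_docs / total_docs)
            for word in text.lower().split():
                if word in self.vocab:  # skip words never seen in training
                    score += math.log((counts[word] + self.alpha) / denom)
            scores[label] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)

# Hypothetical training reviews, not taken from the Birchbox data.
model = NaiveBayesSentiment().fit(
    ["love this cream gorgeous glowing skin",
     "worth every penny",
     "total garbage ineffective",
     "mediocre and worthless smell"],
    ["pos", "pos", "neg", "neg"],
)
positive_guess = model.predict("gorgeous glowing")
negative_guess = model.predict("worthless garbage")
```

The "naïve" part is the assumption that words are independent given the label, which is what lets the score be a simple sum of per-word log probabilities.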

Most common words        Most informative features

Word      Count          Negative       Positive
skin      91349          re-wash        Penny
product   82481          garbage        hook
use       64044          mediocre       gorgeous
love      55691          ketchup        perk
feel      47879          trash          stock
face      42615          unimpressive   glowing
like      41427          survey         splurge
great     34155          ineffective    effortless
really    31672          gag            Christmas
smell     27621          worthless      happily

            Text     Quick take
Precision   95.3%    85.4%
Recall      89.8%    93.1%
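As a reminder of what these two metrics measure, here is a small sketch that computes them from true and predicted labels. The function name and the toy labels are illustrative, and which class the slide treats as "positive" is not stated, so that parameter is an assumption.

```python
def precision_recall(y_true, y_pred, positive="pos"):
    """Precision and recall for one class, from paired label lists."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0  # of predicted positives, how many are right
    recall = tp / (tp + fn) if tp + fn else 0.0     # of true positives, how many were found
    return precision, recall

# Hypothetical labels: 2 true positives, 1 false positive, 1 false negative.
truth = ["pos", "pos", "pos", "neg", "neg"]
preds = ["pos", "pos", "neg", "pos", "neg"]
precision, recall = precision_recall(truth, preds)
```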

Worth every penny!

Another Validation

[Figure: Product X vs. Product Y, similarity 87.4%]

[Figure: rating matrix with reviewers 1 … N as rows and products X, Y, … as columns]

Algorithm: Item-centric collaborative filter