Mei gao practicedemo_1


Problem: What/Where to eat?

Mei Gao

Yelp Challenge Dataset

Topic 0 (Japanese): sushi, roll, pita, tuna, salmon
Topic 1 (Mexican): tacos, mexican, salsa, burrito, chips
Topic 2 (Brunch): breakfast, coffee, eggs, bacon, pancakes
Topic 3 (Bar/Atmosphere): great, beer, happy hour, bar, drinks
Topic 4 (Pizza): pizza, crust, wings, thin, pepperoni
Topic 5 (Compliment): great, best, love, good, like
Topic 6 (Indian): Indian, buffet, masala, naan, bianco
Topic 7 (Asian): Thai, pho, Chinese, soup, curry
Topic 8 (Fast food): burger, fries, potato, onion ring, dog
Topic 9 (Sweets): bagels, cheese, best, smoothies, love
Topic 10 (BBQ): cheese, bbq, sauce, chicken, ribs
Topic 11 (Service, bad): service, didn't, never, even, back

LDA (Latent Dirichlet Allocation), 12 topics

Good restaurant: average star rating > 3.5
Bad restaurant: average star rating <= 3.5

Classification: weight for each topic

Classifier accuracy in cross-validation:
Linear SVM: 73.67%
Logistic Regression: 81.19%
Random Forest: 77.7%
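The setup above (good/bad label from the 3.5-star threshold, per-topic weights as features, cross-validated accuracy) might be sketched as follows. This assumes scikit-learn, a made-up set of eight restaurants, 3 topics, and 2 folds so that it runs on toy data; none of these values come from the slides, and only logistic regression is shown.

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Hypothetical data: pooled review text and average star rating per restaurant.
reviews = [
    "sushi roll tuna salmon great best",
    "tacos salsa burrito chips great love",
    "breakfast coffee eggs bacon best good",
    "pizza crust wings pepperoni love like",
    "sushi roll service never back",
    "tacos burrito service never even",
    "coffee eggs service never even back",
    "pizza wings service even back",
]
avg_stars = [4.5, 4.0, 4.2, 3.9, 2.5, 3.5, 3.0, 2.0]

# Label rule from the slide: good if average star > 3.5, otherwise bad.
labels = [1 if stars > 3.5 else 0 for stars in avg_stars]

# Topic weights: each row is one restaurant's distribution over topics.
bow = CountVectorizer().fit_transform(reviews)
lda = LatentDirichletAllocation(n_components=3, random_state=0)
topic_weights = lda.fit_transform(bow)

# Cross-validated accuracy of a classifier trained on the topic weights.
scores = cross_val_score(LogisticRegression(), topic_weights, labels, cv=2)
print("fold accuracies:", scores)
```

In a real experiment the vectorizer and LDA would normally be fitted inside each training fold (for example with a scikit-learn `Pipeline`) so that held-out reviews do not influence the topics.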

Evaluation of Recommendation Error Using the Normalized Distance-based Performance Measure (NDPM)

Recommendation with LDA    Rank by user's actual ratings
Restaurant_1               Restaurant_1
Restaurant_2               Restaurant_3
Restaurant_3               Restaurant_7
Restaurant_4               Restaurant_2
Restaurant_5               Restaurant_4
Restaurant_6               Restaurant_9
Restaurant_7               Restaurant_8
Restaurant_8               Restaurant_5
Restaurant_9               Restaurant_10
Restaurant_10              Restaurant_6
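NDPM compares the system's ranking with the user's own ranking by counting the item pairs on which the two orderings disagree: 0 means identical order, 1 means fully reversed. Below is a minimal sketch applied to the two rankings above, assuming strict rankings with no ties, in which case the measure reduces to the fraction of contradictory pairs. The function name `ndpm` and the list names are illustrative, and the slides do not show the author's implementation.

```python
from itertools import combinations


def ndpm(system_ranking, user_ranking):
    """NDPM for two strict rankings (best first) of the same items.

    With no ties, NDPM = contradictory pairs / all pairs the user ranks.
    """
    sys_pos = {item: i for i, item in enumerate(system_ranking)}
    usr_pos = {item: i for i, item in enumerate(user_ranking)}

    contradictory = 0
    total = 0
    for a, b in combinations(user_ranking, 2):
        total += 1
        # The pair is contradictory when the two rankings order it oppositely.
        if (sys_pos[a] - sys_pos[b]) * (usr_pos[a] - usr_pos[b]) < 0:
            contradictory += 1
    return contradictory / total


# Rankings as listed on the slide.
lda_ranking = [f"Restaurant_{i}" for i in range(1, 11)]
user_ranking = [f"Restaurant_{i}" for i in (1, 3, 7, 2, 4, 9, 8, 5, 10, 6)]

score = ndpm(lda_ranking, user_ranking)
print(f"NDPM = {score:.3f}")
```

When the system ranking contains ties, the full definition also counts pairs the user orders but the system leaves tied, each at half the weight of a contradictory pair; that term is omitted here because both lists are strict.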

Assessment of LDA

                          BOW (Bag of Words)            LDA
Feature dimension         10,000 words in dictionary    15 topics    (>99% dimension reduction)
Computation efficiency    2.5 hrs                       15 min       (>90% reduction in computation time)

(2000 samples, 10-fold cross-validation)
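The reduction figures follow directly from the numbers quoted above. This short check uses only those numbers; the variable names are illustrative.

```python
# Figures quoted on the slide.
bow_features = 10_000    # words in the dictionary
lda_features = 15        # topics
bow_minutes = 2.5 * 60   # 2.5 hrs
lda_minutes = 15

# Relative reduction = 1 - (new / old).
dim_reduction = 1 - lda_features / bow_features
time_reduction = 1 - lda_minutes / bow_minutes

print(f"dimension reduction: {dim_reduction:.2%}")  # 99.85%
print(f"time reduction: {time_reduction:.0%}")      # 90%
```

With the rounded timings given on the slide, the saving in computation time works out to about 90%.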

DEEP LEARNING FOR IMAGE RANKING

Deep hierarchical abstraction

Learning structure of images

Topic 0 (Japanese): sushi, roll, pita, tuna, salmon
Topic 1 (Mexican): tacos, mexican, salsa, burrito, chips
Topic 2 (Brunch): breakfast, coffee, eggs, bacon, pancakes
Topic 3 (Bar/Atmosphere): great, beer, happy hour, bar, drinks
Topic 4 (Pizza): pizza, crust, wings, thin, pepperoni
Topic 5 (Indian): Indian, buffet, masala, naan, bianco
Topic 6 (Asian): Thai, pho, Chinese, soup, curry
Topic 7 (Fast food): burger, fries, potato, onion ring, dog
Topic 8 (Sweets): bagels, cheese, best, smoothies, love
Topic 9 (BBQ): cheese, bbq, sauce, chicken, ribs
Topic 10 (Compliment): great, best, love, good, like
Topic 11 (Service, bad): service, didn't, never, even, back

LDA (Latent Dirichlet Allocation), 15 topics

Assessment of LDA

Dimension reduction: 99% reduction in dimension
BOW (bag of words) features: 10,000
LDA features: 15 topics

Computation efficiency: 2000 samples, 10-fold cross-validation
BOW features: 2.5 hrs
LDA features: 15 min

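[Pie chart: percentage by topic. Categories: others, Japanese, Mexican, Brunch, Bar, Service, Compliment, Asian, Fast food, BBQ. Slice values (category order not recoverable from the transcript): 3%, 4%, 10%, 4%, 5%, 29%, 20%, 6%, 3%, 16%.]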