A Linked Data-Based Decision Tree Classifier to Review Movies

Linked Data Mining Challenge at Know@LOD, ESWC 2015, Portorož, Slovenia, May 31st

Suad Aldarra, Emir Muñoz


Page 2: Imagine a movie

Vampire Female

Swords & Axes

Leather Pants

Chasing & Stabbing

Based on a video game

Sequel

Non-famous actors

Low budget

Rated R

Page 3: How to pick a good movie?

[Pipeline diagram. Labels: Movies Data (Training Set, Test Set); LOD Collection (Freebase, DBpedia); IMDB, OMDB, Metacritic; Movies KB; Learner; ML Model; Predictor; Evaluation]

Page 4: How to pick a good movie? Extracting Features

Sequel Film

Independent Film

Based on Literature

Freebase URL

Actor Gender

Director Date of Birth

Actor Date of Birth

OMDB API

MPAA Rating

Runtime

Genre

Directors

Actors

Language

Country

Budget

Gross

Actor Awards

Director Awards

Plot Keywords

Movie IMDB ID

Critics Textual Reviews

#Female Actors, #Male Actors

#Actors >50, #Actors <30, #Actors 30-50

Directors Oscar/Golden Globe Win/Nominated

Actors Oscar/Golden Globe Win/Nominated

#Good Keywords, #Bad Keywords, #Mostly Good, #Mostly Bad

High Budget, Low Budget

Gross > Budget

Common Language

Common Country

#Positive Reviews, #Negative Reviews, #Neutral Reviews

Director Gender

#Directors >50, #Directors <30, #Directors 30-50

Release Date

Released_weekend, Released_weekday
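To make the derived features on this slide concrete, here is a minimal sketch of how a few of them (actor age buckets, gender counts, weekend release, gross above budget) could be computed from one movie record. The record layout, the field names, the choice to measure age at the release date, and the inclusive 30-50 bucket are all assumptions made for illustration; the slides do not describe the actual implementation.

```python
from datetime import date

def age_bucket(birth: date, on: date) -> str:
    """Return '<30', '30-50' or '>50' for a person's age on a given date."""
    # Subtract one year if the birthday has not yet occurred in the year of `on`.
    age = on.year - birth.year - ((on.month, on.day) < (birth.month, birth.day))
    if age < 30:
        return "<30"
    if age <= 50:  # assumption: the 30-50 bucket is inclusive at both ends
        return "30-50"
    return ">50"

def derive_features(movie: dict) -> dict:
    """Compute a handful of derived features from a hypothetical movie record."""
    release = movie["release_date"]
    features = {
        "actors_<30": 0, "actors_30-50": 0, "actors_>50": 0,
        "female_actors": 0, "male_actors": 0,
    }
    for actor in movie["actors"]:
        # Assumption: ages are measured at the movie's release date.
        features["actors_" + age_bucket(actor["birth"], release)] += 1
        features[actor["gender"] + "_actors"] += 1
    # date.weekday() returns 0 for Monday through 6 for Sunday.
    features["released_weekend"] = release.weekday() >= 5
    features["gross_gt_budget"] = movie["gross"] > movie["budget"]
    return features

# Hypothetical record, not taken from the challenge data.
movie = {
    "release_date": date(2015, 5, 30),  # a Saturday
    "budget": 10_000_000,
    "gross": 25_000_000,
    "actors": [
        {"gender": "female", "birth": date(1990, 1, 1)},
        {"gender": "male", "birth": date(1960, 6, 15)},
    ],
}
features = derive_features(movie)
```

For the record above, one actor falls in the under-30 bucket and one in the over-50 bucket, and both boolean features are true.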

Page 5: How to pick a good movie? Training Classifier

241 features

RDF knowledge base (SPARQL)

Weka tool

Decision tree algorithm (best performance, handles nominal and numeric features, easy to visualise)

Accuracy on training set: 94% (1503/2000)
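The slide names Weka and a decision tree algorithm but not the exact learner or its settings. As a rough illustration of how a decision tree chooses a split on a numeric feature, the sketch below scores candidate thresholds by plain information gain. This is a simplification (C4.5-style learners such as Weka's J48 use a gain ratio and other refinements), and the feature values and labels are toy data, not the challenge data.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def information_gain(values, labels, threshold):
    """Entropy reduction from splitting into value <= threshold and value > threshold."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    if not left or not right:
        return 0.0  # a split that leaves one side empty tells us nothing
    weighted = (len(left) * entropy(left) + len(right) * entropy(right)) / len(labels)
    return entropy(labels) - weighted

def best_threshold(values, labels):
    """Return the observed value that gives the highest gain, or None if no split exists."""
    candidates = sorted(set(values))[:-1]  # the largest value cannot split the data
    if not candidates:
        return None
    return max(candidates, key=lambda t: information_gain(values, labels, t))

# Toy data: a hypothetical "share of negative reviews" feature, labels +1 good / -1 bad.
negative_ratio = [0.1, 0.2, 0.3, 0.5, 0.6, 0.8]
labels = [1, 1, 1, -1, -1, -1]
threshold = best_threshold(negative_ratio, labels)
gain = information_gain(negative_ratio, labels, threshold)
```

On this toy data the best threshold separates the two classes perfectly, so the gain equals the full one bit of entropy in the labels.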

Page 6: And the Oscar goes to...

Accuracy on test set: 91.75%
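The accuracy figures on the last two slides can be sanity-checked with simple arithmetic. The fraction 1503/2000 printed next to the 94% training accuracy works out to about 75%, so the denominator looks inconsistent with the percentage. The sketch below compares it with a denominator of 1600, and shows that 91.75% would correspond to 367 of 400 test movies. The sizes 1600 and 400 and the count 367 are assumptions used only to illustrate the arithmetic; none of them appears on the slides.

```python
def accuracy(correct: int, total: int) -> float:
    """Accuracy as a percentage."""
    return 100.0 * correct / total

as_printed = accuracy(1503, 2000)        # fraction as printed on the slide
assumed_training = accuracy(1503, 1600)  # assumption: 1600 training movies
assumed_test = accuracy(367, 400)        # assumption: 400 test movies, 367 correct

print(round(as_printed, 2))        # 75.15
print(round(assumed_training, 2))  # 93.94
print(round(assumed_test, 2))      # 91.75
```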

Page 7: Behind The Scenes

[Decision tree diagram. Node labels: Critics Negative Reviews, Critics Positive Reviews, #Good Keywords, #Bad Keywords, Genre: Documentary, Genre: Romance, Language: English, Language: German, #Actors Age <30, Release Date: Weekend. Leaves shown: +1 (352), -1 (8), +1 (3), +1 (22), -1 (653/12), +1 (7). Split thresholds shown: <=0.4 / >0.4 and <=0.3 / >0.3]
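The leaf labels in the diagram, such as "+1 (352)" and "-1 (653/12)", look like the notation Weka's J48 uses when it prints a tree: the predicted class, then the number of training instances reaching the leaf, then (after a slash) how many of those are misclassified. Assuming that convention applies here, the short sketch below parses the leaves visible on the slide and totals them. Only the leaves that survive in the transcript are included, so the totals do not cover the whole tree.

```python
import re

# Matches labels such as "+1 (352)" or "-1 (653/12)".
LEAF = re.compile(r"([+-]1) \((\d+)(?:/(\d+))?\)")

def parse_leaf(text):
    """Return (predicted_class, instances_reaching_leaf, misclassified)."""
    match = LEAF.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a leaf label: {text!r}")
    label, reached, wrong = match.groups()
    # A missing "/n" part means no misclassified instances at that leaf.
    return int(label), int(reached), int(wrong or 0)

leaves = ["+1 (352)", "-1 (8)", "+1 (3)", "+1 (22)", "-1 (653/12)", "+1 (7)"]
parsed = [parse_leaf(leaf) for leaf in leaves]
total_instances = sum(reached for _, reached, _ in parsed)
total_errors = sum(wrong for _, _, wrong in parsed)
```

Under this reading, the six visible leaves cover 1045 training instances with 12 misclassified, all at the single "-1 (653/12)" leaf.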

Page 8: Behind The Scenes

Good Keywords

1) frustration
2) melancholy
3) very little dialogue
4) looking out a window
5) film director
6) sin
7) reference to Friedrich Nietzsche
8) old friend
9) moral ambiguity
10) dressing

Bad Keywords

1) critically bashed
2) based on video game
3) Taser
4) pepper spray
5) worst picture razzie winner
6) spin off from video game
7) physical comedy
8) hung upside down
9) female vampire
10) dark heroine

Common Keywords

1) weapon
2) tourist
3) spider
4) sexual abuse
5) Santa Claus
6) rome italy
7) queen
8) mentor
9) hollywood California
10) black cop
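The slides do not say how keywords were assigned to the good, bad and common groups. One plausible reading, shown in the sketch below, is that a keyword counts as "good" if it occurs only in positively labelled training movies, "bad" if it occurs only in negatively labelled ones, and "common" otherwise; a movie's keyword feature is then the share of its keywords in a given group. Both the rule and the ratio-style feature are assumptions for illustration, and the tiny training set is made up from keywords listed on the slide.

```python
from collections import defaultdict

def categorise_keywords(movies):
    """movies: iterable of (keywords, label) pairs, label +1 (good) or -1 (bad)."""
    seen = defaultdict(set)  # keyword -> set of labels it occurred with
    for keywords, label in movies:
        for keyword in keywords:
            seen[keyword].add(label)
    categories = {}
    for keyword, labels in seen.items():
        if labels == {1}:
            categories[keyword] = "good"
        elif labels == {-1}:
            categories[keyword] = "bad"
        else:
            categories[keyword] = "common"
    return categories

def keyword_ratio(keywords, categories, wanted):
    """Share of a movie's keywords in the wanted category (unseen keywords count as none)."""
    if not keywords:
        return 0.0
    return sum(categories.get(k) == wanted for k in keywords) / len(keywords)

# Hypothetical training movies.
training = [
    (["melancholy", "weapon"], 1),
    (["female vampire", "weapon"], -1),
]
categories = categorise_keywords(training)
new_movie = ["female vampire", "based on video game", "weapon", "melancholy"]
bad_ratio = keyword_ratio(new_movie, categories, "bad")
```

Here "weapon" ends up as common because it appears with both labels, and one of the new movie's four keywords is in the bad group.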

Page 9: Behind The Scenes

Ranked Features

1) critics negative review
2) critics positive review
3) good keywords
4) bad keywords
5) country: USA
6) genre: Documentary
7) language: English
8) mostly good keywords
9) mostly bad keywords
10) MPAA: PG-13

Only 3 features from linked data in the top 10

• Linked Data alone is not enough

• DBpedia needs quality improvement and more interlinking
