Learning Probabilistic Relational Models using Non-Negative Matrix Factorization

Anthony Coutant, Philippe Leray, Hoel Le Capitaine
DUKe (Data, User, Knowledge) Team, LINA

26th June, 2014

7ème Journées Francophones sur les Réseaux Bayésiens et les Modèles Graphiques Probabilistes

Context

• Probabilistic Relational Models (PRM)
– Attribute uncertainty in relational datasets

• Relational datasets: attributes + links

• PRMs with Reference Uncertainty (PRM-RU) model link uncertainty

• Partitioning individuals is necessary in PRM-RU

Problem & Proposal

• PRM-RU partitions individuals based on attributes only

• We propose to cluster the relationship information instead

• We show that:

– Attributes partitioning does not explain all relationships

– Relational partitioning can explain attributes-oriented relationships

Flat datasets – Bayesian Networks

• Individuals assumed i.i.d.

Structure: Grade 1 (G1) → Ranking (R) ← Grade 2 (G2)

P(G1)   A      B
        0.25   0.75

P(G2)   A      B
        0.25   0.75

Dataset
G1   G2   R
A    B    1st
B    A    1st
B    B    2nd
B    B    2nd

P(R | G1, G2)   A,A   A,B   B,A   B,B
1st division    0.8   0.5   0.5   0.2
2nd division    0.2   0.5   0.5   0.8
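As an illustration of how this flat Bayesian network is used, the sketch below computes the marginal P(R) from the tables on this slide. It assumes the factorization P(G1) P(G2) P(R | G1, G2), which is what the conditional table suggests; the function and variable names are illustrative only.

```python
from itertools import product

# Priors over the two grades, as given on the slide.
p_g1 = {"A": 0.25, "B": 0.75}
p_g2 = {"A": 0.25, "B": 0.75}

# P(R | G1, G2), indexed by (g1, g2), then by ranking value.
p_r_given_g = {
    ("A", "A"): {"1st": 0.8, "2nd": 0.2},
    ("A", "B"): {"1st": 0.5, "2nd": 0.5},
    ("B", "A"): {"1st": 0.5, "2nd": 0.5},
    ("B", "B"): {"1st": 0.2, "2nd": 0.8},
}

def joint(g1, g2, r):
    """P(G1, G2, R), assuming the factorization P(G1) P(G2) P(R | G1, G2)."""
    return p_g1[g1] * p_g2[g2] * p_r_given_g[(g1, g2)][r]

def marginal_ranking(r):
    """P(R = r), obtained by summing the joint over both grades."""
    return sum(joint(g1, g2, r) for g1, g2 in product("AB", repeat=2))

print(round(marginal_ranking("1st"), 4))  # 0.35
```

Because the individuals are assumed i.i.d., the same three tables apply to every row of the dataset.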

Relational datasets – Relational schema

[Figure: relational schema and one instance.
Schema: Student (Intelligence, Ranking), Registration (Grade, Satisfaction), Course (Difficulty, Evaluation); Registration is linked to Student and to Course with cardinalities 1,n and 1.
Instance: Student Jane Doe (Intelligence = high, Ranking = 1st division); Registration #4563 (Grade = A, Satisfaction = high); Course Phil101 (Difficulty = high, Evaluation = high).]

Probabilistic Relational Models (PRM)

[Figure: a PRM defined over the relational schema, an instance, and the resulting ground Bayesian network.
Schema: Student (Intelligence, Ranking), Registration (Grade, Satisfaction), Course (Difficulty, Evaluation), with cardinalities 1,n and 1.
PRM: dependencies among Course.Difficulty, Course.Evaluation, Registration.Grade, Registration.Satisfaction, Student.Intelligence and Student.Ranking; multi-valued parents go through MEAN aggregators.
Instance: Courses Math and Phil; Registrations #4563, #5621 and #6251; Students Jane Doe and John Smith; all attribute values unknown (???).
GBN (Ground Bayesian Network): Math.Diff, Math.Eval, Phil.Diff, Phil.Eval, #4563.Grade, #5621.Grade, #6251.Grade, #4563.Satis, #5621.Satis, #6251.Satis, JD.Int, JS.Int, JD.Rank, JS.Rank, with MEAN aggregator nodes.]

P(R | MEAN(G))   A     B
1st division     0.8   0.2
2nd division     0.2   0.8
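The grounding step, which unrolls a PRM over an instance into a GBN, can be sketched as follows. The object and attribute names come from the slide; the links between registrations, students and courses are hypothetical, since the transcript does not preserve them, and only the Grade → MEAN → Ranking dependency (the one given by the table P(R | MEAN(G))) is grounded here.

```python
# Hypothetical link structure: the slides do not say which registration
# belongs to which student or course.
registrations = {
    "#4563": {"student": "JD", "course": "Phil"},
    "#5621": {"student": "JS", "course": "Phil"},
    "#6251": {"student": "JS", "course": "Math"},
}
students = ["JD", "JS"]
courses = ["Math", "Phil"]

def ground_nodes():
    """One random variable per (object, attribute) pair of the instance."""
    nodes = [f"{c}.{a}" for c in courses for a in ("Diff", "Eval")]
    nodes += [f"{r}.{a}" for r in registrations for a in ("Grade", "Satis")]
    nodes += [f"{s}.{a}" for s in students for a in ("Int", "Rank")]
    return nodes

def ranking_parents(student):
    """Grade nodes feeding the MEAN aggregator that is the parent of student.Rank."""
    return sorted(
        f"{reg}.Grade"
        for reg, link in registrations.items()
        if link["student"] == student
    )

print(len(ground_nodes()))    # 14
print(ranking_parents("JS"))  # ['#5621.Grade', '#6251.Grade']
```

The number of parents of a Rank node depends on the instance, which is why the PRM needs an aggregator such as MEAN to keep a single conditional table per attribute.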

Uncertainty in Relational datasets

[Figure: two views of the same instance (Student Jane Doe, Registration #4563, Course Phil101), with a few observed values (Grade = A, Evaluation = high) and the remaining attributes unknown (???).
• Attributes uncertainty (PRM): the links between objects are known
• Attributes and link uncertainty (PRM extensions): the links themselves are unknown (?)]

PRM with reference uncertainty

• Reference uncertainty: P(r.Course = ci, r.Student = sj | r.exists = true)
• A random variable for each individual id? Not generalizable
• Solution: partitioning

[Figure: Course (Difficulty, Evaluation), Registration (Course, Student) and Student (Intelligence, Ranking); open questions on the reference slots: P(Student | Course.Difficulty)? P(Course)?]

PRM with reference uncertainty (continued)

• P(Student | ClusterStudent) follows a uniform distribution

[Figure: Course (Difficulty, Evaluation), Registration (ClusterCourse, Course, ClusterStudent, Student) and Student (Intelligence, Ranking).]

P(CStudent | S.Intelligence)   low   high
C1                             0     1
C2                             1     0

P(Student | CStudent)   C1   C2
s1                      0    1
s2                      1    0

[Figure: partition function on Student attributes (labels: high, low, Bio) with leaves C1 and C2; student population statistics: 50% / 50%.]
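A minimal sketch of the two-stage selection described on this slide: a partition function assigns each student to a cluster, then the student is drawn uniformly inside the cluster. The intelligence values of s1 and s2 are assumptions chosen to be consistent with the two tables above; the function names are illustrative.

```python
# Assumed attribute values, chosen so that the result matches the slide's tables.
intelligence = {"s1": "low", "s2": "high"}

def partition(student):
    """Deterministic partition function: high -> C1, low -> C2."""
    return "C1" if intelligence[student] == "high" else "C2"

def p_student_given_cluster(student, cluster):
    """Uniform over the members of the cluster, zero outside it."""
    members = [s for s in intelligence if partition(s) == cluster]
    return 1.0 / len(members) if student in members else 0.0

for s in ("s1", "s2"):
    print(s, [p_student_given_cluster(s, c) for c in ("C1", "C2")])
# s1 [0.0, 1.0]
# s2 [1.0, 0.0]
```

With more students per cluster the same function returns 1 / |cluster|, so the number of parameters depends on the number of clusters and not on the number of individuals.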

Attributes-oriented Partition Functions in PRM-RU

• PRM-RU: clustering from attributes
• Assumption: attributes explain the relationship
• Not generalizable: relationship information is not used for partitioning

[Figure: two Course/Student relationship graphs with colored individuals.
Case 1: P(Green | Red) = 1, P(Purple | Blue) = 1. Do attributes explain the relationship? YES.
Case 2: P(Green | Red) = 0.5, P(Purple | Red) = 0.5. Do attributes explain the relationship? NO.]

Relationship-oriented Partitioning

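• Objective: find the partitioning that maximizes intra-partition edges

[Figure: Course/Student relationship graph split into partitions p1 and p2.
With relationship-oriented partitions: P(Student.p1 | Course.p1) = 1, P(Student.p2 | Course.p2) = 1.
With attribute colors: P(Green | Red) = 0.5, P(Purple | Red) = 0.5.]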
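The objective can be made concrete by counting intra-partition edges for two candidate partitionings. The toy relation and both partitionings below are hypothetical; the sketch only shows how the criterion ranks them.

```python
def intra_partition_edges(edges, course_part, student_part):
    """Count edges whose two endpoints carry the same partition label."""
    return sum(1 for c, s in edges if course_part[c] == student_part[s])

# Hypothetical toy relation: c1, c2 link to s1, s2; c3, c4 link to s3, s4.
edges = [("c1", "s1"), ("c1", "s2"), ("c2", "s1"),
         ("c3", "s3"), ("c4", "s3"), ("c4", "s4")]

# Partitioning that follows the link structure.
relational = (
    {"c1": "p1", "c2": "p1", "c3": "p2", "c4": "p2"},
    {"s1": "p1", "s2": "p1", "s3": "p2", "s4": "p2"},
)
# Partitioning that ignores the links, e.g. one driven by attributes alone.
link_blind = (
    {"c1": "p1", "c2": "p2", "c3": "p1", "c4": "p2"},
    {"s1": "p1", "s2": "p2", "s3": "p1", "s4": "p2"},
)

print(intra_partition_edges(edges, *relational))  # 6
print(intra_partition_edges(edges, *link_blind))  # 3
```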

Experiments – Protocol – Dataset generation

[Figure: generated schema and instances.
Schema: Entity 1 (Att 1, …, Att n) and Entity 2 (Att 1, …, Att n), linked by a relationship R with cardinalities 1,n and 1.
Instances: two Entity 1 / Entity 2 relationship graphs over R, one for the attributes partitioning favorable case and one for the relationship partitioning favorable case.]
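The slides do not detail the generator, so the following is only a hypothetical sketch of how the two cases could be produced. It assumes that n is the number of individuals per entity and k the number of clusters (the two parameters that index the results grids), that links only connect individuals of the same cluster, and that a single attribute either copies the cluster (attributes favorable case) or is drawn independently of it (relationship favorable case).

```python
import random

def generate(n, k, attrs_determine_cluster, seed=0):
    """Hypothetical generator: two entities of n individuals, k latent clusters."""
    rng = random.Random(seed)

    def make_entity():
        rows = []
        for _ in range(n):
            cluster = rng.randrange(k)
            # Attribute either copies the cluster or is independent of it.
            att = cluster if attrs_determine_cluster else rng.randrange(k)
            rows.append({"cluster": cluster, "att": att})
        return rows

    e1, e2 = make_entity(), make_entity()
    edges = []
    for i, row in enumerate(e1):
        # Link each Entity 1 individual to one Entity 2 individual of its cluster.
        same = [j for j, other in enumerate(e2) if other["cluster"] == row["cluster"]]
        if same:
            edges.append((i, rng.choice(same)))
    return e1, e2, edges

e1, e2, edges = generate(n=25, k=4, attrs_determine_cluster=False)
print(len(edges) <= 25)  # True
```

In the independent case the attributes carry no information about the links, so only a partitioning computed from R itself can recover the clusters.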

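Experiments – Protocol – Learning

[Figure: structure used for learning, with Entity 1 (Att 1, …, Att n), Entity 2 (Att 1, …, Att n) and Relation (reference slots E1 and E2 with cluster nodes CE1 and CE2).]

• Parameter learning on a predefined structure
• 2 PRMs compared:
– Either with attributes partitioning
– Or with relational partitioning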

Experiments – Protocol – Evaluation

• For each generated dataset D
– Split D into 10 subsets {D1, …, D10}
– Perform 10-fold cross-validation, each fold using one Di for testing and the others for training
  • For the PRM with attributes partitioning: store the 10 log-likelihoods PattsLL[i]
  • For the PRM with relationship partitioning: store the 10 log-likelihoods PrelLL[i]
– Compute the mean and standard deviation of PattsLL[i] and PrelLL[i]
– Evaluate the significance of relationship partitioning over attributes partitioning
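The protocol above can be sketched as follows. The slides do not name the significance test, so the paired t statistic used here is an assumption; `ten_folds` and `compare` are illustrative names, and the log-likelihood lists are expected to come from the two learned PRMs.

```python
import math
import statistics

def ten_folds(items, k=10):
    """Split items into k folds; fold i holds every k-th item starting at i."""
    return [items[i::k] for i in range(k)]

def compare(patts_ll, prel_ll):
    """Mean and sd of each model's fold log-likelihoods, plus a paired t statistic.

    The paired t statistic is an assumed choice of significance test.
    """
    diffs = [rel - att for att, rel in zip(patts_ll, prel_ll)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)
    if sd_diff == 0:
        t_stat = 0.0 if mean_diff == 0 else math.copysign(math.inf, mean_diff)
    else:
        t_stat = mean_diff / (sd_diff / math.sqrt(len(diffs)))
    return {
        "atts": (statistics.mean(patts_ll), statistics.stdev(patts_ll)),
        "rel": (statistics.mean(prel_ll), statistics.stdev(prel_ll)),
        "t": t_stat,
    }
```

A positive t statistic favors relationship partitioning; it would be compared with the critical value of a Student distribution with 9 degrees of freedom for 10 folds.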

Experiments – Results

[Figure: two grids of results, one cell per (k, n) pair with k in {2, 4, 16} and n in {25, 50, 100, 200}; cell colors not preserved.
Grid 1: random clusters (independent from attributes).
Grid 2: Attributes => Cluster (fully dependent on attributes).
Legend: Green: Relational > Attributes partitioning; Red: Attributes > Relational partitioning; Orange: partitionings not significantly comparable.]

Experiments – About the NMF choice for partitioning

• NMF
– Finds low-dimensional factor matrices whose product approximates the original matrix
– A relationship between two entities is an adjacency matrix

• Motivation for NMF usage
– (Restrictively) captures latent information from both rows and columns: co-clustering
– Several extensions dedicated to more accurate co-clustering (NMTF)
– Extensions for Laplacian regularization
  • Allow capturing both attribute and relationship information for clustering
– Extensions for tensor factorization
  • Allow modeling n-ary relationships, n >= 2
– NMF = good starting choice for the long-term needs?
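As an illustration of NMF used for co-clustering an adjacency matrix, here is a minimal pure-Python sketch based on the standard multiplicative update rules for the Frobenius objective. It is not the authors' implementation: the initialization, the iteration count, the `eps` guard against division by zero and the argmax cluster assignment are all assumptions.

```python
import random

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def frobenius_error(a, w, h):
    """Squared Frobenius norm of A - WH."""
    wh = matmul(w, h)
    return sum((x - y) ** 2 for ra, rb in zip(a, wh) for x, y in zip(ra, rb))

def nmf(a, rank, iters=200, seed=0, eps=1e-9):
    """Factor a non-negative matrix A (n x m) into W (n x rank) and H (rank x m)."""
    rng = random.Random(seed)
    n, m = len(a), len(a[0])
    w = [[rng.uniform(0.1, 1.0) for _ in range(rank)] for _ in range(n)]
    h = [[rng.uniform(0.1, 1.0) for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        # H <- H * (W^T A) / (W^T W H)
        wt = transpose(w)
        num, den = matmul(wt, a), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        # W <- W * (A H^T) / (W H H^T)
        ht = transpose(h)
        num, den = matmul(a, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
    return w, h

def clusters(factor_rows):
    """Assign each row to the latent dimension with the largest weight."""
    return [max(range(len(row)), key=row.__getitem__) for row in factor_rows]

# Hypothetical adjacency matrix: rows are Entity 1 individuals, columns Entity 2.
adjacency = [[1, 1, 0, 0],
             [1, 1, 0, 0],
             [0, 0, 1, 1],
             [0, 0, 1, 1]]
w, h = nmf(adjacency, rank=2)
row_clusters = clusters(w)             # clusters for Entity 1
col_clusters = clusters(transpose(h))  # clusters for Entity 2
```

Rows of W give the cluster memberships of one entity and columns of H those of the other, which is the co-clustering property the slide refers to.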

Experiments – About the NMF choice for partitioning

• But
– Performance problems in the experiments
– Very sensitive to initialization: crashes whenever it reaches a singular state
– Moving toward large-scale methods: graph-based relational clustering?

Conclusion

• PRM-RU to define probability structure in relational datasets
• Need for partitioning
• PRM-RU uses attributes-oriented partitioning
• We propose to cluster the relationship information instead
• Experiments show that:
– Attributes partitioning does not explain all relationships
– Relational partitioning can explain attributes-oriented relationships

Perspectives

• Experiments on real life datasets

• Towards large scale partitioning methods

• PRM-RU Structure Learning using clustering algorithms

• What about other link uncertainty representations?

Questions?

(anthony.coutant | philippe.leray | hoel.lecapitaine)@univ-nantes.fr