Using Neural Networks to aggregate Linked Data rules

Ilaria Tiddi, Mathieu d’Aquin, Enrico Motta

Knowledge Discovery patterns

• Useful, explicit information about some collected data.

• Problems

Quantity - too many

Quality - not interesting

raw data → Preprocessing → clean data → Mining → PATTERNS → Interpreting → Knowledge

Dedalo: interpreting patterns with Linked Data

• Given the patterns of a clustering process

• Find explanations for one cluster with information from Linked Data

KMi researchers grouped together according to their co-authorship

• Linked Data can help non-experts understand the pattern

Dedalo: interpreting patterns with Linked Data

“Researchers working on projects led by someone interested in Semantic Web”

Dedalo: the bottleneck

• Explanations are atomic.

“People working with Enrico Motta” F1=66%

“People interested in Semantic Web” F1=66%

“People interested in Ontologies” F1=66%

• We want to aggregate them to improve the explanation of the cluster.

“People working with Enrico Motta OR interested in Semantic Web” F1=93%

“People interested in Semantic Web AND Ontologies” F1=86%
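To make the effect of aggregation concrete, here is a minimal sketch of how Precision, Recall and F1 could be computed for a rule, its union (OR) and its intersection (AND) with another rule. It assumes each rule is represented by the set of items it covers; the function name `scores` and the toy data are invented for illustration and do not come from Dedalo.

```python
def scores(covered, cluster):
    """Return (precision, recall, F1) of a rule covering `covered` for the target `cluster`."""
    hits = len(covered & cluster)
    if hits == 0:
        return 0.0, 0.0, 0.0
    precision = hits / len(covered)
    recall = hits / len(cluster)
    return precision, recall, 2 * precision * recall / (precision + recall)


# Toy data, invented for illustration (not taken from the slides).
cluster = {"a", "b", "c", "d"}   # items of the cluster to explain
r1 = {"a", "b", "x"}             # items covered by a first atomic rule
r2 = {"c", "d", "x"}             # items covered by a second atomic rule

f1_r1 = scores(r1, cluster)[2]
f1_r2 = scores(r2, cluster)[2]
f1_union = scores(r1 | r2, cluster)[2]          # "r1 OR r2"
f1_intersection = scores(r1 & r2, cluster)[2]   # "r1 AND r2"
```

In this toy case the union scores higher than either atomic rule, while the intersection covers no cluster item at all, which is why each candidate combination has to be evaluated (or predicted) individually.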

Rule aggregation state of the art

• Pre-production of patterns

• Machine Learning techniques

• Artificial Neural Networks (ANN) for features reduction

• Patterns post-processing

• Interestingness measures (IR, Statistics…)

• Semantic knowledge (ontologies, taxonomies)

Proposition:

use ANNs for post-processing patterns (i.e. for aggregating Linked Data rules)

Using ANNs to combine rules

• We want to know if two rules r1 and r2 are worth combining

• There must be a relationship between features of two rules that can help in deciding their combination

• We know their F-score, hence Precision and Recall

[Figure: network diagram with inputs …(r1), …(r2) and outputs UNION(r1,r2), INTERSECTION(r1,r2)]

Model training and testing

• Model configuration

• Model: Feedforward multilayer perceptron

• Neurons structure: 9 – 12 – 2

• Inputs: P(r1), P(r2), R(r1), R(r2), F1(r1), F1(r2), and their absolute differences

• Training and test set

• 30,000 automatically labeled combinations (unions and intersections) for training

• Boolean label if the F1 of the combination has increased

• 30,000 combinations for testing

• Error rate: 0.24 (MSE rate)
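The input layer described above can be sketched as follows. This is only an illustration of how the 9 inputs and the Boolean label might be built: the helper names `features` and `label` are hypothetical, and comparing the combined F1 against the better of the two original rules is an assumption, since the slides do not say what "increased" is measured against.

```python
def f1_score(p, r):
    """Harmonic mean of precision and recall (0 when both are 0)."""
    return 2 * p * r / (p + r) if p + r else 0.0


def features(p1, r1, p2, r2):
    """Build the 9 inputs: P, R and F1 of both rules plus their absolute differences."""
    fa, fb = f1_score(p1, r1), f1_score(p2, r2)
    return [p1, p2, r1, r2, fa, fb,
            abs(p1 - p2), abs(r1 - r2), abs(fa - fb)]


def label(f1_a, f1_b, f1_combined):
    """Boolean label. Assumption: 'increased' means better than the best original rule."""
    return f1_combined > max(f1_a, f1_b)


x = features(0.5, 0.5, 1.0, 0.5)   # one input vector for the 9-neuron input layer
```

One such vector would be produced per candidate pair and per combination type; the network itself is not reproduced here.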

Predicting combinations

• A process to combine rules

• a large set of ranked rules

• the nnet learned model

• a prediction indicator p(r1,r2) = nnet(r1,r2) * max(f(r1), f(r2))

• Start from the top(H) rule

• predict p(top(H),ri) for each rule in H

• combine rules if p(top(H),ri) above a given threshold

• add the new rule
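One iteration of the process above might look like the sketch below. It assumes rules are `(name, f1)` tuples and that the learned model is available as a callable returning a score between 0 and 1; `combine_top` and `stub_nnet` are hypothetical names, and the stub merely stands in for the trained network.

```python
def combine_top(rules, nnet, threshold):
    """Pair the top-ranked rule with every other rule and keep the promising pairs.

    rules: list of (name, f1) tuples; nnet: callable giving a score in [0, 1].
    Returns a list of (top_name, other_name, p) for pairs with p above the threshold.
    """
    ranked = sorted(rules, key=lambda rule: rule[1], reverse=True)
    top = ranked[0]
    selected = []
    for other in ranked[1:]:
        # Prediction indicator: p(r1, r2) = nnet(r1, r2) * max(f(r1), f(r2))
        p = nnet(top, other) * max(top[1], other[1])
        if p > threshold:
            selected.append((top[0], other[0], p))
    return selected


def stub_nnet(a, b):
    """Placeholder for the trained model (assumption, for illustration only)."""
    return 0.9 if abs(a[1] - b[1]) < 0.1 else 0.1


rules = [("r1", 0.66), ("r2", 0.60), ("r3", 0.20)]
pairs = combine_top(rules, stub_nnet, threshold=0.5)
```

The sketch stops at selecting pairs; actually building the combined rule, scoring it and adding it back to H is left out.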

Experiments - datasets

• KMiA – authors clustered according to the papers written together

• KMiP – papers clustered according to the abstracts words

• Huds – students clustered according to the books they borrowed

Data    Rules    RAM   Time (sec)   Initial top(H)   Ending top(H)
KMiA1   369      4G    60           71.1%            86.3%
KMiA2   511      4G    60           60.6%            63.9%
KMiP1   747      4G    75           54.9%            63.9%
KMiP2   1746     4G    160          30.6%            84.1%
Huds1   11,937   10G   2,500        20.2%            66.9%
Huds2   11,151   10G   3,000        13.3%            67.3%

Experiments - strategies

Compare the NNET process with other strategies

Strategy   Type       Description
Random     baseline   combine a random rule with the top(H)
AllComb    baseline   combine everything in H with everything in H
Top100     naïve      combine the first 100 rules in H only
First      naïve      always combine the top(H) with H
Delta      -          combine all rules above a threshold
NNET       -          combine any pair predicted by the network
NNET50     -          combine if the prediction is higher than 50% of the highest score at the current iteration
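The NNET50 criterion can be illustrated with a short filter. This is a sketch under the assumption that "the highest score" means the highest prediction obtained in the current iteration; the function name `nnet50_filter` and the sample predictions are invented.

```python
def nnet50_filter(predictions, ratio=0.5):
    """Keep the pairs whose prediction exceeds `ratio` times the best prediction."""
    if not predictions:
        return {}
    cutoff = ratio * max(predictions.values())
    return {pair: p for pair, p in predictions.items() if p > cutoff}


# Invented predictions for three candidate pairs in one iteration.
predictions = {("r1", "r2"): 0.8, ("r1", "r3"): 0.5, ("r1", "r4"): 0.3}
kept = nnet50_filter(predictions)   # cutoff is 0.5 * 0.8 = 0.4
```

Because the cutoff is relative rather than fixed, it adapts to each iteration's best prediction.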

Experiments - results

[Figure: Huds1 example. One curve per strategy (Random, AllComb, Top100, First, Delta, NNET, NNET50); vertical axis from 0.2 to 0.7, horizontal axis from 0 to 2500.]

Experiments - results

[Figure: five panels, one each for the KMiA1, KMiA2, KMiP1, KMiP2 and Huds2 examples.]

Experiments - performance

• Comparing performances

Strategy        Speed   Accuracy   Scalability
Random          ++      -          ++
AllComb         +       ++         --
Top100          +       --         +
First           +       --         --
Delta           +       --         --
NNET / NNET50   ++      ++         ++

Conclusions and future work

• An approach to predict rule combination based on Artificial Neural Networks

• Model trained on information about the rules (Precision, Recall, F-score)

• Saves time and computational cost compared with the other strategies

• Evaluating Dedalo on Google Trends: why is a trend popular according to Linked Data?

http://linkedu.eu/dedalo/eval/

ilaria.tiddi@open.ac.uk

@IlaTiddi

THANKS FOR YOUR ATTENTION

Questions?
