Complex Relations Extraction
Naveed Afzal
Supervisor: Dr Mark Stevenson
MSc Advanced Software Engineering
Introduction
Information Extraction (IE) is the process of deriving useful information from unstructured text documents. Much of the recent research in IE has focused on named entity identification and binary relation extraction.
This project investigates the problem of complex relations extraction. Complex relations are n-ary (n > 2) relations between n entities.
The aim of complex relations extraction is to identify all the instances of interest in some piece of text, including incomplete sentences.
System’s Overview
MUC-6 Data (Soderland version)
Data Pre-Processing
Factorising into Binary Relations
POS Tagging
Labelling of Relations
Classifiers Training
Building of Graph
Rebuilding of Complex Relations
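The factorisation stage listed above can be illustrated with a minimal sketch. It assumes a complex relation instance is stored as a tuple of entity strings, with `None` marking an unfilled slot; the function name `factorise` and the example entities are hypothetical and are not taken from the project or from the MUC-6 data.

```python
from itertools import combinations


def factorise(relation):
    """Return every pair of entities that co-occur in one n-ary relation instance."""
    # Drop empty slots (None) so that incomplete instances still yield pairs.
    entities = [entity for entity in relation if entity is not None]
    # An instance with k filled slots produces k * (k - 1) / 2 binary relations.
    return list(combinations(entities, 2))


# Hypothetical 4-ary instance (illustrative values only).
instance = ("John Smith", "Jane Doe", "CEO", "Acme Corp")
pairs = factorise(instance)

# Hypothetical incomplete instance with one slot missing.
partial_pairs = factorise(("John Smith", None, "CEO"))
```

Each resulting pair can then be labelled as related or unrelated and used as a training example for a binary classifier.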
System’s Overview
Factorise complex relations into binary relations.
Train a binary classifier for relatedness.
Build a graph among related entities.
Rebuild complex relations from that graph by finding maximal cliques.
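The graph-building and rebuilding steps can be sketched as follows. This is only an illustration: it assumes the binary classifier has already produced a list of entity pairs judged to be related, and it uses the basic Bron–Kerbosch algorithm to enumerate maximal cliques, which is one standard choice and not necessarily the one used in the project. The names `build_graph` and `maximal_cliques` and the entity labels are hypothetical.

```python
from collections import defaultdict


def build_graph(related_pairs):
    """Build an undirected graph with an edge for each pair classified as related."""
    graph = defaultdict(set)
    for a, b in related_pairs:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def maximal_cliques(graph):
    """Enumerate all maximal cliques with the basic Bron-Kerbosch algorithm."""
    cliques = []

    def expand(current, candidates, excluded):
        # No vertex left that could extend the clique: it is maximal.
        if not candidates and not excluded:
            cliques.append(current)
            return
        for vertex in list(candidates):
            neighbours = graph[vertex]
            expand(current | {vertex}, candidates & neighbours, excluded & neighbours)
            # Mark the vertex as processed so the same clique is not reported twice.
            candidates = candidates - {vertex}
            excluded = excluded | {vertex}

    expand(frozenset(), set(graph), set())
    return cliques


# Hypothetical classifier output: pairs of entities judged to be related.
related = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]
cliques = maximal_cliques(build_graph(related))
```

In this example the entities A, B and C are all pairwise related, so they form one candidate complex relation, while C and D form a second, smaller one.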
Classifiers Result
[Figure: bar chart comparing training data accuracy and testing data accuracy for the Decision Tree, Maximum Entropy and Naïve Bayes classifiers; vertical axis from 0 to 1.]
Experimental Results of Relations Classification
[Figure: bar chart of precision, recall and F-score for the Decision Tree, Maximum Entropy and Naïve Bayes classifiers; vertical axis from 0 to 1.2.]
Experimental Results of Events Classification
[Figure: bar chart of precision, recall and F-score for the Decision Tree, Maximum Entropy and Naïve Bayes classifiers; vertical axis from 0 to 0.9.]
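For reference, the three measures reported in these results can be computed as in the sketch below. It assumes predictions and gold-standard items are compared as sets and that the F-score is the balanced F1 (the harmonic mean of precision and recall); the source does not state which weighting was used. The function name and the sample values are hypothetical.

```python
def precision_recall_f(predicted, gold):
    """Return (precision, recall, F1) for a set of predictions against a gold standard."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # Balanced F-score: harmonic mean of precision and recall (an assumption here).
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score


# Hypothetical example: 4 predicted items, 3 gold items, 2 in common.
p, r, f = precision_recall_f({1, 2, 3, 4}, {1, 2, 5})
```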
Conclusion
This project has presented an approach to complex relations extraction in which complex relations are first factorised into binary relations, and different classifiers (Maximum Entropy, Naïve Bayes and Decision Tree) are then trained to identify those binary relations.
In the second phase, complex relations are reconstructed by finding maximal cliques in graphs that represent relations between pairs of entities.
The Decision Tree classifier outperforms both the Naïve Bayes and Maximum Entropy classifiers in terms of precision, recall and F-score. The results produced by the Naïve Bayes classifier are relatively poor compared with those of the Maximum Entropy and Decision Tree classifiers.
Future Work
This project has looked at a modified version of the MUC-6 data in which each event is completely described within a single sentence.
It would be worthwhile to investigate events described across multiple sentences; see Stevenson (2004).
Moreover, this approach could be improved by using deeper syntactic parsing and more powerful binary classifiers based on tree kernels (Zelenko et al., 2003).
At the moment the approach uses supervised learning algorithms, and it would be interesting to investigate how it performs with unsupervised learning algorithms.