Mining knowledge from natural language texts using fuzzy associated concept mapping
Intelligent Database Systems Lab
N.Y.U.S.T. I.M.
Mining knowledge from natural language texts using fuzzy associated concept mapping
Presenter : Wu, Jia-Hao
Authors : W.M. Wang, C.F. Cheung, W.B. Lee, S.K. Kwok
IPM (2008)
Outline
Motivation
Objective
Methodology
Experiments
Conclusion
Comments
Motivation
The amount of data of all kinds available electronically is increasing dramatically. In enterprises, about 80-98% of all data consists of unstructured or semi-structured documents.
Knowledge presented in many documents has an informal, unstructured shape. It has to be converted to a formal shape with precisely defined syntax and semantics (e.g., document annotations).
Objective
Extract the propositions in a text so as to construct a concept map automatically. The technique, Fuzzy Association Concept Mapping (FACM), consists of a linguistic module and a recommendation module.
Provides a method that a computer can carry out easily. Users can convert scientific and short texts into a structured format.
Provides knowledge workers with extra time to rethink their written text and to view their knowledge from another angle.
Objective (Cont.)
Methodology - FACM
The relations and concepts are generated from the document itself rather than retrieved from predefined ontologies. FACM uses the syntactic structure of the sentences to find relations between the words.
Anaphora resolution based on rule-based reasoning (RBR) and case-based reasoning (CBR) is applied to resolve ambiguities that arise during syntactic analysis. This makes anaphora resolution a dynamic method that is continually improved.
Methodology - Architecture of FACM
Step 1. Input the sentence.
Step 2. Parse it with a POS tagger.
Step 3. Encode the case.
Step 4. Produce the solution.
Methodology - FACM's anaphora resolution
The similarity between the new case and old cases is calculated based on nearest neighbor matching.
(1)
(2)
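Equations (1) and (2) are not reproduced here. As a rough illustration only, nearest neighbor matching in CBR is often written as a weighted average of per-feature similarities, sim(N, O) = sum_i w_i * sim_i(N_i, O_i) / sum_i w_i. The sketch below assumes that common form; the feature names, the weights, and the exact-match local similarity are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Optional

Case = Dict[str, str]


def local_similarity(a: Optional[str], b: Optional[str]) -> float:
    """Assumed per-feature similarity: 1.0 on an exact match, else 0.0."""
    if a is None or b is None:
        return 0.0
    return 1.0 if a == b else 0.0


def case_similarity(new_case: Case, old_case: Case, weights: Dict[str, float]) -> float:
    """Weighted average of per-feature similarities (assumed form)."""
    total_weight = sum(weights.values())
    if total_weight == 0:
        return 0.0
    score = sum(
        w * local_similarity(new_case.get(feature), old_case.get(feature))
        for feature, w in weights.items()
    )
    return score / total_weight


def nearest_case(new_case: Case, case_base: List[Case], weights: Dict[str, float]) -> Case:
    """Return the stored case most similar to the new case."""
    return max(case_base, key=lambda old: case_similarity(new_case, old, weights))


# Hypothetical feature encoding of a pronoun and two stored cases.
weights = {"pos": 2.0, "gender": 1.0, "number": 1.0}
new_case = {"pos": "PRP", "gender": "male", "number": "singular"}
old_a = {"pos": "PRP", "gender": "female", "number": "singular"}
old_b = {"pos": "NN", "gender": "male", "number": "singular"}
best = nearest_case(new_case, [old_a, old_b], weights)
```

In a CBR loop, the solution attached to the best-matching old case would be reused for the new case, and confirmed solutions would be stored back into the case base.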
Methodology - Proposition recommendation
The normalized frequency of concept i and concept j co-existing in the same or adjacent sentence is calculated:
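The normalization formula is not reproduced here. The sketch below is one plausible reading, assuming that raw co-occurrence counts are divided by the largest count observed; the substring-based concept matching and the function names are hypothetical simplifications, not the paper's procedure.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, List, Tuple

Pair = Tuple[str, str]


def cooccurrence_counts(sentences: List[str], concepts: List[str]):
    """Count concept pairs found in the same sentence and in adjacent sentences."""
    # Concepts are assumed lowercase; matching is a naive substring test.
    present = [{c for c in concepts if c in s.lower()} for s in sentences]
    same: Counter = Counter()
    adjacent: Counter = Counter()
    for found in present:
        for pair in combinations(sorted(found), 2):
            same[pair] += 1
    for prev, nxt in zip(present, present[1:]):
        # Use a set so each pair counts once per adjacent sentence pair.
        pairs = {tuple(sorted((a, b))) for a in prev for b in nxt if a != b}
        for pair in pairs:
            adjacent[pair] += 1
    return same, adjacent


def normalize(counts: Counter) -> Dict[Pair, float]:
    """Scale counts into [0, 1] by the largest count (assumed normalization)."""
    if not counts:
        return {}
    peak = max(counts.values())
    return {pair: n / peak for pair, n in counts.items()}


sentences = [
    "Fuzzy logic supports concept maps.",
    "Concept maps organize knowledge.",
    "Knowledge is mined from text.",
]
concepts = ["fuzzy logic", "concept maps", "knowledge"]
same, adjacent = cooccurrence_counts(sentences, concepts)
same_freq = normalize(same)
adjacent_freq = normalize(adjacent)
```

The two normalized tables correspond to the two kinds of evidence named on the slide: co-occurrence in the same sentence and co-occurrence in adjacent sentences.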
Methodology - The relationship between concepts
IF the normalized frequency of two concepts co-existing in the same sentence is High, THEN the relationship between the two concepts is High (0.7).
IF the normalized frequency of two concepts co-existing in adjacent sentences is High, THEN the relationship between the two concepts is Medium (0.2).
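The center of gravity (COG) of fuzzy set A on the interval [a1, a2] with membership function μ_A is given by:
$\mathrm{COG}(A) = \dfrac{\int_{a_1}^{a_2} x\,\mu_A(x)\,dx}{\int_{a_1}^{a_2} \mu_A(x)\,dx}$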
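A center of gravity of the standard form (the integral of x·μ_A(x) divided by the integral of μ_A(x)) can be approximated numerically. The sketch below uses the trapezoidal rule; the triangular membership function and its endpoints are assumptions chosen for illustration, since the membership functions used in the paper are not given here.

```python
from typing import Callable


def triangular(left: float, peak: float, right: float) -> Callable[[float], float]:
    """Build a triangular membership function (assumed shape)."""
    def mu(x: float) -> float:
        if x <= left or x >= right:
            return 0.0
        if x <= peak:
            return (x - left) / (peak - left)
        return (right - x) / (right - peak)
    return mu


def center_of_gravity(mu: Callable[[float], float], a1: float, a2: float,
                      steps: int = 10_000) -> float:
    """Approximate COG on [a1, a2] with the trapezoidal rule."""
    h = (a2 - a1) / steps
    numerator = 0.0
    denominator = 0.0
    for k in range(steps + 1):
        x = a1 + k * h
        weight = 0.5 if k in (0, steps) else 1.0  # trapezoid end-point weights
        numerator += weight * x * mu(x)
        denominator += weight * mu(x)
    if denominator == 0.0:
        raise ValueError("membership function is zero on the whole interval")
    return numerator / denominator


# Hypothetical fuzzy set centered on 0.7, spanning 0.4 to 1.0.
high = triangular(0.4, 0.7, 1.0)
cog_high = center_of_gravity(high, 0.4, 1.0)
```

For a symmetric membership function such as this one, the COG coincides with the peak; for an asymmetric or clipped set it shifts toward the side carrying more membership mass.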
Experiments - SCI abstracts & news from CNET
Experiments - Results of algorithm evaluation
Conclusion
Provides an interactive way for concept map builders to rethink their concept maps and to adapt and refine the suggestions for completing them.
A human-like construction of concept maps can be achieved. The method is highly accurate for extracting concepts from scientific and short texts such as abstract databases, newsgroups, emails, and discussion forums.
Future work: The system should be evaluated on bigger collections with more candidate users.
The evaluation of the interactive process of the framework is also an essential element.
Qualitative methods may be used to evaluate the effectiveness of the recommendation process.
Comments
Advantage: A convenient method for mining knowledge.
Drawback: It is unclear how the equations are used to produce the concept map.
Application: Analyzing abstracts.