AAAI 2015 - IE with a Cognitive Agent 1
Ontology-based Information Extraction with a Cognitive Agent
Peter Lindes¹, Deryle Lonsdale, David Embley
Brigham Young University
AAAI 2015
¹ Now at University of Michigan
© 2015 Peter Lindes
1/22/2015
The Problem
Goals and Strategies
• OntoSoar project goals
  – Extract genealogy facts from family history books
  – Project extracted information onto a conceptual model to populate a searchable database
• Strategies
  – Use ideas from Embodied Construction Grammar
  – Use the Soar cognitive architecture
  – Integrate several levels of knowledge
• Long-term goals
  – Build computational models of human language processing
  – Apply these models to real-world applications
Example 1
A Simple Ontology
[Figure: ontology diagram with the relationships "has", "born on", and "died on", populated with Charles Christopher Lathrop, 1817, and 1865]
Example 2
A More Complex Ontology
[Figure: ontology diagram populated with Myra Harwood, Jonathan Squires, J. Wilbur Squires, and the date Feb. 13, 1874]
The Solution
Thus, intelligence is the ability to bring to bear all the knowledge that one has in service of one’s goals.
Newell (1990), p. 90
[Figure: kinds of knowledge shown: Page Layout, Text Analysis, Syntax, Semantics, Pragmatics, World Knowledge, Conceptual Models]
OntoSoar Architecture
[Figure: OntoSoar architecture diagram]
Processing components: PDF Tools, Segmenter, LG Parser, Meaning Builder, Conceptual Semantic Analyzer, Mapper
Knowledge sources: Segment Rules (37), Link Grammar, Grammar Constructions (16), Inference Rules
Intermediate results: Text Segments, Linkages, Meaning Schemas, Knowledge Structures, Facts
Ontologies: User Ontology (OSMX), Populated User Ontology (OSMX)
Also shown: Soar, OntoES Tool Set
(A total of 260 Soar productions)
Construction Grammar
LIFE-EVENT construction
  form: REF-EXPR LE-VERB DATE
  meaning: Person, LifeEvent, Date
SON-OF construction
  form: REF-EXPR "son of" REF-EXPR
  meaning: Person, SonOf, Person
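A construction pairs a form (a sequence of constituents) with a meaning (the schemas it evokes). The following Python sketch shows one way such pairings could be written down and matched. It is only an illustration and not how OntoSoar does it, since OntoSoar is built from Soar productions. The `Construction` class, the `match` function, and the pre-labelled input are all hypothetical. Only the labels REF-EXPR, LE-VERB, DATE, LIFE-EVENT, and SON-OF come from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Construction:
    name: str       # e.g. "LIFE-EVENT"
    form: tuple     # sequence of constituent labels to match
    meaning: tuple  # schema names evoked by the construction


# The two constructions named on the slide. Treating "son of" as a single
# literal constituent is an assumption of this sketch.
LIFE_EVENT = Construction("LIFE-EVENT", ("REF-EXPR", "LE-VERB", "DATE"),
                          ("Person", "LifeEvent", "Date"))
SON_OF = Construction("SON-OF", ("REF-EXPR", "son of", "REF-EXPR"),
                      ("Person", "SonOf", "Person"))


def match(construction, labelled):
    """Return every contiguous span of `labelled` whose labels equal the form.

    `labelled` is a list of (label, text) pairs from some earlier,
    hypothetical labelling step.
    """
    n = len(construction.form)
    hits = []
    for i in range(len(labelled) - n + 1):
        window = labelled[i:i + n]
        if tuple(label for label, _ in window) == construction.form:
            hits.append(tuple(text for _, text in window))
    return hits


# A simplified fragment of Example 1 (the place name is left out).
labelled = [
    ("REF-EXPR", "Charles Christopher Lathrop"),
    ("LE-VERB", "b."),
    ("DATE", "1817"),
]
print(match(LIFE_EVENT, labelled))
```

Real text is less tidy than this fragment: in Example 1 the name and "b. 1817" are separated by "N. Y. City", so a strictly contiguous matcher like this one would not be enough on its own.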
Applying Constructions
Charles Christopher Lathrop, N. Y. City, b. 1817, d. 1865, son of Mary Ely and Gerard Lathrop ;
… More Constructions
Charles Christopher Lathrop, N. Y. City, b. 1817, d. 1865, son of Mary Ely and Gerard Lathrop ;
[Figure: the sentence annotated with three Name/Person pairs and two SonOf relations]
Building Knowledge
Charles Christopher Lathrop, N. Y. City, b. 1817, d. 1865, son of Mary Ely and Gerard Lathrop ;
Person  gender: M  name: "Gerard Lathrop"  birth:  death:
Person  gender: M  name: "Charles C. Lathrop"  birth: "1817"  death: "1865"
Person  gender: F  name: "Mary Ely"  birth:  death:
Couple  married:
  husband: Gerard Lathrop
  wife: Mary Ely
  child: Charles C. Lathrop (the Couple are his parents)
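The structures above can be approximated with a few plain records. This is a minimal sketch: the `Person` and `Couple` classes and their field names are assumptions chosen to mirror the slide, not OntoSoar's internal representation. Dates are kept as text, as they appear in the book.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Person:
    name: str
    gender: Optional[str] = None  # "M" or "F" when known
    birth: Optional[str] = None   # left empty when the text gives no date
    death: Optional[str] = None


@dataclass
class Couple:
    husband: Person
    wife: Person
    married: Optional[str] = None  # no marriage date is given in Example 1
    children: List[Person] = field(default_factory=list)


gerard = Person("Gerard Lathrop", gender="M")
mary = Person("Mary Ely", gender="F")
charles = Person("Charles C. Lathrop", gender="M", birth="1817", death="1865")

parents = Couple(husband=gerard, wife=mary)
parents.children.append(charles)

for child in parents.children:
    print(f"{child.name} ({child.birth}-{child.death}), "
          f"child of {parents.wife.name} and {parents.husband.name}")
```

Hanging the child off the `Couple` rather than off each parent follows the slide, where the child/parents link attaches to the Couple.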
Knowledge Structures Compared
… his widow married JONATHAN SQUIRES, who was born in Ohio, July 25, 1823, by whom she had one son, J. Wilbur, born June 16, 1865,
Charles Christopher Lathrop, N. Y. City, b. 1817, d. 1865, son of Mary Ely and Gerard Lathrop ;
Results on Examples
[Figure: bar chart "Data Accuracy for Test Data". Horizontal axis: CCL, Myra, OD1 through OD12. Vertical axis: 0 to 120. Series: Persons, Births and Deaths, Marriages, Children.]
Results on The Ely Ancestry
Item Type    Instances Found
Persons               16,848
Births                 8,609
Deaths                 2,406
Genders                1,674
Couples                3,343
Children               3,049
Total                 35,929
a book of 830 pages, including our Example 1
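As a quick arithmetic check, the per-type counts do add up to the reported total. The short Python snippet below confirms this and prints each type's share of the extracted items. The dictionary simply restates the figures from the table; the shares are computed here and are not reported on the slide.

```python
# Counts copied from the results table for The Ely Ancestry.
counts = {
    "Persons": 16848,
    "Births": 8609,
    "Deaths": 2406,
    "Genders": 1674,
    "Couples": 3343,
    "Children": 3049,
}
reported_total = 35929

total = sum(counts.values())
print("Sum matches reported total:", total == reported_total)

# Share of each item type among all extracted items.
for item, n in counts.items():
    print(f"{item:<10}{n:>8,}{n / total:>8.1%}")
```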
Conclusions
Contributions
• Produces usable genealogy data from scanned books
• Does this using:
  – Integration of several levels of knowledge
  – An adaptation of Embodied Construction Grammar
  – A cognitive architecture (Soar)
Future Work
• Integrate parsing with semantics
• Develop a means to learn many new constructions
• Adapt to varying book styles
• Scale up to perform well on hundreds of thousands of books
It works! … and, it could work a lot better.