Natural Language Interfaces to Ontologies
LarKC PhD Symposium, Beijing, 14 November 2010
Danica Damljanović, University of Sheffield
What are Natural Language Interfaces to Ontologies?
[Diagram labels: Customisation; Ontology editing (e.g. using Protege); Domain lexicon; Domain knowledge; WordNet; Domain expert; Ontology engineer; NLI for querying; NLI for Ontology authoring; …]
The Objective
• Increase usability of Natural Language Interfaces to ontologies
  – For end users: increase precision and recall
  – For application developers: decrease the time for customisation
Previous Work: QuestIO
[Figure; surviving labels: 1.15, 1.19, compare]
But...
• Ontologies are not perfect:
  – ontology lexicalisations often missing or too many
  – ranking based on ontology structure might be misleading
• Encouraging users to use keywords might be misleading
• User evaluation:
  – defined tasks: user satisfaction reaching 90%
  – undefined tasks: user satisfaction low (~44%)
FREyA - Feedback, Refinement, Extended Vocabulary Aggregator
• Feedback: showing the user the system's interpretation of the query
• Refinement:
  – resolving ambiguity: generating a dialog whenever one term refers to more than one concept in the ontology (precision)
• Extended Vocabulary:
  – expressiveness: generating a dialog whenever an “unknown” term appears in the question (recall)
  – portability: no need for customisation from application developers
• The dialog:
  – generated by combining the syntactic parsing and ontology-based lookup
  – learns from the user’s selections
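The rule for when a dialog is generated can be illustrated with a minimal Python sketch. This is not FREyA's implementation: the toy `LEXICON`, the function name `plan_dialogs`, and the dialog labels `"disambiguate"` and `"map_unknown"` are all hypothetical, chosen only to show the two triggers named above (a term with several candidate concepts, and an unknown term).

```python
# Hypothetical toy lexicon: surface term -> ontology concepts it may denote.
# The concept names follow the geo: examples used later in the talk.
LEXICON = {
    "new york": ["geo:State", "geo:City"],
    "state": ["geo:State"],
}


def plan_dialogs(terms, lexicon=LEXICON):
    """Split terms into directly resolved ones and ones that need a dialog.

    Returns (resolved, dialogs):
      resolved: term -> the single concept it maps to
      dialogs:  list of (kind, term, candidates) tuples, where kind is
                "disambiguate" (several candidates) or "map_unknown" (none)
    """
    resolved, dialogs = {}, []
    for term in terms:
        candidates = lexicon.get(term, [])
        if len(candidates) == 1:
            resolved[term] = candidates[0]
        elif len(candidates) > 1:
            # Ambiguous term: ask the user to pick one concept.
            dialogs.append(("disambiguate", term, candidates))
        else:
            # Unknown term: ask the user to map it to a concept.
            dialogs.append(("map_unknown", term, []))
    return resolved, dialogs


resolved, dialogs = plan_dialogs(["state", "new york", "largest"])
```

In this sketch "state" resolves silently, "new york" triggers a disambiguation dialog, and "largest" triggers a vocabulary-extension dialog.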
FREyA Workflow
• Potential Ontology Concept (POC)
• Ontology Concept (OC)
[Workflow diagram labels: NL query; Identify the Answer Type; Answer Type; Find Potential Ontology Concepts; POCs; OCs; triples; SPARQL; answer; learn]
CNL 2010, Marettimo, Sicily
Finding Ontology Concepts
[Diagram labels: POCs "new york" and "population"; OCs geo:City, geo:State, geo:cityPopulation]
Mapping POC to OCs: Ambiguities
[Diagram label: geo:State]
New York is a city
New York is a state
Ambiguous Lexicon
IF                            THEN
POC          OC (context)     candidate OC           function
new york                      geo:State              -
new york                      geo:City               -
population   geo:State        geo:statePopulation    -
population   geo:City         geo:cityPopulation     -
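The IF/THEN reading of the lexicon can be sketched as a lookup keyed on the pair (POC, context OC). The Python below is an illustrative assumption, not FREyA's data structure: the dictionary layout and the name `lookup` are hypothetical, and only the two "population" rows from the table are used.

```python
# Hypothetical layout: (POC, context OC) -> (candidate OC, function).
# A function of None stands for the "-" entries in the table.
LEXICON = {
    ("population", "geo:State"): ("geo:statePopulation", None),
    ("population", "geo:City"): ("geo:cityPopulation", None),
}


def lookup(poc, context, lexicon=LEXICON):
    """Return (candidate OC, function) for a POC in a given context,
    or None when the pair has not been seen before."""
    return lexicon.get((poc.lower(), context))


city_case = lookup("population", "geo:City")
unseen_case = lookup("population", "geo:River")  # geo:River is a made-up context
```

The same word "population" yields a different property depending on the context concept, which is why the context is part of the key.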
The User Controls the Output
[Diagram labels: POCs "state", "area", "point"; OCs geo:State, geo:stateArea, geo:isLowestPointOf, geo:LoPoint, geo:loElevation; functions max, min]
WHAT IS THE LOWEST POINT OF THE STATE WITH THE LARGEST AREA?
TRIPLES:
  ?firstJoker - geo:isLowestPointOf - geo:State
  geo:State - (max) geo:stateArea - ?lastJoker
SPARQL:
  prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
  prefix xsd: <http://www.w3.org/2001/XMLSchema#>
  select ?firstJoker ?p0 ?c1 ?p2 ?lastJoker
  where {
    {
      { ?c1 ?p0 ?firstJoker } UNION { ?firstJoker ?p0 ?c1 } .
      filter (?p0=<http://www.mooney.net/geo#isLowestPointOf>) .
    }
    ?c1 rdf:type <http://www.mooney.net/geo#State> .
    ?c1 ?p2 ?lastJoker .
    filter (?p2=<http://www.mooney.net/geo#stateArea>) .
  }
  ORDER BY DESC(xsd:double(?lastJoker))
WHAT IS THE LOWEST POINT OF THE STATE WITH THE LARGEST AREA?
TRIPLES:
  ?firstJoker - (min) geo:loElevation - geo:LoPoint
  geo:LoPoint - ?joker3 - geo:State
  geo:State - (max) geo:stateArea - ?lastJoker
SPARQL:
  prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
  prefix xsd: <http://www.w3.org/2001/XMLSchema#>
  select ?firstJoker ?p0 ?c1 ?joker3 ?c2 ?p3 ?lastJoker
  where {
    ?c1 ?p0 ?firstJoker .
    filter (?p0=<http://www.mooney.net/geo#loElevation>) .
    ?c1 rdf:type <http://www.mooney.net/geo#LoPoint> .
    {{ ?c2 ?joker3 ?c1 } UNION { ?c1 ?joker3 ?c2 }}
    ?c2 rdf:type <http://www.mooney.net/geo#State> .
    ?c2 ?p3 ?lastJoker .
    filter (?p3=<http://www.mooney.net/geo#stateArea>) .
  }
  ORDER BY ASC(xsd:double(?firstJoker)) DESC(xsd:double(?lastJoker))
The answer for both is Death Valley.
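In both queries the min and max functions attached to a triple surface as ASC and DESC keys in the ORDER BY clause. A minimal Python sketch of that mapping follows; the function name `order_by_clause` and its input format are hypothetical, and the sketch only rebuilds the clause text seen in the queries above, not the full query generation.

```python
# Assumed mapping, read off the two example queries:
# "min" sorts ascending, "max" sorts descending.
DIRECTION = {"min": "ASC", "max": "DESC"}


def order_by_clause(functions):
    """Build an ORDER BY clause from (function, variable) pairs.

    functions: list of ("min" | "max", "?variable") tuples, in sort priority
    order. Returns an empty string when no function is attached.
    """
    keys = [f"{DIRECTION[fn]}(xsd:double({var}))" for fn, var in functions]
    return "ORDER BY " + " ".join(keys) if keys else ""


clause = order_by_clause([("min", "?firstJoker"), ("max", "?lastJoker")])
```

With the two pairs shown, the sketch reproduces the ORDER BY line of the second query.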
New Lexicon
IF                              THEN
POC       OC (context)          candidate OC       function
area      geo:State             geo:stateArea      -
largest   geo:stateArea         geo:stateArea      max
point     geo:State             geo:LoPoint        -
lowest    geo:LoPoint           geo:loElevation    min
lowest    geo:isLowestPointOf   -                  -
Learning
ESWC 2010
FREyA: a Natural Language Interface to Ontologies
03 June 2010
http://gate.ac.uk/freya
Evaluation: correctness
Mooney GeoQuery dataset, 250 questions
34 questions needed no dialog; 14 failed to be answered
Precision = recall = 94.4%
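The 94.4% figure is consistent with the counts given, under the assumption (not stated on the slide) that the 14 failed questions are the only ones counted as wrong and that both measures are taken over all 250 questions. A short Python check of that arithmetic, with a hypothetical helper name:

```python
def correct_ratio(total_questions, failed_questions):
    """Share of questions answered correctly, assuming every question
    that did not fail was answered correctly."""
    return (total_questions - failed_questions) / total_questions


# 250 questions, 14 failed -> 236 correct
ratio_percent = round(correct_ratio(250, 14) * 100, 1)
```

Under that assumption 236 of 250 questions are correct, which rounds to the reported 94.4%.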
Evaluation: Learning
10-fold cross-validation on the 202 Mooney GeoQuery questions that could be correctly mapped into SPARQL and required dialog: an increase from 0.25 to 0.48
Errors: ambiguity and sparseness
Evaluation: Ranking
Mean Reciprocal Rank: 0.76 (default ranking based on string similarity and synonym detection)
Learning the Correct Ranking
Randomly selected 103 dialogs from 202 questions (343 dialogs)
MRR increased by 0.06, from 0.72 to 0.78
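For reference, Mean Reciprocal Rank averages 1/rank of the correct suggestion over all dialogs. The Python sketch below uses the standard definition; the function name, the input format, and the convention that a missing correct suggestion contributes 0 are assumptions, and the sample ranks are made up rather than taken from the evaluation.

```python
def mean_reciprocal_rank(ranks):
    """Standard MRR over a list of 1-based ranks.

    ranks: position of the correct suggestion in each dialog's ranked list,
    or None when the correct suggestion was not in the list (counts as 0).
    """
    if not ranks:
        raise ValueError("at least one dialog is required")
    return sum(1.0 / r if r else 0.0 for r in ranks) / len(ranks)


# Made-up example: correct option ranked 1st, 2nd and 4th in three dialogs.
example_mrr = mean_reciprocal_rank([1, 2, 4])
```

An MRR of 1.0 would mean the correct option is always ranked first, so a higher value means the user has to look less far down the list of suggestions.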
Evaluation: Answer Type
[Chart "Answer Type"; legend: Correct (1 dialog), Correct (no dialog), Incorrect; values shown: 45.60%, 53.20%, 0.01%]
Evaluation: Customisation
• Small empirical evaluation with 1 subject who is not familiar with ontologies and NLP
• No training, short introduction into the domain
• 17 questions asked in total; 3 were cancelled by the user during one of the dialogs
• 78.57% correctly answered (11 of the remaining 14)
• 21.43% failed or incorrectly answered (3 of the remaining 14)
Conclusion
• Combining syntactic parsing with ontology-based lookup through user interaction can increase the precision and recall of NLIs to ontologies,
• while reducing the time for customisation by shifting it from application developers to end users.
Next steps
• Improvement of the learning model to avoid errors due to ambiguities
  – point > geo:HiPoint or geo:LoPoint
• Using lexicon to improve other systems
More information...
• D. Damljanovic, M. Agatonovic, H. Cunningham: Natural Language Interfaces to Ontologies: Combining Syntactic Analysis and Ontology-based Lookup through the User Interaction. In Proceedings of the 7th Extended Semantic Web Conference (ESWC 2010), Springer Verlag, Heraklion, Greece, May 31-June 3, 2010.
• D. Damljanovic, M. Agatonovic, H. Cunningham: Identification of the Question Focus: Combining Syntactic Analysis and Ontology-based Lookup through the User Interaction. In Proceedings of the 7th Language Resources and Evaluation Conference (LREC 2010), ELRA 2010, La Valletta, Malta, May 17-23, 2010.
• D. Damljanovic: Towards portable controlled natural languages for querying ontologies. In Rosner, M., Fuchs, N., eds.: Proceedings of the 2nd Workshop on Controlled Natural Language. Lecture Notes in Computer Science. Springer Berlin/Heidelberg, Marettimo Island, Sicily (September 2010)