Combining Contexts in Lexicon Learning for Semantic Parsing
May 25, 2007
NODALIDA 2007, Tartu, Estonia
Chris Biemann, University of Leipzig, Germany
Rainer Osswald, FernUniversität Hagen, Germany
Richard Socher, Saarland University, Germany
Outline
• Motivation: lexicon extension for semantic parsing
• The semantic lexicon HaGenLex
• Binary features and complex sorts
• Method: bootstrapping via syntactic contexts
• Results
• Discussion
Motivation
• Semantic parsing aims at finding a semantic representation for a sentence
• Semantic parsing requires semantic features of words as a prerequisite
• Semantic features are obtained by manually creating lexicon entries (expensive in terms of time and money)
• Given a certain number of manually created lexicon entries, it might be possible to train a classifier in order to find more entries
• The objective is high precision; recall is secondary
HaGenLex: Semantic Lexicon for German
Size: 22,700 entries, of which 13,000 are nouns and 6,700 are verbs

WORD              SEMANTIC CLASS (complex sort)
Aggressivität     nonment-dyn-abs-situation
Agonie            nonment-stat-abs-situation
Agrarprodukt      nat-discrete
Ägypter           human-object
Ahn               human-object
Ahndung           nonment-dyn-abs-situation
Ähnlichkeit       relation
Airbag            nonax-mov-art-discrete
Airbus            mov-nonanimate-con-potag
Airport           art-con-geogr
Ajatollah         human-object
Akademiker        human-object
Akademisierung    nonment-dyn-abs-situation
Akkordeon         nonax-mov-art-discrete
Akkreditierung    nonment-dyn-abs-situation
Akku              ax-mov-art-discrete
Akquisition       nonment-dyn-abs-situation
Akrobat           human-object
...               ...
Characteristics of complex sorts in HaGenLex
In total, 50 complex sorts for nouns are constructed from allowed combinations of:
• 16 semantic features (binary), e.g. HUMAN+, ARTIFICIAL-
• 17 sorts (binary), e.g. concrete, abstract-situation, ...
[Figure: the sort hierarchy and the semantic features are combined into complex sorts]
Application: WOCADI-Parser
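„Welche Bücher von Peter Jackson über Expertensysteme wurden bei Addison-Wesley seit 1985 veröffentlicht?“
("Which books by Peter Jackson about expert systems have been published by Addison-Wesley since 1985?")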
General Methodology
Distributional hypothesis projected onto syntactic-semantic contexts for nouns: nouns of similar complex sort are found in similar contexts.
We use three kinds of context elements, as assigned by the WOCADI parser, for training 33 binary classifiers:
• Adjective modifier
• Verb-subject (deep)
• Verb-object (deep)
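As a rough illustration of what "context elements per noun" could look like as data, here is a minimal Python sketch. The triple format, the type labels ("adj", "verb-subj", "verb-obj"), and the example words are assumptions made for illustration; they are not the actual WOCADI output format.

```python
from collections import defaultdict

# Hypothetical parser output: (context_type, context_word, noun) triples.
relations = [
    ("adj", "giftig", "Pflanze"),
    ("verb-subj", "wachsen", "Pflanze"),
    ("verb-obj", "lesen", "Buch"),
    ("adj", "giftig", "Schlange"),
]

def contexts_by_type(triples):
    """Group the context words of each noun, separately per context type."""
    table = defaultdict(lambda: defaultdict(set))
    for ctx_type, ctx_word, noun in triples:
        table[ctx_type][noun].add(ctx_word)
    return table

table = contexts_by_type(relations)
```

Keeping one table per context type mirrors the idea of training separate classifiers for adjective, verb-subject, and verb-object contexts.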
Data
Corpus:
• 3,068,945 sentences obtained from the Leipzig Corpora Collection
• parser coverage: 42%
• verb-deep-subject relations: 430,916
• verb-deep-object relations: 408,699
• adjective-noun relations: 450,184
Lexicon:
• 11,100 noun entries
• lexicon extension: 10-fold cross validation on known nouns
• Unknown nouns will also be classified
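The 10-fold cross validation on known nouns can be pictured with the following Python sketch. The function name, the shuffling seed, and the placeholder noun list are illustrative assumptions; the slides do not say how the folds were drawn.

```python
import random

def cross_validation_splits(nouns, k=10, seed=0):
    """Yield (training, held_out) noun lists for k-fold cross validation."""
    items = sorted(nouns)
    random.Random(seed).shuffle(items)  # fixed seed for reproducible folds
    folds = [items[i::k] for i in range(k)]
    for i, held_out in enumerate(folds):
        training = [n for j, fold in enumerate(folds) if j != i for n in fold]
        yield training, held_out

lexicon_nouns = [f"noun{i}" for i in range(25)]  # placeholder names
splits = list(cross_validation_splits(lexicon_nouns))
```

In each split, the held-out nouns are treated as if they were unknown, and their predicted classes are compared with the lexicon entries.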
Bootstrapping Mechanism
Algorithm:
Initialize the training set;
As long as new nouns get classified {
    calculate class probabilities for each context element;
    for all yet unclassified nouns n {
        multiply class probabilities of context elements class-wise;
        assign the class with the highest probability to noun n;
    }
}
Class probabilities per context element:
a) count the number of co-occurring nouns per class
b) normalize by the total number of nouns per class
c) normalize to row sum = 1
A threshold regulates the minimum number of different context elements a noun must co-occur with in order to be classified.
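The bootstrapping loop can be sketched in Python for a single binary class as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the toy nouns and adjectives, the choice to count only context elements that already have class probabilities towards the threshold, and the rule of leaving a noun unclassified when the class scores tie are all assumptions.

```python
from collections import defaultdict

def class_probs(labeled, contexts, classes):
    """Steps a) to c): count nouns per class, divide by class size, normalize rows."""
    class_size = {c: sum(1 for v in labeled.values() if v == c) for c in classes}
    counts = defaultdict(lambda: dict.fromkeys(classes, 0))
    for noun, label in labeled.items():
        for elem in contexts.get(noun, ()):
            counts[elem][label] += 1
    probs = {}
    for elem, row in counts.items():
        scaled = {c: row[c] / class_size[c] if class_size[c] else 0.0 for c in classes}
        total = sum(scaled.values())
        probs[elem] = {c: scaled[c] / total for c in classes}
    return probs

def bootstrap(seed, contexts, classes=("+", "-"), threshold=1):
    """Repeat classification rounds until no new noun gets a class."""
    labeled = dict(seed)
    changed = True
    while changed:
        probs = class_probs(labeled, contexts, classes)
        new = {}
        for noun, elems in contexts.items():
            if noun in labeled:
                continue
            known = [e for e in elems if e in probs]
            if len(known) < threshold:  # assumption: only scored elements count
                continue
            score = dict.fromkeys(classes, 1.0)
            for e in known:
                for c in classes:
                    score[c] *= probs[e][c]
            ranked = sorted(classes, key=score.get, reverse=True)
            if score[ranked[0]] > score[ranked[1]]:  # assumption: skip ties
                new[noun] = ranked[0]
        labeled.update(new)
        changed = bool(new)
    return labeled

# Toy data (hypothetical): "+" / "-" stand for one binary feature, e.g. ANIMAL.
seed = {"Hund": "+", "Katze": "+", "Tisch": "-", "Stuhl": "-"}
contexts = {
    "Hund": {"bellend", "hungrig"}, "Katze": {"hungrig", "schlafend"},
    "Tisch": {"schwer", "rund"}, "Stuhl": {"schwer", "bequem"},
    "Maus": {"hungrig", "klein"}, "Schrank": {"schwer", "klein"},
    "Ratte": {"klein"},
}
result = bootstrap(seed, contexts)
```

In this toy run, "Maus" and "Schrank" are classified in the first round; "Ratte" stays unclassified because its only context element, "klein", ends up equally likely for both classes. Raising `threshold` trades recall for precision, which matches the stated objective.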
From binary classes to complex sorts
• Binary classifiers for single features for all three context element types are combined into one feature assignment:
  – Lenient: voting
  – Strict: all classifiers for different context types agree
• Combining the outcome: safe choices
[Figure: the binary decisions for the 16 features (ANIMAL +/-, ANIMATE +/-, ARTIF +/-, AXIAL +/-, ...) and the 17 sorts (ab +/-, abs +/-, ad +/-, as +/-, ...) feed a selection step. Selection: compatible complex sorts that are minimal w.r.t. the hierarchy and unambiguous. Outcome: result class or reject.]
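The two combination modes and the final selection step can be sketched in Python as below. Several points are assumptions made for illustration: classifiers that abstain (`None`) are ignored in both modes, a tied vote counts as no decision, the toy feature definitions of the three complex sorts are invented, and the "minimal w.r.t. the hierarchy" criterion is left out entirely.

```python
from collections import Counter

def combine(votes, mode="strict"):
    """Combine '+', '-' or None decisions from the context-type classifiers."""
    cast = [v for v in votes if v is not None]
    if not cast:
        return None
    if mode == "strict":  # all classifiers that voted must agree
        return cast[0] if len(set(cast)) == 1 else None
    ranked = Counter(cast).most_common()  # lenient: majority vote
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None  # assumption: a tie yields no decision
    return ranked[0][0]

# Hypothetical feature definitions, for illustration only.
TOY_SORTS = {
    "human-object": {"HUMAN": "+", "ANIMAL": "-"},
    "animal-object": {"HUMAN": "-", "ANIMAL": "+"},
    "plant-object": {"HUMAN": "-", "ANIMAL": "-"},
}

def select_sort(assignment, complex_sorts):
    """Return the only complex sort compatible with all decided features, else None."""
    decided = {f: v for f, v in assignment.items() if v is not None}
    compatible = [
        name for name, feats in complex_sorts.items()
        if all(feats.get(f) == v for f, v in decided.items())
    ]
    return compatible[0] if len(compatible) == 1 else None  # None means reject

assignment = {
    "HUMAN": combine(["+", "+", None]),
    "ANIMAL": combine(["-", "-", "-"]),
}
chosen = select_sort(assignment, TOY_SORTS)
```

Rejecting whenever more than one complex sort remains compatible is what keeps the choice "safe": the noun is left out rather than assigned a guessed class.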
Results: binary classes for different context types
[Figure: results per context type for threshold = 5 and threshold = 1]
Most of the binary features are highly biased.
Combination of context types (threshold = 1)
Results for complex sorts
[Figure: results for the complex sorts with the highest training frequency, for threshold = 5 and threshold = 1]
Typical mistakes
• Pflanze (plant): animal-object instead of plant-object
  zart, fleischfressend, fressend, verändert, genmanipuliert, transgen, exotisch, selten, giftig, stinkend, wachsend, ...
• Nachwuchs (offspring): human-object instead of animal-object
  wissenschaftlich, qualifiziert, akademisch, eigen, talentiert, weiblich, hoffnungsvoll, geeignet, begabt, journalistisch, ...
• Café (café): art-con-geogr instead of nonmov-art-discrete (cf. Restaurant)
  Wiener, klein, türkisch, kurdisch, romanisch, cyber, philosophisch, besucht, traditionsreich, schnieke, gutbesucht, ...
• Neger (negro): animal-object instead of human-object
  weiß, dreckig, gefangen, faul, alt, schwarz, nackt, lieb, gut, brav
but:
• Skinhead (skinhead): human-object (ok)
  {16,17,18,19,20,21,22,23,30}jährig, gleichaltrig, zusammengeprügelt, rechtsradikal, brutal
In most cases the wrong class is semantically close. The evaluation metrics did not account for that.
Discussion of Results
Binary features:
• Precision >98% for most binary features
• Assigning the smaller class is hard for bias >0.9
Context types:
• verb-subject and verb-object are better than adjective
• verb-subject is the best single context for complex sorts
• combination always helps for binary features
Complex sorts:
• To do: a more lenient combination procedure to increase recall
Conclusion
• Method for semantic lexicon extension
• High precision for binary semantic features
• Unknown nouns:
  – For 3,755 nouns not in the lexicon, a total of 125,491 binary features were assigned.
  – For 1,041 unknown nouns, a complex sort was assigned.
• Combination to complex sorts yet to be improved
• Combination of different context types improves results
Any Questions?
Thank you very much!