Acquisition of Semantic Classes for Adjectives from Distributional Evidence


Page 1: Acquisition of Semantic Classes for Adjectives from Distributional Evidence

Acquisition of Semantic Classes for Adjectives from Distributional Evidence

Gemma Boleda
Universitat Pompeu Fabra
Barcelona

Page 2: Acquisition of Semantic Classes for Adjectives from Distributional Evidence

general picture

• automatic classification of adjectives
  – Catalan
• according to broad semantic characteristics
• clustering
  – syntactic evidence

Page 3: Acquisition of Semantic Classes for Adjectives from Distributional Evidence

motivation

• Lexical Acquisition
  – infer properties of words
  – lexical bottleneck
• both symbolic and statistical approaches
• adjectives
  – determining NP reference
    • the French general
  – establishing properties of entities
    • this maimai is round and sweet

Page 4: Acquisition of Semantic Classes for Adjectives from Distributional Evidence

motivation

• initial motivation: POS-tagging
  – 55% remaining ambiguity involves adjectives
    general francès: ‘French general’ or ‘general French’?
• observations
  – general tendencies in syntactic behaviour of adjectives
  – ... which correspond to broad semantic properties
• generalisation: best at semantic level
  – low-level tasks (POS-tagging)
  – initial schema for lexical semantic representation

Page 5: Acquisition of Semantic Classes for Adjectives from Distributional Evidence

approach

• no general, well established semantic classification
  – have to build and test ours!
• clustering: unsupervised technique
  – groups objects according to feature distribution
  – does not depend on pre-classification
  – provides insight into the nature of the data
• shallow approach to syntax: n-grams
  – limited syntactic distribution
  – local relationship to arguments
  => test feasibility

rodó ‘round’        0.4   0.4   0.2
dolç ‘sweet’        0.5   0.4   0.1
francès ‘French’    0.1   0.6   0.3
italià ‘Italian’    0.05  0.5   0.45
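The idea of grouping adjectives by feature distribution can be illustrated with the four example vectors on this slide. The sketch below is only an illustration: the three columns are unlabelled in the slide, so they are treated as anonymous features, and cosine similarity is used because it is the measure named later for k-means. The helper names `cosine` and `most_similar` are made up for this example.

```python
import math

# Feature distributions copied from the slide. The column meanings are not
# given there, so they are treated as three anonymous features.
vectors = {
    "rodó": [0.4, 0.4, 0.2],
    "dolç": [0.5, 0.4, 0.1],
    "francès": [0.1, 0.6, 0.3],
    "italià": [0.05, 0.5, 0.45],
}

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def most_similar(word):
    """Return the other adjective whose vector is closest to `word`."""
    others = (w for w in vectors if w != word)
    return max(others, key=lambda w: cosine(vectors[word], vectors[w]))

for word in vectors:
    print(word, "->", most_similar(word))
```

With these numbers, rodó and dolç pick each other as nearest neighbours, and so do francès and italià, which is the kind of grouping a clustering algorithm would be expected to recover.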

Page 6: Acquisition of Semantic Classes for Adjectives from Distributional Evidence

outline

• adjective syntax and semantic classification
• methodology
• experiment 1
• experiment 2
• partial conclusions
• outlook: rest of the thesis

Boleda, Badia, Batlle (2004)

Page 7: Acquisition of Semantic Classes for Adjectives from Distributional Evidence

outline

• adjective syntax and semantic classification
• methodology
• experiment 1
• experiment 2
• partial conclusions
• outlook: rest of the thesis

Page 8: Acquisition of Semantic Classes for Adjectives from Distributional Evidence

adjective syntax

• default function: noun modifier (92%)
  – right of the noun (default position: 72%)
  – some to the left (‘epithets’: 28%)

• predicative uses infrequent (7%), but significant

Page 9: Acquisition of Semantic Classes for Adjectives from Distributional Evidence

two-way classification

• number of arguments
  – unary: pilota vermella ‘red ball’
  – binary: professor gelós de la Maria ‘teacher jealous of Maria’
• ontological kind (Ontological Semantics)
  – basic: vermell ‘red’
  – object: malaltia pulmonar ‘pulmonary disease’ (=> lung)
  – event: propietat constitutiva ‘constitutive property’ (=> constitutes)

Page 10: Acquisition of Semantic Classes for Adjectives from Distributional Evidence

Ontological Semantics

• coverage (ordinary cases)
• machine tractability
• explicit model of world: ontology
  – vermell => attribute::colour::red(x)
  – pulmonar => related-to::lung(x)
  – constitutiu => event::benef::constitute(x)

• however: no commitment to particular framework

Page 11: Acquisition of Semantic Classes for Adjectives from Distributional Evidence

rationale

• observation: syntactic preferences correspond to semantic properties

• hypothesis: we can use syntactic features to infer semantic classes

Page 12: Acquisition of Semantic Classes for Adjectives from Distributional Evidence

outline

• adjective syntax and semantic classification
• methodology
• experiment 1
• experiment 2
• conclusions and future work

Page 13: Acquisition of Semantic Classes for Adjectives from Distributional Evidence

data and procedure

• 2283 adjectives occurring more than 50 times in a 16-million-word Catalan corpus
• lemma and morphological info
• cluster the whole set
  – perform different tasks on different subsets
    • tuning subset: choose features
    • Gold Standard: evaluation and analysis

Page 14: Acquisition of Semantic Classes for Adjectives from Distributional Evidence

features and feature selection

• features:
  – empirically chosen from blind distribution
  – double bigram, simplified POS-representation

ella  diu   que   la    pilota  vermella  és    seva
she   says  that  the   ball    red       is    hers
            -3ey  -2dd  -1cn              +1ve

• tuning subset: 100 adjectives
  – choose features (distribution)
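The positional POS features in the example sentence can be produced mechanically from a tagged sentence. The following is a minimal sketch under stated assumptions: the function name `context_features` and the window sizes (three tags to the left, one to the right) are chosen only to reproduce the slide's example, the tag `aj` for the target adjective is assumed, and `xx` stands in for tags the slide does not give.

```python
def context_features(tags, target, left=3, right=1):
    """Return position-prefixed POS features around tags[target].

    Positions that fall outside the sentence are skipped.
    """
    features = []
    offsets = list(range(-left, 0)) + list(range(1, right + 1))
    for offset in offsets:
        position = target + offset
        if 0 <= position < len(tags):
            # "+d" always prints the sign, giving "-3", "-1", "+1", ...
            features.append(f"{offset:+d}{tags[position]}")
    return features

# ella diu que la pilota vermella és seva
# 'ey', 'dd', 'cn', 've' come from the slide; 'aj' is assumed for the target
# adjective and 'xx' is a placeholder for tags the slide does not show.
tags = ["xx", "xx", "ey", "dd", "cn", "aj", "ve", "xx"]
features = context_features(tags, target=5)
print(features)
```

Counting such features over every occurrence of an adjective in the corpus would give the per-adjective distribution that is then clustered.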

Page 15: Acquisition of Semantic Classes for Adjectives from Distributional Evidence

Fig. A: Feature selection

Page 16: Acquisition of Semantic Classes for Adjectives from Distributional Evidence

analysis

• Gold Standard
  – 80 adjectives
  – annotated by 3 human judges, acceptable agreement (92 and 84%, .72 and .74 kappa)
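Agreement figures of this kind pair a raw percentage with a chance-corrected kappa. The slides do not say which kappa variant was used; the sketch below assumes Cohen's kappa for two annotators, and the label lists are invented purely to show the arithmetic.

```python
from collections import Counter

def observed_agreement(labels_a, labels_b):
    """Share of items on which the two annotators give the same label."""
    return sum(a == b for a, b in zip(labels_a, labels_b)) / len(labels_a)

def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa: agreement corrected for chance agreement.

    Undefined (division by zero) if expected agreement is exactly 1.
    """
    n = len(labels_a)
    p_o = observed_agreement(labels_a, labels_b)
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    # Chance agreement: product of each annotator's label proportions.
    p_e = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (p_o - p_e) / (1 - p_e)

# Invented toy annotations, not data from the study.
judge_1 = ["unary", "unary", "unary", "binary"]
judge_2 = ["unary", "unary", "binary", "binary"]
print(observed_agreement(judge_1, judge_2), cohen_kappa(judge_1, judge_2))
```

In the toy case the judges agree on 3 of 4 items (0.75), chance agreement is 0.5, and kappa is therefore 0.5, which shows why kappa is always read alongside the raw percentage.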

Page 17: Acquisition of Semantic Classes for Adjectives from Distributional Evidence

outline

• adjective syntax and semantic classification
• methodology
• experiment 1
• experiment 2
• partial conclusions
• outlook: rest of the thesis

Page 18: Acquisition of Semantic Classes for Adjectives from Distributional Evidence

experiment 1: unary / binary

• final evaluation: 10 features, raw percentage
  – clustering algorithm: k-means (cosine)
• predictions:
  – binary adjectives cooccur with prepositions more frequently than unary ones
  – unary adjectives are more flexible
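The slide names k-means with cosine similarity but not the toolkit or settings used. As a rough illustration only, here is a small pure-Python version (often called spherical k-means). The function names, the initialisation from the first k points, and the two-feature toy data (loosely "share of +1prep" and "share of -1cn") are all assumptions made for this sketch.

```python
import math

def normalise(v):
    """Scale a vector to unit length (assumes it is not all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cosine_kmeans(points, k, iterations=20):
    """Cluster vectors by cosine similarity.

    Centroids start at the first k points, so the result is deterministic.
    """
    unit = [normalise(p) for p in points]
    centroids = [list(u) for u in unit[:k]]
    labels = [0] * len(unit)
    for _ in range(iterations):
        # Assignment step: on unit vectors the dot product is the cosine.
        labels = [
            max(range(k), key=lambda c: sum(a * b for a, b in zip(u, centroids[c])))
            for u in unit
        ]
        # Update step: mean direction of each non-empty cluster.
        for c in range(k):
            members = [u for u, label in zip(unit, labels) if label == c]
            if members:
                mean = [sum(col) / len(members) for col in zip(*members)]
                centroids[c] = normalise(mean)
    return labels

# Invented toy data: [share of +1prep, share of -1cn] per adjective.
toy = [[0.05, 0.60], [0.40, 0.20], [0.04, 0.55], [0.45, 0.25]]
labels = cosine_kmeans(toy, k=2)
print(labels)
```

On the toy data the adjectives with few following prepositions end up in one cluster and those with many in the other, mirroring the first prediction above.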

Page 19: Acquisition of Semantic Classes for Adjectives from Distributional Evidence

unary / binary: results

• agreement with Gold Standard:
  – 97%, kappa = 0.87
  – comparable to humans
• features:

cl       high    low
0 (un)   -1cn    +1prep
1 (bin)  +1prep  (-1cn)

Fig. B: Clusters vs. unary/binary (legend: unary = yellow, binary = red)

Page 20: Acquisition of Semantic Classes for Adjectives from Distributional Evidence

outline

• adjective syntax and semantic classification
• methodology
• experiment 1
• experiment 2
• partial conclusions
• outlook: rest of the thesis

Page 21: Acquisition of Semantic Classes for Adjectives from Distributional Evidence

experiment 2: basic / object / event

• final evaluation: 32 features, normalisation
  – clustering algorithm: k-means (cosine)
• predictions:
  – basic adjectives are flexible, work as epithets, occur in predicative contexts, appear further from the noun
  – object adjectives appear rigidly after the noun
  – event adjectives tend to occur in predicative positions and do not act as epithets
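The slides mention "raw percentage" for experiment 1 and "normalisation" for experiment 2 without giving formulas. The sketch below shows one plausible reading, stated here as an assumption: counts are first turned into per-adjective shares, and each feature column is then standardised (z-scores) across adjectives. The function names and the counts are invented for illustration.

```python
import statistics

def to_percentages(counts):
    """Turn one adjective's feature counts into shares that sum to 1."""
    total = sum(counts)
    return [c / total for c in counts]

def standardise_columns(rows):
    """Z-score each feature column across all adjectives.

    A column with zero spread is mapped to 0.0 to avoid dividing by zero.
    """
    columns = list(zip(*rows))
    means = [statistics.mean(col) for col in columns]
    stdevs = [statistics.pstdev(col) for col in columns]
    return [
        [(x - m) / s if s else 0.0 for x, m, s in zip(row, means, stdevs)]
        for row in rows
    ]

# Invented counts for two adjectives over three features.
counts = [[2, 6, 2], [4, 2, 4]]
shares = [to_percentages(row) for row in counts]
z_scores = standardise_columns(shares)
print(shares)
print(z_scores)
```

Standardising puts rare and frequent features on the same scale, which matters more as the feature set grows; whether this is the normalisation actually applied in the experiment is not stated in the slides.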

Page 22: Acquisition of Semantic Classes for Adjectives from Distributional Evidence

basic / object / event: results

• agreement with Gold Standard:
  – 73%, kappa = 0.56
  – lower than humans
• features:

cl       high    low
0 (obj)  -1cn    -1ve
1 (ev)   +1prep
2 (bas)  -1co    +1aj

Fig. C: Clusters vs. basic/event/object (legend: object = yellow, event = orange, basic = red)

Page 23: Acquisition of Semantic Classes for Adjectives from Distributional Evidence

basic/object/event: error analysis

• something has gone wrong!
  – characterisation of event adjectives

Fig. C: Clusters vs. basic/event/object

Fig. D: Clusters vs. unary/binary

Figure annotations: binary!; unary event adjectives; basic adjectives with an object reading (polysemy); binary event adjectives

Page 24: Acquisition of Semantic Classes for Adjectives from Distributional Evidence

outline

• adjective syntax and semantic classification
• methodology
• experiment 1
• experiment 2
• partial conclusions
• outlook: rest of the thesis

Page 25: Acquisition of Semantic Classes for Adjectives from Distributional Evidence

partial conclusions

• overall, results seem to back up:
  – use of syntax-semantics interface for adjectives
  – linguistic predictions as to relevant features and differences across classes
  – shallow approach
• unary / binary: piece of cake
  – few binary adjectives, but worth spotting (denote relationships)

Page 26: Acquisition of Semantic Classes for Adjectives from Distributional Evidence

partial conclusions

• basic / object / event: need reworking
  – object adjectives seem to be the most robust class
  – variation in basic adjectives (default class), polysemy
  – event adjectives: seem to behave much like basic adjectives with respect to features chosen => redefine class?

Page 27: Acquisition of Semantic Classes for Adjectives from Distributional Evidence

outline

• adjective syntax and semantic classification
• methodology
• experiment 1
• experiment 2
• partial conclusions
• outlook: rest of the thesis

Page 28: Acquisition of Semantic Classes for Adjectives from Distributional Evidence

outlook: rest of the thesis

• rethink classification
• redefine features in light of results
• integrate polysemy judgments into the experiment and analysis
• perform experiments with other corpora

Page 29: Acquisition of Semantic Classes for Adjectives from Distributional Evidence

classification

• what to do with event adjectives? cp.:
  – constitutiu ‘constitutive’ (“active”)
  – legible ‘readable’ (“passive”)
  – reproductor ‘reproducing’ (“active, habituality”)
• yet another parameter: gradability
  – important for adjectives
  – should be easy to induce

Page 30: Acquisition of Semantic Classes for Adjectives from Distributional Evidence

better blind distribution or self-defined features?

empirical accurate sparseness objective

blind X X ?

self X?(depends on method)

X

• n-grams: sparseness, selection

• other features?
  – account for different levels of description

Page 31: Acquisition of Semantic Classes for Adjectives from Distributional Evidence

polysemy

• crucial aspect, explains much of results
• difficult to integrate!
  – meaningless kappa values
• alternatives?
  – clearer definition of polysemy within task
  – specific tests
  – other resources: dictionary?

Page 32: Acquisition of Semantic Classes for Adjectives from Distributional Evidence

other resources

• CUCWeb (208 million words): http://www.catedratelefonica.upf.es
• test whether “more data is better data” (Mercer and Church 1993: 18-19)
  – advantages and challenges of Web corpora
• current results: for verb subcategorisation experiment, results 12 points lower than using smaller, balanced, controlled corpus
