
Empirical Approaches to Multilingual Lexical Acquisition

Lecturer: Timothy Baldwin


Empirical Approaches to Multilingual Lexical Acquisition Lecture 7 (18/7/2008)

Lecture 7

Learning Verb Syntax


Subcategorisation Frames

• “A subcategorisation (subcat) frame is a statement of what types of arguments a verb ... takes as objects, infinitives, that-clauses, participial clauses and subcategorised PPs” (Manning 1993):

    John wants Mary to be happy
    John hopes that Mary is happy
    *John wants that Mary is happy
    *John hopes Mary to be happy


Applications of Subcat Information

• Subcat information can lead to attachment disambiguation:

John put [the cactus] [on the table]

• Core component of type hierarchy in linguistically-precise grammars

• Empirical evidence for lexicalised subcat information improving the performance of statistical parsers, WSD systems, information extraction engines, etc.


From Grammar to Lexicon: Unsupervised Learning of Lexical Syntax

(Brent 1993)


Basic Method

1. Identify verb tokens through a variety of heuristics

2. For each verb type, use high-precision lexico-syntactic patterns to identify evidence for 6 different subcat frames

3. Use a statistical filter to remove noise in the extracted subcat data


Identification of Verb Tokens

• Very rough and heuristic: this was (just) before the days of reliable POS tagging

• Focus on base and present participial verb forms

• Problems in distinguishing between base-form verbs and singular nouns (e.g. record); the only workaround is a filter on the immediately preceding word


Lexico-syntactic Patterns

• Based on closed-class words (pronouns, determiners, complementisers, auxiliaries, punctuation)

• NPs captured in the form of pronouns or sequences of capitalised words

• VPs based on auxiliaries and the verbs learned in step 1
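To make the idea of closed-class cues concrete, here is a minimal sketch in Python. The word lists, the function names (`starts_np`, `frame_cue`) and the three frame labels are all hypothetical illustrations; they are not the actual patterns or the six frames of Brent (1993). The only detail taken from the slide is that an NP is approximated by a pronoun or a capitalised word.

```python
# Closed-class word lists; illustrative only, not taken from Brent (1993).
PRONOUNS = {"i", "you", "he", "she", "it", "we", "they",
            "me", "him", "her", "us", "them"}
DETERMINERS = {"the", "a", "an", "this", "these", "those"}


def starts_np(tokens, i):
    """Approximate an NP start: a pronoun or a capitalised word."""
    if i >= len(tokens):
        return False
    return tokens[i].lower() in PRONOUNS or tokens[i][:1].isupper()


def frame_cue(tokens, verb_index):
    """Return a hypothetical frame label cued by the words after the verb."""
    nxt = verb_index + 1
    if nxt >= len(tokens):
        return None
    if tokens[nxt].lower() == "that":
        after = nxt + 1
        # "that" followed by an NP start or a determiner suggests a clause.
        if starts_np(tokens, after) or (
            after < len(tokens) and tokens[after].lower() in DETERMINERS
        ):
            return "that-clause"
        return None
    if starts_np(tokens, nxt):
        # An NP followed by "to" suggests an NP + infinitive complement.
        if nxt + 1 < len(tokens) and tokens[nxt + 1].lower() == "to":
            return "NP + infinitive"
        return "NP"
    return None
```

Applied to the earlier examples, `frame_cue("John hopes that Mary is happy".split(), 1)` yields the that-clause label and `frame_cue("John wants Mary to be happy".split(), 1)` the NP + infinitive label. Cues this crude produce false evidence, which is what the statistical filter is meant to remove.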


Statistical Filtering (1)

• Assumption that the probability of false evidence for a given subcat frame S (e.g. transitive) occurring is equal for all verbs incompatible with S (e.g. snore, put, say, ...)

• NOTE: probability of false evidence (π_{−S}) constant for a given S but varies across different subcat frames

• Null hypothesis: the verb does not belong to subcat class S, i.e. it is −S


Statistical Filtering (2)

• Binomial test: the probability of an event with probability p occurring exactly m out of n times is given by

$$P(m, n, p) = \frac{n!}{m!\,(n-m)!}\, p^{m} (1-p)^{n-m}$$

• The probability of the event occurring m or more times out of n is given by

$$P(m+, n, p) = \sum_{i=m}^{n} P(i, n, p)$$
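The two formulas can be checked numerically with a short Python sketch. The function names `binom_pmf` and `binom_tail` are chosen here for illustration; the values n = 10 and p = 0.1 are the ones used in the lecture's worked table.

```python
from math import comb


def binom_pmf(m, n, p):
    """P(m, n, p): probability of exactly m successes in n trials."""
    return comb(n, m) * p**m * (1 - p) ** (n - m)


def binom_tail(m, n, p):
    """P(m+, n, p): probability of m or more successes in n trials."""
    return sum(binom_pmf(i, n, p) for i in range(m, n + 1))


# Print m, P(m, 10, 0.1) and P(m+, 10, 0.1) for the first few values of m.
for m in range(4):
    print(m, round(binom_pmf(m, 10, 0.1), 3), round(binom_tail(m, 10, 0.1), 3))
```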


m    n    P(m, n, p = 0.1)   P(m+, n, p = 0.1)
0    10   0.349              1.000
1    10   0.387              0.651
2    10   0.194              0.264
3    10   0.057              0.070
4    10   0.011              0.013
5    10   0.001              0.002
6    10   0.000              0.000
7    10   0.000              0.000
8    10   0.000              0.000
9    10   0.000              0.000
10   10   0.000              0.000


Statistical Filtering (3)

• Given n and p (= π_{−S}), we can apply a threshold θ to determine m such that verbs which occur with subcat frame S at least m times can be classified as +S with (1 − θ) confidence

• In practice we don’t know π_{−S} for each subcat frame S

  SOLUTION: set θ and n, and estimate p based on the histogram distribution around each m; select the p which best fits the binomial distribution
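The thresholding step can be sketched as follows, assuming p (the false-evidence rate) is already known. The function name `min_count` and the example values of θ are illustrative assumptions; estimating p from the histogram is not covered here.

```python
from math import comb


def binom_tail(m, n, p):
    """P(m+, n, p): probability of m or more successes in n trials."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(m, n + 1))


def min_count(n, p, theta):
    """Smallest m with P(m+, n, p) <= theta, or None if no m qualifies."""
    for m in range(n + 1):
        if binom_tail(m, n, p) <= theta:
            return m
    return None
```

With n = 10 and p = 0.1, a threshold of θ = 0.05 requires at least 4 occurrences of the frame before the null hypothesis is rejected. For very small n the function can return `None`: with n = 2 and p = 0.5, even two out of two occurrences are not enough at θ = 0.05.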


Shortcomings of the Brent Approach

• Assumption of π_{−S} being equal for all verbs given a class S shown to be flawed due to verb detection method

• Applicability of method to low-frequency words

• Scalability of method to other subcat frames


An Update on more Recent Research

• Greater coverage of subcat frames (up to 160)

• Simple frequency shown to be at least as effective as binomial test at filtering out noise

• Verb sense shown to interface closely with subcategorisation properties

• AND YET the Brent method still has remarkable currency to this day


Open Questions

• How to deal with low-frequency occurrences of subcat frames

• How well do the proposed methods port to other word classes (adjectives, nouns, ...) and languages

• Challenges for subcat acquisition in pro-drop languages (e.g. Japanese)


Alternations

(Baldwin and Bond 2002)


Definition of Alternation

• A regular mapping between argument positions in subcategorisation frames (generally assuming preservation of case-roles)

• Alternations involve at least one of:
  i. word order/(prepositional, case, etc.) marking variation between corresponding case slots
  ii. case slot deletion
  iii. case slot insertion


Example English Alternations

(1) Spray/load
    Kim loaded the truck with hay
    Kim loaded hay on the truck

(2) Dative
    Kim sold the car to Sandy
    Kim sold Sandy the car

(3) Causative
    The dog walks
    Kim walks the dog

(4) Middle
    Kim sliced the meat
    The meat sliced easily

Levin (1993)


Example Japanese Alternations

(5) Kim-ga   doa-o     akeru  /  doa-ga    aku
    Kim-nom  door-acc  opens  /  door-nom  opens
    ‘Kim opens the door’ / ‘The door opens’

(6) Kim-ga   doa-o     hiraku  /  doa-ga    hiraku
    Kim-nom  door-acc  opens   /  door-nom  opens
    ‘Kim opens the door’ / ‘The door opens’

(7) Kim-ga   doa-o     akeru  /  doa-ga    ake-rareru
    Kim-nom  door-acc  opens  /  door-nom  opens-pass
    ‘Kim opens the door’ / ‘The door is opened’


Types of Alternations (1)

• Analytical/diathesis: alternation unmarked on the verb (e.g. hiraku “open (trans.)” / hiraku “open (intrans.)”)

• Lexical: alternation marked on the verb stem by predictable lexical variation (e.g. akeru “open (trans.)” / aku “open (intrans.)”)

• Synthetic: alternation marked by verbal inflection or a verb morpheme (e.g. taberu “eat” / tabe-saseru “make eat”)


Types of Alternations (2)

• Cognitive: distinct verb forms but regularised pattern of alternation/simple change in focus, empathy, etc. (e.g. kau “buy” / uru “sell”)

• Focus on diathesis, lexical and synthetic in this research


Alternations and Verb Semantics

• Verbs with similar alternation behaviour shown to cluster together semantically

• Semantically-similar verbs shown to alternate similarly


• Example: verbs of contact:
  - conative alternation: Kim punched the wall / Kim punched at the wall
  - body-part possessor alternation: Kim hit Sandy’s finger / Kim hit Sandy on the finger
  - middle alternation: Kim cut the bread / The bread cut easily
  - Verb classes:

Alternation      Touch  Hit  Cut  Break
conative         N      Y    Y    N
body-part poss   Y      Y    Y    N
middle           N      N    Y    Y
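The table can be read as a lookup from alternation behaviour to verb class. A minimal Python sketch, in which the dictionary layout and the function name `compatible_classes` are illustrative choices and only the Y/N values come from the table:

```python
# Alternation profile per verb class, transcribed from the table above.
VERB_CLASSES = {
    "Touch": {"conative": False, "body-part poss": True, "middle": False},
    "Hit": {"conative": True, "body-part poss": True, "middle": False},
    "Cut": {"conative": True, "body-part poss": True, "middle": True},
    "Break": {"conative": False, "body-part poss": False, "middle": True},
}


def compatible_classes(observed):
    """Return the classes whose profile agrees with every observed judgement."""
    return [
        name
        for name, profile in VERB_CLASSES.items()
        if all(profile[alt] == value for alt, value in observed.items())
    ]
```

For instance, a verb that takes the conative but not the middle alternation is compatible only with the Hit class, while knowing only that a verb takes the middle alternation leaves both Cut and Break open.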


Alternation-based Lexicon Reconstruction

(Baldwin and Bond 2002)


Basic Method

• Use selectional preferences to automatically extract alternations from a Japanese-English valency dictionary

• Underlying hypothesis: selectional preferences on alternating slots are the same

• Focus on Japanese verbs

• Analyse both the success of the method and what alternations we unearth


The Bigger Picture

• Move from a flat Japanese–English transfer dictionary to a hierarchical, language-modular dictionary structure

• In each monolingual lexicon, maximise structure sharing through analysis of alternations

• Assume no pre-defined alternation set (cf. Levin (1993)), no supervision in alternation extraction


[Figure: hierarchical, language-modular dictionary structure. Senses of the Japanese verbs atsumeru and shuketsu-suru are shown alongside senses of the English verbs gather and recruit. Each sense has a BASE frame with argument slots (A, B, C) and alternation frames (ALT 1, ALT 2) labelled [-locative] and [caus_inch]. Slots carry selectional restrictions drawn from: AGENT, PERSON, ORGANIZATION, ANIMAL, INANIMATE, CONC_THING, EVIDENCE, PLACE, LOCATION.]

Frames shown in the figure:

Japanese: A-ga B-o C-ni/e atsumeru; A-ga B-o atsumeru; A-ga B-o C-ni/e shuketsu-suru; B-ga C-ni/e shuketsu-suru

English: A gathers B in C; A gathers B; B gathers in C; A recruits B


Source Dictionary

• Goi-Taikei Japanese–English valency dictionary

• Valency frame described in form of case frame headed by verb

• Each case slot annotated with:
  - set of prototypical case markers
  - POS (NP or S)
  - set of selectional restrictions (→ Goi-Taikei thesaurus)
  - set of lexical fillers


Constraints on Alternations

1. The selectional restrictions and lexical fillers on matching case slots are preserved under alternation

2. Alternations are monotonic in valency terms

3. A given alternation type has fixed direction: assume valency decreasing, and normalise direction alphabetically for valency-maintaining alternations (over-constraint)


Extraction Procedure

1. Generate all legal alternation candidates for each case frame pairing (S, T) where S and T share some common kanji prefix

2. Score each, and return the highest scoring from among them

3. Accept only non-negatively-scoring alternations

4. In case of a tie, select the alternation that preserves the most case marking
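Steps 2 to 4 amount to a filtered arg-max with a tie-breaker. A minimal sketch, assuming candidate generation and scoring have already been done; the tuple layout `(mapping, score, preserved_markers)` and the function name `select_alternation` are hypothetical, introduced only for illustration.

```python
def select_alternation(candidates):
    """Pick the best candidate from (mapping, score, preserved_markers) tuples.

    Candidates with a negative score are rejected (step 3); among the rest,
    the highest score wins (step 2), and ties are broken in favour of the
    candidate that preserves more case markers (step 4).
    """
    viable = [c for c in candidates if c[1] >= 0]
    if not viable:
        return None
    return max(viable, key=lambda c: (c[1], c[2]))[0]
```

Sorting on the pair `(score, preserved_markers)` keeps the tie-break from ever overriding a genuinely higher score.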


Scoring Alternations

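• Score linked case slots S and T according to their relative conceptual cohesion:

$$\mathrm{cohesion}(n_q) = -\log P(n_q) = -\log \frac{\sum_{lex_{p,i} \in n_q} \mathrm{freq}(lex_{p,i})}{\sum_{lex_{p,i} \in n_0} \mathrm{freq}(lex_{p,i})}$$

$$\mathrm{classmatch}(n_j, n_k) = 3\,\mathrm{cohesion}(\mathrm{sub}(n_j, n_k)) - \mathrm{cohesion}(n_j) - \mathrm{cohesion}(n_k)$$

• Sum up the individual scores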


classmatch(a, a) = 3 × 5.0 − 5.0 − 5.0 = 5.0

classmatch(a, c) = 3 × 0.9 − 0.9 − 5.0 = −3.2

classmatch(a, b) = 3 × 0 − 1.0 − 5.0 = −6.0

[Figure: fragment of the thesaurus hierarchy with a cohesion value at each node (0 at the root), marking the nodes a, b and c used in the classmatch examples.]
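The scoring functions can be sketched directly from their definitions. Two assumptions are made here: the logarithm is taken as natural (the slides do not state the base), and sub(nj, nk) is taken to be the most specific node subsuming both arguments, which is what the three worked examples suggest. To sidestep the thesaurus itself, `classmatch` below takes the three cohesion values as plain numbers.

```python
from math import log


def cohesion(node_freq, root_freq):
    """-log P(n): zero at the root, larger for more specific nodes."""
    return -log(node_freq / root_freq)


def classmatch(coh_j, coh_k, coh_sub):
    """3 * cohesion(sub(nj, nk)) - cohesion(nj) - cohesion(nk)."""
    return 3 * coh_sub - coh_j - coh_k
```

Plugging in the values from the examples, `classmatch(5.0, 5.0, 5.0)` gives 5.0 for identical classes, while `classmatch(5.0, 1.0, 0.0)` gives -6.0 when the two classes are related only through the root.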


Top 10 Extracted Alternations

Index  Case slot mapping
1      (NP1{ga}→φ) (NP2{o}→{ga})
2      (NP1{ga}) (NP2{o}→φ)
3      (NP1{ga}→φ) (NP2{o}→{ga}) (NP3{ni})
4      (NP1{ga}→φ) (NP2{o}→{ga}) (NP3{ni, e})
5      (NP1{ga}) (NP2{o}→φ) (NP3{ni}→{o})
6      (NP1{ga}) (NP2{o}) (NP3{ni}→φ)
7      (NP1{ga}) (NP2{o}→{kara, yori})
8      (NP1{ga}→φ) (NP2{o}→{ga}) (NP3{to, ni})
9      (NP1{ga}) (NP2{ni}→{o})
10     (NP1{ga}→φ) (NP2{o}→{ni}) (NP3{de}→{o})


Reflections

• Proposed method shown to be effective in extracting valid alternations

• Little sense of recall (although not necessarily important for the dictionary reconstruction process)

• Possibility for using translation information to improve the accuracy of the extraction method


A General Feature Space for Automatic Verb Classification

(Joanis and Stevenson 2003)


Basic Method

• Use alternations and general verbal features to classify verbs according to Levin (1993) classes

• Dodge the issue of alternation detection or subcat acquisition by relying on features which capture alternation effects only indirectly

• Supplement alternation-based features with various weak lexical semantic indicators


Syntactic Slot-based Features

• Frequency of different syntactic slots occurring with a verb (includes PPs, conditioned on P)

• Degree of lexical overlap between syntactic slots known to alternate

• Expletive pronouns/there
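The lexical overlap feature can be illustrated with a multiset comparison of the words filling two slots, for example the subject of intransitive uses and the object of transitive uses. The Jaccard-style normalisation below is an assumption made for this sketch and is not necessarily the measure used by Joanis and Stevenson (2003); the function name `slot_overlap` is likewise illustrative.

```python
from collections import Counter


def slot_overlap(fillers_a, fillers_b):
    """Multiset overlap between the head words seen in two syntactic slots.

    Returns shared token count divided by the size of the multiset union,
    so 0.0 means no shared fillers and 1.0 means identical distributions.
    """
    a, b = Counter(fillers_a), Counter(fillers_b)
    union = sum((a | b).values())
    return sum((a & b).values()) / union if union else 0.0
```

A verb whose intransitive subjects and transitive objects draw on the same nouns scores high on this feature, which hints at an alternation without detecting it explicitly.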


Tense, Voice and Aspect Features

• Relative frequency of passivisation

• POS (tense) of the verb

• Relative occurrence with modals/adverbials

• Relative occurrence in derived forms


Animacy Feature

• Relative occurrence of animate fillers (personal pronouns, person names) in each of the syntactic slots


Task

• 2/3-way classification of a range of verb classes:

  - benefactive vs. recipient verbs
  - admire vs. amuse verbs
  - run vs. sound emission verbs
  - cheat vs. cheat/steal verbs
  - wipe vs. cheat/steal verbs
  - spray/load vs. fill vs. other put verbs
  - run vs. change of state vs. object drop verbs

• Also combined multi-way tasks


Experiments

• Feature values extracted from BNC (parsed with SCOL)

• Focus on verbs which occur > 100 times in the BNC in only one of the classes under consideration (with the predominant sense), and which are not excessively polysemous

• C5.0 used as learner (decision tree-based)

• Varied results were obtained


Reflections

• General technique proposed for verbal classification, based partly on alternation behaviour

• Little sense of what works well for what class, or, e.g., whether selectional preferences aid the classifier

• Potential for improvement through subcat frame acquisition (remove independence of syntactic slots), explicit modelling of selectional preferences and a better parser


Decision Tree Learning


Constructing Decision Trees: ID3

• Basic method: construct decision trees in recursive divide-and-conquer fashion

FUNCTION ID3(Root)
  IF all instances at Root have the same class
  THEN stop
  ELSE
    Select a new attribute to use in partitioning the Root node instances
    Create a branch for each attribute value and partition up the Root node instances according to each value
    Call ID3(LEAF_i) for each leaf node LEAF_i

• Note: we may end up with non-pure leaves
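The pseudocode above can be turned into a short runnable Python sketch. It is run on the weather.nominal data used in this lecture and selects attributes by lowest mean information (equivalently, highest information gain), the criterion introduced later in the lecture. The nested-dictionary tree representation and the helper names are choices made for this illustration; when attributes run out, a non-pure leaf is labelled with its majority class.

```python
from collections import Counter
from math import log2

RAW = """
sunny hot high FALSE no
sunny hot high TRUE no
overcast hot high FALSE yes
rainy mild high FALSE yes
rainy cool normal FALSE yes
rainy cool normal TRUE no
overcast cool normal TRUE yes
sunny mild high FALSE no
sunny cool normal FALSE yes
rainy mild normal FALSE yes
sunny mild normal TRUE yes
overcast mild high TRUE yes
overcast hot normal FALSE yes
rainy mild high TRUE no
"""
COLUMNS = ["outlook", "temperature", "humidity", "windy", "play"]
ROWS = [dict(zip(COLUMNS, line.split())) for line in RAW.strip().splitlines()]


def entropy(labels):
    """Entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum(c / total * log2(c / total) for c in Counter(labels).values())


def id3(rows, attributes, target):
    """Return a class label (leaf) or {attribute: {value: subtree}}."""
    labels = [r[target] for r in rows]
    if len(set(labels)) == 1 or not attributes:
        # Pure node, or nothing left to split on: use the majority class.
        return Counter(labels).most_common(1)[0][0]

    def mean_info(attr):
        groups = {}
        for r in rows:
            groups.setdefault(r[attr], []).append(r[target])
        return sum(len(g) / len(rows) * entropy(g) for g in groups.values())

    # Lowest mean information is the same as highest information gain.
    best = min(attributes, key=mean_info)
    remaining = [a for a in attributes if a != best]
    return {
        best: {
            value: id3([r for r in rows if r[best] == value], remaining, target)
            for value in sorted({r[best] for r in rows})
        }
    }


tree = id3(ROWS, COLUMNS[:-1], "play")
```

On this data the sketch splits on outlook at the root, then on humidity under sunny and on windy under rainy, with overcast already pure.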


[Figure: decision tree for the weather data, with class counts and instance sets at each node]

outlook [YES 9, NO 5] {a,b,c,d,e,f,g,h,i,j,k,l,m,n}
  sunny [YES 2, NO 3] {a,b,h,i,k} → humidity
    high [YES 0, NO 3] {a,b,h}
    norm [YES 2, NO 0] {i,k}
  o’cast [YES 4, NO 0] {c,g,l,m}
  rainy [YES 3, NO 2] {d,e,f,j,n} → windy
    true [YES 0, NO 2] {f,n}
    false [YES 3, NO 0] {d,e,j}


Classifying Novel Instances

• Having constructed the decision tree, we classify novel instances by traversing down the tree and classifying according to the majority class at the deepest reachable point in the tree structure

• Complications:
  - unobserved attribute–value pairs
  - missing values


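[Figure: the same decision tree, shown with test instances to classify]

outlook [YES 9, NO 5]
  sunny [YES 2, NO 3] → humidity
    high [YES 0, NO 3]
    norm [YES 2, NO 0]
  o’cast [YES 4, NO 0]
  rainy [YES 3, NO 2] → windy
    true [YES 0, NO 2]
    false [YES 3, NO 0]

TEST DATA (outlook, temperature, humidity, windy):
  (sunny, hot, normal, FALSE)
  (rainy, hot, low, FALSE)
  (?, cool, high, TRUE)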
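The traversal rule, including the fallback to the majority class at the deepest reachable node, can be sketched as follows. The tree is written out by hand from the figure; the node layout and the names `node`, `TREE` and `classify` are illustrative assumptions. Both complications are handled the same way in this sketch: a missing value ("?") or an unobserved attribute value simply stops the descent.

```python
def node(yes, no, attr=None, children=None):
    """A tree node: class counts, the attribute tested, and child nodes."""
    return {"counts": {"yes": yes, "no": no}, "attr": attr,
            "children": children or {}}


# The weather decision tree, transcribed from the figure.
TREE = node(9, 5, "outlook", {
    "sunny": node(2, 3, "humidity", {"high": node(0, 3), "normal": node(2, 0)}),
    "overcast": node(4, 0),
    "rainy": node(3, 2, "windy", {"TRUE": node(0, 2), "FALSE": node(3, 0)}),
})


def classify(tree, instance):
    """Descend as far as possible, then return the majority class there."""
    current = tree
    while current["attr"] is not None:
        value = instance.get(current["attr"])
        child = current["children"].get(value)
        if child is None:  # missing ("?") or unobserved attribute value
            break
        current = child
    counts = current["counts"]
    return max(counts, key=counts.get)  # ties go to the first key, "yes"
```

Under this rule the third test instance, whose outlook is unknown, never leaves the root and so receives the root's majority class.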


Criterion for Attribute Selection

• Which is the best attribute?

  - want to get the smallest tree (Occam’s Razor; generalisability)

• Heuristic: choose the attribute that produces the “purest” nodes according to information gain (IG); information gain increases with the average purity of the subsets

• Strategy: choose the attribute that gives the greatest information gain

• NB standard vs. oblivious decision trees


Mean Information Associated with a Decision Stump

• We calculate the mean information for a tree stump whose attribute has m values as:

$$H(x_1, \ldots, x_m) = \sum_{i=1}^{m} P(x_i)\, H(x_i)$$

where H(x_i) is the entropy at node x_i
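As a numerical check, the mean information of a stump can be computed from the class counts at each branch. This is a minimal sketch; the function names and the representation of each branch as a (YES, NO) pair are assumptions made for illustration. The counts used are those of the outlook stump for the weather data in this lecture.

```python
from math import log2


def entropy(counts):
    """Entropy (in bits) of a node given its class counts."""
    total = sum(counts)
    return -sum(c / total * log2(c / total) for c in counts if c > 0)


def mean_information(branches):
    """Weighted average of branch entropies; weights are branch proportions."""
    total = sum(sum(b) for b in branches)
    return sum(sum(b) / total * entropy(b) for b in branches)


# outlook: sunny (2 yes, 3 no), overcast (4 yes, 0 no), rainy (3 yes, 2 no)
outlook = mean_information([(2, 3), (4, 0), (3, 2)])
```

Here P(x_i) is estimated as the fraction of instances that reach branch i, so the pure overcast branch contributes nothing and the result is roughly 0.6935 bits.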


Full weather.nominal Dataset

    Outlook   Temperature  Humidity  Windy  Play
a:  sunny     hot          high      FALSE  no
b:  sunny     hot          high      TRUE   no
c:  overcast  hot          high      FALSE  yes
d:  rainy     mild         high      FALSE  yes
e:  rainy     cool         normal    FALSE  yes
f:  rainy     cool         normal    TRUE   no
g:  overcast  cool         normal    TRUE   yes
h:  sunny     mild         high      FALSE  no
i:  sunny     cool         normal    FALSE  yes
j:  rainy     mild         normal    FALSE  yes
k:  sunny     mild         normal    TRUE   yes
l:  overcast  mild         high      TRUE   yes
m:  overcast  hot          normal    FALSE  yes
n:  rainy     mild         high      TRUE   no


Mean Information (outlook)

Root: YES 9, NO 5 (entropy .940)

outlook   YES  NO  entropy
sunny     2    3   .971
o’cast    4    0   0
rainy     3    2   .971

mean info = .693


Mean Information (temperature)

Root: YES 9, NO 5 (entropy .940)

temperature  YES  NO  entropy
hot          2    2   1.00
mild         4    2   .918
cool         3    1   .811

mean info = .911


Mean Information (humidity)

Root: YES 9, NO 5 (entropy .940)

humidity  YES  NO  entropy
high      3    4   .985
norm      6    1   .592

mean info = .788


Mean Information (windy)

Root: YES 9, NO 5 (entropy .940)

windy  YES  NO  entropy
false  6    2   .811
true   3    3   1.00

mean info = .892


Attribute Selection: Information Gain

• We determine which attribute R_A (with values x_1, ..., x_m) best partitions the instances at a given root node R according to information gain:

$$IG(R_A \mid R) = H(R) - \sum_{i=1}^{m} P(x_i)\, H(x_i)$$

IG(outlook|R) = 0.247
IG(temperature|R) = 0.029
IG(humidity|R) = 0.152
IG(windy|R) = 0.048
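The four information gain values can be reproduced from class counts alone. The sketch below uses (YES, NO) pairs counted from the weather.nominal dataset in this lecture; the names `information_gain`, `ROOT` and `SPLITS` are illustrative.

```python
from math import log2


def entropy(counts):
    """Entropy (in bits) of a node given its class counts."""
    total = sum(counts)
    return -sum(c / total * log2(c / total) for c in counts if c > 0)


def information_gain(root, branches):
    """H(R) minus the mean information of the stump's branches."""
    total = sum(root)
    mean_info = sum(sum(b) / total * entropy(b) for b in branches)
    return entropy(root) - mean_info


ROOT = (9, 5)  # (YES, NO) at the root
SPLITS = {
    "outlook": [(2, 3), (4, 0), (3, 2)],      # sunny, overcast, rainy
    "temperature": [(2, 2), (4, 2), (3, 1)],  # hot, mild, cool
    "humidity": [(3, 4), (6, 1)],             # high, normal
    "windy": [(6, 2), (3, 3)],                # FALSE, TRUE
}
gains = {attr: information_gain(ROOT, b) for attr, b in SPLITS.items()}
best = max(gains, key=gains.get)
```

Since outlook has the largest gain, it is chosen as the attribute to split on at the root.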


References

Baldwin, Timothy, and Francis Bond. 2002. Alternation-based lexicon reconstruction. In Proc. of the 9th International Conference on Theoretical and Methodological Issues in Machine Translation (TMI 2002), 1–11, Keihanna, Japan.

Brent, Michael R. 1993. From grammar to lexicon: Unsupervised learning of lexical syntax. Computational Linguistics 19.243–62.

Joanis, Eric, and Suzanne Stevenson. 2003. A general feature space for automatic verb classification. In Proc. of the 10th Conference of the EACL (EACL 2003), 163–70, Budapest, Hungary.

Levin, Beth. 1993. English Verb Classes and Alternations. Chicago, USA: University of Chicago Press.

Manning, Christopher D. 1993. Automatic acquisition of a large subcategorization dictionary from corpora. In Proc. of the 31st Annual Meeting of the ACL, 235–42.
