Semantic Role Labeling for Arabic using Kernel Methods

Mona Diab, Alessandro Moschitti, Daniele Pighin

What is SRL?

Proposition

[John]Agent [opened]Predicate [the door]Theme

[The door]Theme [opened]Predicate

• Grammatical function: "John" is the Subject and "the door" the Object of the first sentence; "The door" is the Subject of the second, yet its semantic role (Theme) does not change

• FrameNet: John = Agent, the door = Container_portal

• PropBank: John = ARG0, the door = ARG1

Why SRL?

• Useful for information extraction

• Useful for Question Answering

• Useful for Machine Translation?

Our Goal

Last Sunday India to official visit Rongji Zhu the-Chinese the-Ministers president started

The Chinese Prime Minister Zhu Rongji started an official visit to India [last Sunday]ARGM-TMP

RoadMap

• Arabic Characteristics

• Our Approach

• Experiments & Results

• Conclusions & Future Directions

Morphology

• Rich complex morphology

– Templatic, concatenative, derivational, inflectional

  • wbHsnAthm
  • w+b+Hsn+At+hm
  • and by virtue(s) their

– Verbs are marked for tense, person, gender, aspect, mood, voice

– Nominals are marked for case, number, gender, definiteness

• Orthography is underspecified for short vowels and consonant doubling (diacritics)

Syntax

Characteristics relevant for SRL

• Typical underspecification of short vowels masks morphological features such as case and agreement

– Example (unvocalized, ambiguous):
rjl Albyt Alkbyr
Man_masc the-house_masc the-big_masc
“the big man of the house” or “the man of the big house”

– Vocalized, adjective in the genitive:
rjlu Albyti Alkbyri
Man_masc-Nom the-house_masc-Gen the-big_masc-Gen
the man of the big house

– Vocalized, adjective in the nominative:
rjlu Albyti Alkbyru
Man_masc-Nom the-house_masc-Gen the-big_masc-Nom
the big man of the house

• Idafa constructions make indefinite nominals syntactically definite, hence allowing for agreement and therefore better scoping

– Example:
[rjlu Albyti] Alkbyru
Man_masc-Nom-Def the-house_masc-Gen the-big_masc-Nom-Def
the big man of the house

Characteristics relevant for SRL

• Passive constructions differ from English in that they cannot have an explicit non-instrument underlying subject; hence only ARG1 and ARG2 are possible, and ARG0 is not allowed.

– Example (ungrammatical):
*qutil [Emru]ARG1 [bslmY]ARG0
*[Amr]ARG1 was killed [by SalmA]ARG0

– Example (grammatical):
qutil [Emru]ARG1 [bslAHiK qAtliK]ARG2
[Amr]ARG1 was killed [by a deadly weapon]ARG2

Our Approach

Semantic Role Labeling Steps

• Given a sentence and an associated syntactic parse

• An SRL system identifies the arguments for a given predicate

• The arguments are identified in two steps
– Argument boundary detection
– Argument role classification

• For the overall system we apply a heuristic for argument label conflict resolution
– one label per argument
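
The two steps and the final conflict resolution can be sketched as follows. This is only an illustration under stated assumptions: `boundary_score` and `role_scores` are hypothetical stand-ins for the two trained classifiers, candidate nodes are represented by their token spans, and the conflict heuristic shown (keep the higher-scoring node when two detected nodes overlap) is an assumption, since the slides do not spell out the heuristic.

```python
from typing import Callable, Dict, List, Tuple

Span = Tuple[int, int]  # token span [start, end) covered by a parse-tree node


def overlaps(a: Span, b: Span) -> bool:
    """True when two token spans share at least one token."""
    return a[0] < b[1] and b[0] < a[1]


def label_arguments(
    candidates: List[Span],
    boundary_score: Callable[[Span], float],           # hypothetical step-1 classifier
    role_scores: Callable[[Span], Dict[str, float]],   # hypothetical step-2 classifier
) -> Dict[Span, str]:
    # Step 1: boundary detection keeps the nodes scored as arguments.
    detected = [s for s in candidates if boundary_score(s) > 0.0]

    # Conflict resolution (assumed heuristic): visit nodes by decreasing
    # boundary score and drop any node that overlaps one already kept.
    kept: List[Span] = []
    for span in sorted(detected, key=boundary_score, reverse=True):
        if not any(overlaps(span, other) for other in kept):
            kept.append(span)

    # Step 2: role classification assigns exactly one label per argument.
    labels: Dict[Span, str] = {}
    for span in kept:
        scores = role_scores(span)
        labels[span] = max(scores, key=scores.get)
    return labels
```

Any pair of scoring functions can be plugged in, for example the `get` methods of two dictionaries holding precomputed classifier outputs.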

The Sentence

The Chinese Prime Minister Zhu Rongji started an official visit to India last Sunday

The Parse Tree

Boundary Identification

Role Classification

Our Approach

• Experiment with different kernels

• Experiment with Standard Features (similar to English) and rich morphological features specific to Arabic

Different Kernels

• Polynomial Kernels (degrees 1-6) with standard features

• Tree Kernels

$$K(t_1, t_2) = \sum_{n_1 \in N_{t_1}} \sum_{n_2 \in N_{t_2}} \Delta(n_1, n_2)$$

where $N_{t_1}$ and $N_{t_2}$ are the sets of nodes in $t_1$ and $t_2$, and $\Delta(n_1, n_2)$ evaluates the common substructures rooted in $n_1$ and $n_2$
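
A tree kernel of this form can be sketched in a few lines. The sketch assumes the standard Collins and Duffy recursion for Δ, trees encoded as nested tuples `(label, child, ...)` with words as plain strings, and a decay factor `lam` that does not appear on the slides; the function names are illustrative and are not the SVM-Light-TK API.

```python
from functools import lru_cache


def nodes(tree):
    """Yield every labelled node of a nested-tuple tree (words are skipped)."""
    if isinstance(tree, tuple):
        yield tree
        for child in tree[1:]:
            yield from nodes(child)


def production(node):
    """Grammar production at a node, e.g. ('NP', ('D', 'N'))."""
    rhs = tuple(c[0] if isinstance(c, tuple) else ("word", c) for c in node[1:])
    return (node[0], rhs)


@lru_cache(maxsize=None)
def delta(n1, n2, lam=1.0):
    """Weighted number of common substructures rooted at n1 and n2."""
    if production(n1) != production(n2):
        return 0.0
    score = lam
    for c1, c2 in zip(n1[1:], n2[1:]):
        if isinstance(c1, tuple):  # word children add nothing beyond the match
            score *= 1.0 + delta(c1, c2, lam)
    return score


def tree_kernel(t1, t2, lam=1.0):
    """Sum delta over all pairs of nodes, one from each tree."""
    return sum(delta(n1, n2, lam) for n1 in nodes(t1) for n2 in nodes(t2))


t = ("VP", ("V", "delivers"), ("NP", ("D", "a"), ("N", "talk")))
print(tree_kernel(t, t))  # 17.0 with lam=1.0
```

With `lam=1.0` the kernel of a tree with itself is simply the number of its substructures, which is why the small tree above yields 17.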

Argument Structure Trees (AST)

[S [N Paul] [VP [V delivers] [NP [D a] [N talk]] [PP [IN in] [NP [jj formal] [N style]]]]]

Arg. 1: [NP [D a] [N talk]]

Defined as the minimal subtree encompassing the predicate and one of its arguments
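
The definition can be made concrete with a small sketch. It assumes trees encoded as nested tuples `(label, child, ...)`, nodes addressed by paths of tuple indices, and one reading of "minimal subtree": the lowest common ancestor of predicate and argument, keeping only the branches that lead to either of them. The function names are hypothetical.

```python
def subtree_at(tree, path):
    """Follow a path of tuple indices (1 = first child) down from the root."""
    for i in path:
        tree = tree[i]
    return tree


def prune(tree, paths):
    """Keep only the branches of `tree` that lead to one of the target paths."""
    if any(p == () for p in paths):
        return tree  # a target node is kept together with its whole subtree
    kept = []
    for i, child in enumerate(tree[1:], start=1):
        below = [p[1:] for p in paths if p[0] == i]
        if below:
            kept.append(prune(child, below))
    return (tree[0], *kept)


def argument_structure_tree(tree, pred_path, arg_path):
    # The longest common prefix of the two paths is the lowest common ancestor.
    k = 0
    while k < min(len(pred_path), len(arg_path)) and pred_path[k] == arg_path[k]:
        k += 1
    lca = subtree_at(tree, pred_path[:k])
    return prune(lca, [pred_path[k:], arg_path[k:]])


sentence = ("S", ("N", "Paul"),
            ("VP", ("V", "delivers"),
                   ("NP", ("D", "a"), ("N", "talk")),
                   ("PP", ("IN", "in"), ("NP", ("jj", "formal"), ("N", "style")))))

# Predicate "delivers" is at path (2, 1); Arg. 1 "a talk" is at path (2, 2).
print(argument_structure_tree(sentence, (2, 1), (2, 2)))
```

For the example sentence the PP "in formal style" is dropped, because it dominates neither the predicate nor the argument.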

Tree Substructure Representations

(Figure: example substructures of the tree [VP [V delivers] [NP [D a] [N talk]]], with lexical leaves and subtrees progressively removed.)

The overall set of AST substructures

(Figure: all substructures of the tree [VP [V delivers] [NP [D a] [N talk]]].)

Explicit feature space

$\vec{x} = (0, \ldots, 1, \ldots, 0, \ldots, 1, \ldots, 0, \ldots, 1, \ldots, 0)$

• $\vec{x} \cdot \vec{z}$ counts the number of common substructures

(Figure: example substructures of the tree for "delivers a talk".)
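
To illustrate what the explicit feature space looks like, the sketch below enumerates every substructure of a tree and stores the counts in a sparse vector, so that the dot product counts common substructures. It assumes trees encoded as nested tuples `(label, child, ...)` in which each preterminal has exactly one word child; a cut-off child is written as a one-element tuple `(label,)`. All names are illustrative.

```python
from collections import Counter
from itertools import product


def rooted_fragments(node):
    """All substructures rooted at `node`: each child is either cut or expanded."""
    if not isinstance(node[1], tuple):  # preterminal such as ('D', 'a')
        return [node]
    options = []
    for child in node[1:]:
        # (child[0],) is the child cut down to its bare label.
        options.append([(child[0],)] + rooted_fragments(child))
    return [(node[0], *combo) for combo in product(*options)]


def feature_vector(tree):
    """Sparse vector mapping each substructure to its number of occurrences."""
    counts = Counter()
    stack = [tree]
    while stack:
        node = stack.pop()
        counts.update(rooted_fragments(node))
        stack.extend(c for c in node[1:] if isinstance(c, tuple))
    return counts


def dot(x, z):
    """Dot product of two sparse count vectors."""
    return sum(v * z[f] for f, v in x.items())


x = feature_vector(("VP", ("V", "delivers"), ("NP", ("D", "a"), ("N", "talk"))))
print(len(x))  # 17 distinct substructures
```

The number of substructures grows exponentially with tree size, which is why a tree kernel evaluates this dot product implicitly rather than building the vectors.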

Standard Features

• Predicate: Lemmatization of the predicate
• Path: Syntactic path linking the predicate and an argument, e.g. NN↑NP↑VP↓VBD
• Partial Path: Path feature limited to the branching of the argument
• No Direction Path: Path without the traversal directions
• Phrase type
• Last and first POS of words in the arguments
• Verb subcategorization frame: production expanding the predicate parent node
• Position of the argument relative to predicate
• Syntactic Frame: positions of the surrounding NPs relative to predicate
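
As an illustration of the Path feature, the sketch below walks from the argument node up to the lowest common ancestor and then down to the predicate. It assumes trees encoded as nested tuples `(label, child, ...)`, nodes addressed by paths of tuple indices, and the usual ↑/↓ notation for the traversal direction; the function names are hypothetical.

```python
def labels_along(tree, path):
    """Labels of the nodes visited from the root down to the node at `path`."""
    out = [tree[0]]
    for i in path:
        tree = tree[i]
        out.append(tree[0])
    return out


def path_feature(tree, arg_path, pred_path):
    up = labels_along(tree, arg_path)
    down = labels_along(tree, pred_path)
    # Length of the shared prefix; up[k] and down[k] are both the label of
    # the lowest common ancestor.
    k = 0
    while k < min(len(arg_path), len(pred_path)) and arg_path[k] == pred_path[k]:
        k += 1
    rising = list(reversed(up[k:]))   # argument ... common ancestor
    falling = down[k + 1:]            # below the ancestor ... predicate
    return "↑".join(rising) + "".join("↓" + label for label in falling)


tree = ("S", ("NP", ("NNP", "John")),
        ("VP", ("VBD", "opened"), ("NP", ("DT", "the"), ("NN", "door"))))

print(path_feature(tree, (2, 2, 2), (2, 1)))  # NN↑NP↑VP↓VBD
```

Dropping the arrows from the returned string gives a direction-free variant, and keeping only the rising part gives a partial path.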

Extended Features for Arabic

Definiteness, Number, Gender, Case, Mood, Person, Lemma (vocalized), English Gloss, Unvocalized surface form, Vocalized surface form

• Expanded the leaf nodes in AST with 10 attribute value pairs creating EAST
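
One way to picture the expansion is sketched below. The encoding is an assumption made for illustration only: trees are nested tuples `(label, child, ...)`, the `morphology` dictionary is a hypothetical lookup from a surface form to its attribute-value pairs, and each word leaf is replaced by one child node per attribute. The slides do not show how the EAST nodes are actually laid out.

```python
def expand_leaves(tree, morphology):
    """Replace each word leaf with one (attribute, value) node per feature."""
    label, *children = tree
    if len(children) == 1 and isinstance(children[0], str):
        attrs = morphology.get(children[0])
        if not attrs:
            return tree  # no features known for this word: leave it untouched
        return (label, *((name, value) for name, value in sorted(attrs.items())))
    return (label, *(expand_leaves(child, morphology) for child in children))


# Illustrative values only, following the "rjlu ... Man_masc-Nom" example.
morphology = {"rjl": {"Gender": "masc", "Case": "Nom", "English Gloss": "man"}}
ast = ("NP", ("NN", "rjl"), ("NN", "Albyt"))

print(expand_leaves(ast, morphology))
```

Because the attributes become ordinary tree nodes, a tree kernel can match two leaves that share, say, case and gender even when their surface forms differ.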

Arabic AST

Sample AST from our example

ARG0

Extended AST (EAST)

Experiments & Results

Experimental Set Up

• SemEval 2007 Task 18 data set, Pilot Arabic Propbank

• 95 most frequent verbs in ATB3v2

• Gold parses, Unvowelized, Bies reduced POS tag set (25 tags)

• Num Sentences: Dev (886), Test (902), Train (8402)

• 26 role types (5 numbered ARGs)

• Experimented only with 350k examples

• We use the SVM-Light-TK toolkit (Moschitti, 2004, 2006) with the default SVM-light parameters

• Evaluation metrics of precision, recall and F measure are obtained using the CoNLL evaluator
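
For reference, the three metrics can be computed as in the sketch below. This is not the CoNLL evaluator itself, only an illustration that assumes an argument counts as correct when both its span and its label match the gold annotation; the function name is hypothetical.

```python
def precision_recall_f1(gold, predicted):
    """Score predicted (span, label) pairs against the gold ones."""
    gold, predicted = set(gold), set(predicted)
    correct = len(gold & predicted)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


gold = {((0, 6), "ARG0"), ((7, 10), "ARG1"), ((13, 15), "ARGM-TMP")}
predicted = {((0, 6), "ARG0"), ((7, 10), "ARG2")}

print(precision_recall_f1(gold, predicted))
```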

Boundary Detection Results

Role Classification Results

Overall Results

Observations – BD

• AST and EAST don’t differ much for boundary detection

• AST + EAST + Poly (3) gives best BD results

• AST and EAST perform significantly better than Poly (1)
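
A combination such as AST + EAST + Poly (3) can be read as a sum of kernels, which is again a valid kernel. The sketch below assumes an unweighted sum, a polynomial kernel of the form (x·z + c)^d with c = 1, and examples stored as dictionaries with hypothetical keys; the slides do not state how the kernels were weighted or normalized.

```python
def poly_kernel(x, z, degree=3, c=1.0):
    """Polynomial kernel (x . z + c) ** degree over dense feature vectors."""
    return (sum(a * b for a, b in zip(x, z)) + c) ** degree


def combined_kernel(ex1, ex2, tree_kernels, degree=3):
    """Sum a polynomial kernel on 'vec' with one tree kernel per named tree."""
    total = poly_kernel(ex1["vec"], ex2["vec"], degree)
    for name, kernel in tree_kernels.items():
        total += kernel(ex1[name], ex2[name])
    return total


def exact_match(a, b):
    """Toy stand-in for a real tree kernel."""
    return 1.0 if a == b else 0.0


ex1 = {"vec": [1, 0, 1], "ast": "tree A", "east": "tree B"}
ex2 = {"vec": [1, 1, 0], "ast": "tree A", "east": "tree C"}

print(combined_kernel(ex1, ex2, {"ast": exact_match, "east": exact_match}))
```

In practice `exact_match` would be replaced by a real tree kernel applied to the AST and EAST of each example.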

Observations – RC & SRL

Conclusions

• Explicitly encoding the rich morphological features helps with SRL in Arabic

• Tree kernels are indeed a feasible way of dealing with large feature spaces that are structural in nature

• Combining kernels yields better results

Future Directions

Thank You

The parse tree