A Transition-Based Directed Acyclic Graph Parser for Universal … · 2020. 9. 11. · 2 TUPA —...


A Transition-Based Directed Acyclic Graph Parser for Universal Conceptual Cognitive Annotation

Daniel Hershcovich, Omri Abend and Ari Rappoport

Tel Aviv University
January 9, 2018


TUPA: Transition-based UCCA Parser

The first parser to support the combination of three properties:

1. Non-terminal nodes: entities and events over the text
2. Reentrancy: allow argument sharing
3. Discontinuity: conceptual units are split

These are needed for many semantic schemes (e.g. AMR, UCCA).

Example: You want to take a long bath


Introduction


Linguistic Structure Annotation Schemes

• Syntactic dependencies

• Semantic dependencies (Oepen et al., 2016)

• AMR (Banarescu et al., 2013)

• UCCA (Abend and Rappoport, 2013)

• Other semantic representation schemes (see the recent survey by Abend and Rappoport, 2017)

Abstract away from syntactic detail that does not affect meaning:

... bathed = ... took a bath


Syntactic Dependencies

• Bilexical tree: syntactic structure representation.
• Fast and accurate parsers (e.g. transition-based).

[Figure: dependency tree over "You want to take a long bath" with edge labels root, nsubj, xcomp, mark, dobj, det, amod]

Non-projectivity (discontinuity) is a challenge (Nivre, 2009).

[Figure: dependency tree over "A hearing is scheduled on the issue today" with edge labels root, det, nsubj:pass, aux:pass, case, det, nmod, nmod:tmod]
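To make the non-projectivity point concrete, here is a minimal sketch that tests whether a dependency tree has crossing arcs. The head indices are an assumed analysis that is consistent with the edge labels listed on the slide; the extracted text does not give the attachments themselves, and the function names are hypothetical.

```python
from typing import List, Tuple

Arc = Tuple[int, int]


def crossing_arcs(heads: List[int]) -> List[Tuple[Arc, Arc]]:
    """Return all pairs of arcs that cross.

    heads[i] is the 1-based position of the head of token i+1; 0 marks the root.
    """
    # Store every arc (including the arc from the artificial root at position 0)
    # by its left and right endpoints.
    arcs = [(min(h, d), max(h, d)) for d, h in enumerate(heads, start=1)]
    crossings = []
    for i, (a, b) in enumerate(arcs):
        for c, d in arcs[i + 1:]:
            # Two arcs cross when exactly one endpoint of one arc lies
            # strictly inside the other arc.
            if a < c < b < d or c < a < d < b:
                crossings.append(((a, b), (c, d)))
    return crossings


def is_projective(heads: List[int]) -> bool:
    return not crossing_arcs(heads)


# Assumed heads for "You want to take a long bath".
bath_heads = [2, 0, 4, 2, 7, 7, 4]
# Assumed heads for "A hearing is scheduled on the issue today":
# "issue" (7) attaches to "hearing" (2), "today" (8) to "scheduled" (4).
hearing_heads = [2, 4, 4, 0, 7, 7, 2, 4]

print(is_projective(bath_heads))
print(is_projective(hearing_heads))
print(crossing_arcs(hearing_heads))
```

Under these assumed heads, the arc between "hearing" and "issue" crosses the arc between "scheduled" and "today", which is the kind of discontinuity a plain stack-based parser cannot build without an extra operation such as a swap.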


Semantic Dependencies

• Bilexical graph: predicate-argument representation.
• Derived from theories of syntax-semantics interface.

[Figure: dependency graph over "You want to take a long bath" with edge labels top, ARG1, ARG2, BV]

DELPH-IN MRS-derived bi-lexical dependencies (DM).

[Figure: dependency graph over "After graduation , Joe moved to Paris" with edge labels top, TWHEN, ACT-arg, DIR3-arg]

Prague Dependency Treebank tectogrammatical layer (PSD).


The UCCA Semantic Representation Scheme


Universal Conceptual Cognitive Annotation (UCCA)

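[Figure: UCCA graph for "After graduation, Joe moved to Paris". Edge labels on terminals: After L, graduation P, "," U, Joe A, moved P, to R, Paris C. Remaining edge labels: H, A, H, A. Solid lines are primary edges, dashed lines are remote edges.]

Edge categories:

P process       S state          A participant
L linker        H linked scene   C center
E elaborator    D adverbial      R relator
N connector     U punctuation    F function
G ground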


The UCCA Semantic Representation Scheme

• Cross-linguistically applicable (Abend and Rappoport, 2013).
• Stable in translation (Sulem et al., 2015).
• Fast and intuitive to annotate (Abend et al., 2017).
• Facilitates MT human evaluation (Birch et al., 2016).

[Figure: parallel examples in English and Hebrew]


Graph Structure

UCCA generates a directed acyclic graph (DAG).
Text tokens are terminals, complex units are non-terminal nodes.
Remote edges enable reentrancy for argument sharing.
Phrases may be discontinuous (e.g., multi-word expressions).

[Figure: UCCA graph for "You want to take a long bath". Edge labels as drawn: A (You), P (want), F (to), C (take), F (a), C (bath), plus P, A, A and D on the remaining edges. Solid lines are primary edges, dashed lines are remote edges.]

Edge categories: P process, A participant, C center, D adverbial, F function
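The three structural properties can be illustrated with a small data structure. The sketch below is not the UCCA toolkit's API; the class, the node names "root", "scene" and "process", and the exact attachments are assumptions inferred from the edge labels in the figure, and edges attach directly to token positions for brevity.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple, Union

Node = Union[int, str]  # int: token position (terminal); str: name of a non-terminal unit


@dataclass
class UccaGraph:
    tokens: List[str]
    # Each edge is (parent, child, label, remote).
    edges: List[Tuple[Node, Node, str, bool]] = field(default_factory=list)

    def add(self, parent: Node, child: Node, label: str, remote: bool = False) -> None:
        self.edges.append((parent, child, label, remote))

    def terminal_yield(self, node: Node) -> List[int]:
        """Token positions reachable from node through primary edges only."""
        if isinstance(node, int):
            return [node]
        positions: Set[int] = set()
        for parent, child, _label, remote in self.edges:
            if parent == node and not remote:
                positions.update(self.terminal_yield(child))
        return sorted(positions)

    def is_discontinuous(self, node: Node) -> bool:
        """True when the unit's tokens do not form one contiguous span."""
        span = self.terminal_yield(node)
        return bool(span) and span[-1] - span[0] + 1 != len(span)

    def reentrant_nodes(self) -> Set[Node]:
        """Nodes with more than one incoming edge (primary or remote)."""
        parents: Dict[Node, int] = {}
        for _parent, child, _label, _remote in self.edges:
            parents[child] = parents.get(child, 0) + 1
        return {node for node, count in parents.items() if count > 1}


g = UccaGraph("You want to take a long bath".split())
g.add("root", 0, "A")                  # You
g.add("root", 1, "P")                  # want
g.add("root", "scene", "A")            # the embedded scene is a participant
g.add("scene", 0, "A", remote=True)    # remote edge: "You" is shared
g.add("scene", 2, "F")                 # to
g.add("scene", "process", "P")         # assumed unit "take ... a ... bath"
g.add("scene", 5, "D")                 # long
g.add("process", 3, "C")               # take
g.add("process", 4, "F")               # a
g.add("process", 6, "C")               # bath

print(g.terminal_yield("process"))
print(g.is_discontinuous("process"))
print(g.reentrant_nodes())
```

In this assumed analysis the unit "process" covers token positions 3, 4 and 6 and is therefore discontinuous, while the token "You" has two parents, one of them through a remote edge.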


Transition-based UCCA Parsing


Transition-Based Parsing

First used for dependency parsing (Nivre, 2004).
Parse text w1 ... wn to graph G incrementally by applying transitions to the parser state: stack, buffer and constructed graph.

Initial state:

stack:
buffer: You want to take a long bath

TUPA transitions: {Shift, Reduce, Node_X, Left-Edge_X, Right-Edge_X, Left-Remote_X, Right-Remote_X, Swap, Finish}

Support non-terminal nodes, reentrancy and discontinuity.
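The slides name the transitions without spelling out their effect, so the following is a minimal sketch under assumed semantics, chosen to agree with the stack and buffer contents shown in the first steps of the example. The class and method names are hypothetical, and tokens are stored as plain strings, which would not distinguish repeated words.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

ROOT = "ROOT"


@dataclass
class State:
    buffer: List[str]
    stack: List[str] = field(default_factory=lambda: [ROOT])
    # Each edge is (parent, child, label, remote).
    edges: List[Tuple[str, str, str, bool]] = field(default_factory=list)
    finished: bool = False
    _new_nodes: int = 0

    def shift(self) -> None:
        # Move the first buffer element onto the stack.
        self.stack.append(self.buffer.pop(0))

    def reduce(self) -> None:
        # Discard the stack top once it needs no further edges.
        self.stack.pop()

    def node(self, label: str) -> None:
        # Assumed: create a new non-terminal as parent of the stack top
        # and place it at the front of the buffer.
        self._new_nodes += 1
        unit = f"n{self._new_nodes}"
        self.edges.append((unit, self.stack[-1], label, False))
        self.buffer.insert(0, unit)

    def right_edge(self, label: str, remote: bool = False) -> None:
        # Assumed: the second stack element becomes parent of the stack top.
        self.edges.append((self.stack[-2], self.stack[-1], label, remote))

    def left_edge(self, label: str, remote: bool = False) -> None:
        # Assumed: the stack top becomes parent of the second stack element.
        self.edges.append((self.stack[-1], self.stack[-2], label, remote))

    def swap(self) -> None:
        # Move the second stack element back to the buffer front,
        # which is what allows discontinuous units to be built.
        self.buffer.insert(0, self.stack.pop(-2))

    def finish(self) -> None:
        self.finished = True


state = State(buffer="You want to take a long bath".split())
state.shift()            # stack: ROOT You
state.right_edge("A")    # ROOT -> You (A)
state.shift()            # stack: ROOT You want
state.swap()             # "You" returns to the buffer
state.right_edge("P")    # ROOT -> want (P)
state.reduce()           # stack: ROOT

print(state.stack, state.buffer)
print(state.edges)
```

Remote edges reuse the same edge methods with `remote=True`, so a node can receive a second parent (reentrancy) without changing the stack or buffer.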


Example

Each step shows the transition applied and the stack and buffer shown with it. Only tokens are listed; nodes without text (the root and non-terminal units) are not shown.

[Figure, repeated on every step: the UCCA graph for "You want to take a long bath" under construction]

Step  Transition       Stack       Buffer
 1    Shift            You         want to take a long bath
 2    Right-Edge_A     You         want to take a long bath
 3    Shift            You want    to take a long bath
 4    Swap             want        You to take a long bath
 5    Right-Edge_P     want        You to take a long bath
 6    Reduce                       to take a long bath
 7    Shift            You         to take a long bath
 8    Shift            You to      take a long bath
 9    Node_F           You to      take a long bath
10    Reduce           You         take a long bath
11    Shift            You         take a long bath
12    Shift            You take    a long bath
13    Node_C           You take    a long bath
14    Reduce           You         a long bath
15    Shift            You         a long bath
16    Right-Edge_P     You         a long bath
17    Shift            You a       long bath
18    Right-Edge_F     You a       long bath
19    Reduce           You         long bath
20    Shift            You long    bath
21    Swap             You long    bath
22    Right-Edge_D     You long    bath
23    Reduce           You         bath
24    Swap                         You bath
25    Right-Edge_A                 You bath
26    Reduce                       You bath
27    Reduce                       You bath
28    Shift            You         bath
29    Shift            You         bath
30    Left-Remote_A    You         bath
31    Shift            You bath
32    Right-Edge_C     You bath
33    Finish           You bath


Training

An oracle provides the transition sequence given the correct graph:

[Figure: UCCA graph for "You want to take a long bath"]

Shift, Right-Edge_A, Shift, Swap, Right-Edge_P, Reduce, Shift, Shift, Node_F, Reduce, Shift, Shift, Node_C, Reduce, Shift, Right-Edge_P, Shift, Right-Edge_F, Reduce, Shift, Swap, Right-Edge_D, Reduce, Swap, Right-Edge_A, Reduce, Reduce, Shift, Shift, Left-Remote_A, Shift, Right-Edge_C, Finish


TUPA Model

Learn to greedily predict transition based on current state.
Experimenting with three classifiers:

Sparse   Perceptron with sparse features (Zhang and Nivre, 2011).
MLP      Embeddings + feedforward NN (Chen and Manning, 2014).
BiLSTM   Embeddings + deep bidirectional LSTM + MLP (Kiperwasser and Goldberg, 2016).

Effective "lookahead" encoded in the representation.

Features: words, POS, syntactic dependencies, existing edge labels from the stack and buffer + parents, children, grandchildren; ordinal features (height, number of parents and children).

[Figure: BiLSTM classifier. Each token of "You want to take a long bath" passes through LSTM cells; with stack "You take" and buffer "a long bath", an MLP on top predicts the transition Node_C.]
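Greedy prediction can be sketched as picking the best-scoring transition that is allowed in the current state. The scores and the validity rule below are made-up illustrations, not output of any TUPA classifier, and `greedy_choice` is a hypothetical helper.

```python
from typing import Callable, Dict


def greedy_choice(scores: Dict[str, float], is_valid: Callable[[str], bool]) -> str:
    """Pick the highest-scoring transition among those valid in the current state."""
    valid = [name for name in scores if is_valid(name)]
    if not valid:
        raise ValueError("no valid transition in this state")
    return max(valid, key=lambda name: scores[name])


# Hypothetical classifier scores for the state with stack "You take"
# and buffer "a long bath".
scores = {"Shift": 0.3, "Reduce": 0.1, "Node_C": 1.2, "Finish": 2.0}
buffer_is_empty = False

# Assumed constraint for illustration: Finish is only allowed once the buffer is empty.
chosen = greedy_choice(scores, lambda name: name != "Finish" or buffer_is_empty)
print(chosen)  # Node_C
```

Masking invalid transitions before taking the maximum keeps the parser from entering states it cannot recover from, since a greedy parser never revisits a decision.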


Experiments


Experimental Setup

• UCCA Wikipedia corpus (4268 train + 454 dev + 503 test sentences).
• Out-of-domain: English part of English-French parallel corpus, Twenty Thousand Leagues Under the Sea (506 sentences).


Baselines

No existing UCCA parsers ⇒ conversion-based approximation.

Bilexical DAG parsers (allow reentrancy):
• DAGParser (Ribeyre et al., 2014): transition-based.
• TurboParser (Almeida and Martins, 2015): graph-based.

Tree parsers (all transition-based):
• MaltParser (Nivre et al., 2007): bilexical tree parser.
• Stack LSTM Parser (Dyer et al., 2015): bilexical tree parser.
• uparse (Maier, 2015): allows non-terminals, discontinuity.

[Figure: bilexical graph over "You want to take a long bath" with edge labels A, A, A, F, F, D, C]

UCCA bilexical DAG approximation (for tree, delete remote edges).


Bilexical Graph Approximation

1. Convert UCCA to bilexical dependencies.
2. Train bilexical parsers and apply to test sentences.
3. Reconstruct UCCA graphs and compare with gold standard.

[Figure: UCCA graph for "After graduation, Joe moved to Paris" and its bilexical approximation, whose edges carry the labels L, U, A, A, H, R, A]


Evaluation

Comparing graphs over the same sequence of tokens,

• Match edges by their terminal yield and label.
• Calculate labeled precision, recall and F1 scores.
• Separate primary and remote edges.

[Figure: gold and predicted UCCA graphs for "After graduation, Joe moved to Paris". Gold edge labels: L, P, H, U, A, P, R, C, A, H, A. Predicted edge labels: L, S, H, U, A, P, F, A, H, A, A.]

           LP           LR            LF
Primary    6/9 = 67%    6/10 = 60%    64%
Remote     1/2 = 50%    1/1 = 100%    67%
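The matching rule can be sketched as set intersection over (terminal yield, label) pairs. This is an illustrative sketch rather than the official UCCA evaluation script; the function name and the toy edge sets are hypothetical and only reproduce the counts from the example. Note that with the primary counts above (6 of 9 predicted, 6 of 10 gold) the harmonic mean comes to 12/19, about 63%, slightly below the rounded figure shown on the slide.

```python
from typing import Iterable, Tuple

Edge = Tuple[Tuple[int, ...], str]  # (terminal yield as token positions, edge label)


def labeled_prf(gold: Iterable[Edge], predicted: Iterable[Edge]) -> Tuple[float, float, float]:
    """Labeled precision, recall and F1 over edges matched by yield and label."""
    gold_set, predicted_set = set(gold), set(predicted)
    matched = len(gold_set & predicted_set)
    precision = matched / len(predicted_set) if predicted_set else 0.0
    recall = matched / len(gold_set) if gold_set else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Toy remote edges: one gold edge, two predicted, one of which matches.
remote_gold = {((3,), "A")}
remote_pred = {((3,), "A"), ((1,), "A")}
remote_scores = labeled_prf(remote_gold, remote_pred)

# Toy primary edges with the same counts as the example:
# 10 gold edges, 9 predicted edges, 6 of them matching.
primary_gold = {((i,), "X") for i in range(10)}
primary_pred = {((i,), "X") for i in range(6)} | {((i,), "Y") for i in range(3)}
primary_scores = labeled_prf(primary_gold, primary_pred)

print([round(value, 3) for value in remote_scores])
print([round(value, 3) for value in primary_scores])
```

Primary and remote edges are scored by calling the function separately on each edge set, which keeps the rarer remote edges from being swamped by the primary ones.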


Results

TUPA_BiLSTM obtains the highest F-scores in all metrics:

                   Primary edges           Remote edges
                   LP     LR     LF        LP     LR     LF
TUPA_Sparse        64.5   63.7   64.1      19.8   13.4   16
TUPA_MLP           65.2   64.6   64.9      23.7   13.2   16.9
TUPA_BiLSTM        74.4   72.7   73.5      47.4   51.6   49.4
Bilexical DAG                    (91)                    (58.3)
DAGParser          61.8   55.8   58.6      9.5    0.5    1
TurboParser        57.7   46     51.2      77.8   1.8    3.7
Bilexical tree                   (91)                    –
MaltParser         62.8   57.7   60.2      –      –      –
Stack LSTM         73.2   66.9   69.9      –      –      –
Tree                             (100)                   –
uparse             60.9   61.2   61.1      –      –      –

Results on the Wiki test set.


Results

Comparable on out-of-domain test set:

                   Primary edges           Remote edges
                   LP     LR     LF        LP     LR     LF
TUPA_Sparse        59.6   59.9   59.8      22.2   7.7    11.5
TUPA_MLP           62.3   62.6   62.5      20.9   6.3    9.7
TUPA_BiLSTM        68.7   68.5   68.6      38.6   18.8   25.3
Bilexical DAG                    (91.3)                  (43.4)
DAGParser          56.4   50.6   53.4      –      0      0
TurboParser        50.3   37.7   43.1      100    0.4    0.8
Bilexical tree                   (91.3)                  –
MaltParser         57.8   53     55.3      –      –      –
Stack LSTM         66.1   61.1   63.5      –      –      –
Tree                             (100)                   –
uparse             52.7   52.8   52.8      –      –      –

Results on the 20K Leagues out-of-domain set.


Discussion


Fine-Grained Analysis

Evaluation of TUPA_BiLSTM per edge type:


Online Demo

http://bit.ly/tupademo


Error Analysis

Copular clauses tend to be parsed as identity.

But, from the guidelines (http://www.cs.huji.ac.il/~oabend/ucca/guidelines.pdf):

John_A [is_F [[six_E years_C]_E old_C]_C]_S


Error Analysis

The participant category is used when adverbial should be.


Future Work


Broad-Coverage UCCA Parsing

Already annotated in UCCA, but not yet handled by TUPA:

• Linkage: inter-scene relations (see example).
• Implicit units: units not mentioned at all in the text.
• Inter-sentence relations: discourse structure.

[Figure: UCCA graph for "After graduation, Joe moved to Paris" with an added linkage node whose edges are labeled LR (link relation), LA (link argument) and LA]

UCCA graph with a Linkage relation.


AMR Parsing

Similar in structure and content, but poses several challenges:

• Node labels: not just edges, but also nodes are labeled.
• Partial alignment: orphan tokens, implicit concepts.

[Figure: AMR graph with nodes move-01, after, graduate-01, person, city, name, "John", "Paris" and edge labels time, op1, name, ARG0, ARG2]

AMR graph.

[Figure: the same AMR graph with the tokens "After graduation , John moved to Paris" attached through Terminal edges; node labels are given as templates (e.g. 〈v〉-01, "〈T〉") and edge labels include op, time, name, ARG0, ARG2]

AMR graph in UCCA++ format.


Semantic Dependency Parsing

Similar structure, but without non-terminal nodes.
By applying bilexical conversion in reverse, TUPA can be used.

[Figure: dependency graph over "After graduation , John moved to Paris" with edge labels top, ARG1, ARG2]

SDP graph (in the DM formalism).

[Figure: the same graph over "After graduation , John moved to Paris" with added non-terminal nodes and edge labels root, top, head, ARG1, ARG2]

SDP graph in UCCA++ format.


Conclusion

• UCCA’s semantic distinctions require a graph structure including non-terminals, reentrancy and discontinuity.

• TUPA is an accurate transition-based UCCA parser, and the first to support UCCA and any DAG over the text tokens.

• Outperforms strong conversion-based baselines.

Future Work:
• More languages (German corpus construction is underway).
• Broad coverage UCCA parsing.
• Parsing other schemes, such as AMR and SDP.
• Text simplification, MT evaluation and other applications.

Code: github.com/danielhers/tupa
Demo: bit.ly/tupademo
Corpora: cs.huji.ac.il/~oabend/ucca.html

Thank you!


References IAbend, O. and Rappoport, A. (2013).

Universal Conceptual Cognitive Annotation (UCCA).In Proc. of ACL, pages 228–238.

Abend, O. and Rappoport, A. (2017).The state of the art in semantic representation.In Proc. of ACL.

Abend, O., Yerushalmi, S., and Rappoport, A. (2017).Uccaapp: Web-application for syntactic and semantic phrase-based annotation.Proceedings of ACL 2017, System Demonstrations, pages 109–114.

Almeida, M. S. C. and Martins, A. F. T. (2015).Lisbon: Evaluating TurboSemanticParser on multiple languages and out-of-domain data.In Proc. of SemEval, pages 970–973.

Banarescu, L., Bonial, C., Cai, S., Georgescu, M., Griffitt, K., Hermjakob, U., Knight, K., Palmer, M., andSchneider, N. (2013).Abstract Meaning Representation for sembanking.In Proc. of the Linguistic Annotation Workshop.

Birch, A., Abend, O., Bojar, O., and Haddow, B. (2016).HUME: Human UCCA-based evaluation of machine translation.In Proc. of EMNLP, pages 1264–1274.

Chen, D. and Manning, C. (2014).A fast and accurate dependency parser using neural networks.In Proc. of EMNLP, pages 740–750.

Page 79: A Transition-Based Directed Acyclic Graph Parser for Universal … · 2020. 9. 11. · 2 TUPA — Transition-based UCCA Parser The first parser to support the combination of three

79

References IIDyer, C., Ballesteros, M., Ling, W., Matthews, A., and Smith, N. A. (2015).

Transition-based dependeny parsing with stack long short-term memory.In Proc. of ACL, pages 334–343.

Kiperwasser, E. and Goldberg, Y. (2016).Simple and accurate dependency parsing using bidirectional LSTM feature representations.TACL, 4:313–327.

Maier, W. (2015).Discontinuous incremental shift-reduce parsing.In Proc. of ACL, pages 1202–1212.

Nivre, J. (2004).Incrementality in deterministic dependency parsing.In Keller, F., Clark, S., Crocker, M., and Steedman, M., editors, Proceedings of the ACL WorkshopIncremental Parsing: Bringing Engineering and Cognition Together, pages 50–57, Barcelona, Spain.Association for Computational Linguistics.

Nivre, J. (2009).Non-projective dependency parsing in expected linear time.In Proc. of ACL, pages 351–359.

Nivre, J., Hall, J., Nilsson, J., Chanev, A., Eryigit, G., Kubler, S., Marinov, S., and Marsi, E. (2007).MaltParser: A language-independent system for data-driven dependency parsing.Natural Language Engineering, 13(02):95–135.

Oepen, S., Kuhlmann, M., Miyao, Y., Zeman, D., Cinkova, S., Flickinger, D., Hajic, J., Ivanova, A., and Uresova,Z. (2016).Towards comparability of linguistic graph banks for semantic parsing.In LREC.

Page 80: A Transition-Based Directed Acyclic Graph Parser for Universal … · 2020. 9. 11. · 2 TUPA — Transition-based UCCA Parser The first parser to support the combination of three

80

References IIIRibeyre, C., Villemonte de la Clergerie, E., and Seddah, D. (2014).

Alpage: Transition-based semantic graph parsing with syntactic features.In Proc. of SemEval, pages 97–103.

Sulem, E., Abend, O., and Rappoport, A. (2015).Conceptual annotations preserve structure across translations: A French-English case study.In Proc. of S2MT, pages 11–22.

Zhang, Y. and Nivre, J. (2011).Transition-based dependency parsing with rich non-local features.In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: HumanLanguage Technologies, pages 188–193.

Page 81: A Transition-Based Directed Acyclic Graph Parser for Universal … · 2020. 9. 11. · 2 TUPA — Transition-based UCCA Parser The first parser to support the combination of three

81

Backup


UCCA Corpora

                    Wiki                           20K
                    Train      Dev       Test      Leagues
# passages          300        34        33        154
# sentences         4268       454       503       506
# nodes             298,993    33,704    35,718    29,315
% terminal          42.96      43.54     42.87     42.09
% non-term.         58.33      57.60     58.35     60.01
% discont.          0.54       0.53      0.44      0.81
% reentrant         2.38       1.88      2.15      2.03
# edges             287,914    32,460    34,336    27,749
% primary           98.25      98.75     98.74     97.73
% remote            1.75       1.25      1.26      2.27
Average per non-terminal node
# children          1.67       1.68      1.66      1.61

Corpus statistics.


Evaluation

Mutual edges between predicted graph $G_p = (V_p, E_p, \ell_p)$ and gold graph $G_g = (V_g, E_g, \ell_g)$, both over terminals $W = \{w_1, \ldots, w_n\}$:

$$M(G_p, G_g) = \left\{ (e_1, e_2) \in E_p \times E_g \;\middle|\; y(e_1) = y(e_2) \wedge \ell_p(e_1) = \ell_g(e_2) \right\}$$

The yield $y(e) \subseteq W$ of an edge $e = (u, v)$ in either graph is the set of terminals in $W$ that are descendants of $v$. $\ell$ is the edge label.

Labeled precision, recall and F-score are then defined as:

$$LP = \frac{|M(G_p, G_g)|}{|E_p|}, \qquad LR = \frac{|M(G_p, G_g)|}{|E_g|}, \qquad LF = \frac{2 \cdot LP \cdot LR}{LP + LR}.$$

Two variants: one for primary edges, and another for remote edges.
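The definitions above translate fairly directly into code. The following is a minimal sketch, not the evaluation script shipped with TUPA: graphs are assumed to be plain lists of `(parent, child, label)` triples, the function names are hypothetical, and a terminal is assumed to belong to the yield of the edge that points at it. To score primary and remote edges separately, one would call it once per edge subset.

```python
def terminal_yield(node, children, terminals, cache):
    """Set of terminals reachable from `node` (including `node` if it is a terminal)."""
    if node not in cache:
        found = {node} if node in terminals else set()
        for child in children.get(node, ()):
            found |= terminal_yield(child, children, terminals, cache)
        cache[node] = frozenset(found)
    return cache[node]

def labeled_yields(edges, terminals):
    """Map every edge (u, v, label) to the pair (y(e), label)."""
    children = {}
    for parent, child, _ in edges:
        children.setdefault(parent, []).append(child)
    cache = {}
    return [(terminal_yield(child, children, terminals, cache), label)
            for _, child, label in edges]

def labeled_scores(pred_edges, gold_edges, terminals):
    """Return (LP, LR, LF), following the definition of M literally."""
    pred = labeled_yields(pred_edges, terminals)
    gold = labeled_yields(gold_edges, terminals)
    # |M|: pairs of edges with equal yield and equal label.
    mutual = sum(1 for p in pred for g in gold if p == g)
    lp = mutual / len(pred) if pred else 0.0
    lr = mutual / len(gold) if gold else 0.0
    lf = 2 * lp * lr / (lp + lr) if lp + lr else 0.0
    return lp, lr, lf

# Hypothetical toy graphs over three terminals.
terminals = {"w1", "w2", "w3"}
gold = [("root", "a", "A"), ("root", "b", "P"),
        ("a", "w1", "T"), ("b", "w2", "T"), ("b", "w3", "T")]
pred = [("root", "x", "A"), ("root", "y", "P"),
        ("x", "w1", "T"), ("x", "w2", "T"), ("y", "w3", "T")]
print(labeled_scores(pred, gold, terminals))
```

In the toy example only the three terminal edges agree in both yield and label, so three of the five edges on each side are matched. Because M is a set of edge pairs, several edges sharing one yield and label in the same graph would each be counted; the sketch keeps that behaviour rather than deduplicating.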