
Cross-Domain Generalization of Neural Constituency Parsers
Daniel Fried*, Nikita Kitaev*, and Dan Klein


  • Cross-Domain Generalization of Neural Constituency Parsers

    Daniel Fried*, Nikita Kitaev*, and Dan Klein

  • Cross-Domain Transfer

    Penn Treebank: But Coleco bounced back with the introduction of the Cabbage Patch dolls, whose sales hit $600 million in 1985.

    Genia: Several of the heterogeneous clinical manifestations of systemic lupus erythematosus have been associated with specific autoantibodies.

    English Web Treebank: Where can I get morcillas in tampa bay, I will like the Argentinian type, but I will to try another please?

  • Penn Treebank Parsing by the Numbers

    [Chart: Penn Treebank F1 (roughly 89 to 96) by year, 2000 to 2020, for non-neural and neural parsers. Labeled systems: Charniak, Collins; Charniak & Johnson; Petrov+; Sagae & Lavie; McClosky+; Carreras+; Zhang+; Shindo+; Zhu+; Henderson; Socher+; Durrett+; Dyer+; Choe & Charniak; Stern+; Liu & Zhang; Fried+; Kitaev & Klein; Joshi+; Kitaev+.]

  • Penn Treebank Parsing by the Numbers

    [Same chart as above, with an added annotation: "No Structure".]

  • Penn Treebank Parsing by the Numbers

    [Same chart as above, with added annotations: "Transformers", "ELMo", "BERT", and "No Structure".]

  • Methodology

    Non-neural:
    • Berkeley [Petrov and Klein 2007]
    • BLLIP [Charniak and Johnson 2005]
    • ZPar (Chinese) [Zhang and Clark 2011]

    Neural:
    • Self-Attentive Chart [Stern et al. 2017; Kitaev and Klein 2018]
    • In-Order Recurrent Neural Network Grammars (RNNG) [Dyer et al. 2016; Kuncoro et al. 2017; Liu and Zhang 2017]

  • Methodology

    Zero-shot generalization setup: train on newswire, test out-of-domain.

    English: train on newswire (PTB WSJ); test out-of-domain on literature (Brown), biomedical text (Genia), and web newsgroups, reviews, and questions (EWT).

    Chinese: train on newswire (CTB v5); test out-of-domain on TV news (CTB v8), web forums (CTB v8), and blogs (CTB v8).
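    All parsing numbers that follow are labeled bracket F1 on the test corpora. As a rough illustration of the metric, here is a minimal sketch in Python; it is not the official evalb scorer, which applies further conventions (for example around punctuation), and the tree encoding is only for this example.

        # Minimal sketch of labeled bracket F1 (not the official evalb scorer).
        from collections import Counter

        def labeled_spans(tree, start=0):
            """Return (label, start, end) for every constituent in a tree.

            A tree is (label, children), where each child is a word (str) or a tree.
            """
            label, children = tree
            spans, pos = [], start
            for child in children:
                if isinstance(child, str):      # leaf word
                    pos += 1
                else:                           # nested constituent
                    child_spans = labeled_spans(child, pos)
                    spans.extend(child_spans)
                    pos = child_spans[0][2]     # end of that constituent
            spans.insert(0, (label, start, pos))
            return spans

        def bracket_f1(gold, pred):
            g, p = Counter(labeled_spans(gold)), Counter(labeled_spans(pred))
            matched = sum((g & p).values())
            precision, recall = matched / sum(p.values()), matched / sum(g.values())
            return 2 * precision * recall / (precision + recall)

        # Gold and (imperfect) predicted parses of "The dog barks ."
        gold = ("S", [("NP", ["The", "dog"]), ("VP", ["barks"]), "."])
        pred = ("S", [("NP", ["The", "dog", "barks"]), "."])
        print(bracket_f1(gold, pred))  # 0.4: precision 1/2, recall 1/3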

  • Fact or Myth?

    Neural parsers transfer less reliably than non-neural?

    Pre-trained representations are most useful out-of-domain?

    Structure helps in generalization?


  • Neural vs. Non-Neural Generalization

    F1 on English corpora (in-domain: Penn Treebank; out-of-domain: Brown Corpus, Genia, English Web):

                                   Penn Treebank   Brown Corpus   Genia   English Web
    Berkeley (non-neural)               90.1            84.6       79.1      77.4
    BLLIP (non-neural)                  91.5            85.9       79.6      79.9
    In-Order RNNG (neural)              91.5            85.6       80.3      79.1
    Chart (neural)                      93.3            88.0       82.7      82.2

  • Neural vs. Non-Neural Generalization

    F1 on Chinese corpora (in-domain: CTB v5, mostly newswire; out-of-domain: Broadcast News, Web Forums, Blogs):

                               CTB v5 (newswire)   Broadcast News   Web Forums   Blogs
    ZPar (non-neural)                 83.0               77.2           74.3      73.9
    In-Order RNNG (neural)            83.7               77.8           75.7      74.7

  • Fact or Myth?

    Neural parsers transfer less reliably than non-neural?

    Pre-trained representations are most useful out-of-domain?

    Structure helps in generalization?

  • Fact or Myth?

    Neural and non-neural parsers transfer similarly.

    Pre-trained representations are most useful out-of-domain?

    Structure helps in generalization?


  • Effects of Pre-Trained Representations

    F1 on English corpora (in-domain: Penn Treebank; out-of-domain: Brown Corpus, Genia, English Web):

                              Penn Treebank   Brown Corpus   Genia   English Web
    In-Order RNNG                  91.5            85.6       80.3      79.1
    + Word Embeddings              92.1            86.8       81.6      80.5
    + BERT                         95.7            93.5       87.8      89.3

    Relative error reductions (shown on the slide): roughly -7% to -8% from word embeddings and -38% to -55% from BERT, both in-domain and out-of-domain.
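    The relative error reductions quoted here and on later slides follow the usual definition: the drop in F1 error (100 minus F1) as a fraction of the original error. A quick worked example with the Penn Treebank numbers above:

        def relative_error_reduction(f1_before, f1_after):
            """Fractional reduction in F1 error (100 - F1)."""
            return (f1_after - f1_before) / (100.0 - f1_before)

        # In-Order RNNG on the Penn Treebank: 91.5 F1 without BERT, 95.7 with BERT.
        # The error drops from 8.5 to 4.3, i.e. roughly half of it is eliminated.
        print(round(100 * relative_error_reduction(91.5, 95.7)))  # about 49% reduction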

  • Fact or Myth?

    Neural and non-neural parsers transfer similarly.

    Pre-trained representations are most useful out-of-domain?

    Structure helps in generalization?

  • Fact or Myth?

    Neural and non-neural parsers transfer similarly.

    Pre-training helps across domains.

    Structure helps in generalization?


  • Structured Decoding?

    We wanted more structure: does conditioning on predicted structure help? Both parsers build on BERT.

    Unstructured: the Self-Attentive Chart Parser [Stern et al. 2017; Kitaev and Klein 2018] conditions on the sentence only.

    Structured: the In-Order RNNG [Dyer et al. 2016; Liu and Zhang 2017] also conditions on the predicted structure, i.e. the partial tree (S, NP, VP, ...) built so far.
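    As a conceptual sketch of the contrast (dummy stand-in scorers and hypothetical function names, not the released implementations): the unstructured chart parser scores every span independently and assembles the best tree with CKY-style dynamic programming, while the structured In-Order RNNG chooses one action at a time, each conditioned on the partial tree predicted so far.

        # Conceptual sketch only; the real models replace the dummy scorers with BERT-based ones.

        def chart_decode(words, span_score):
            """Unstructured: each span (i, j) is labeled and scored independently;
            CKY-style dynamic programming then assembles the best binary tree."""
            n, best = len(words), {}
            for length in range(1, n + 1):
                for i in range(n - length + 1):
                    j = i + length
                    label, score = span_score(words, i, j)
                    if length == 1:
                        best[i, j] = (score, (label, [words[i]]))
                    else:
                        k = max(range(i + 1, j),
                                key=lambda s: best[i, s][0] + best[s, j][0])
                        total = score + best[i, k][0] + best[k, j][0]
                        best[i, j] = (total, (label, [best[i, k][1], best[k, j][1]]))
            return best[0, n][1]

        def in_order_decode(words, next_action):
            """Structured: emit SHIFT / NT(X) / REDUCE / FINISH actions one at a time;
            every prediction conditions on the partial tree (the action history)."""
            history = []
            while not history or history[-1] != "FINISH":
                history.append(next_action(words, history))
            return history

        # Tiny demo with trivial stand-ins for the learned scorers.
        def dummy_span_score(words, i, j):
            return ("NP" if j - i == 1 else "S"), 1.0

        def dummy_next_action(words, history):
            # Fixed in-order derivation of (S (NP The dog) (VP barks)), for illustration.
            script = ["SHIFT", "NT(NP)", "SHIFT", "REDUCE", "NT(S)",
                      "SHIFT", "NT(VP)", "REDUCE", "REDUCE", "FINISH"]
            return script[len(history)]

        print(chart_decode(["The", "dog", "barks"], dummy_span_score))
        print(in_order_decode(["The", "dog", "barks"], dummy_next_action))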

  • Structure Helps More Out-of-Domain

    Labeled bracket F1 (English):

                                     Penn Treebank   Brown Corpus   Genia   English Web
    Unstructured (Chart + BERT)           95.6            93.1       87.5      88.7
    Structured (In-Order + BERT)          95.7            93.5       87.8      89.3

    Relative error reductions from adding structure: roughly -2% in-domain and -2% to -6% out-of-domain.


  • Structure Helps with Larger Spans

    F1 by minimum span length, on English Web

    [Chart: labeled span F1 as a function of minimum span length, comparing the Structured and Unstructured parsers.]
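    One plausible reading of this measurement: restrict the bracket F1 computation to constituents spanning at least a given number of words. A minimal sketch, reusing labeled_spans and Counter from the F1 sketch above:

        def bracket_f1_min_length(gold, pred, min_len):
            """Labeled bracket F1 over constituents spanning at least min_len words."""
            keep = lambda tree: Counter(
                (l, i, j) for (l, i, j) in labeled_spans(tree) if j - i >= min_len)
            g, p = keep(gold), keep(pred)
            if not g or not p:
                return 0.0
            matched = sum((g & p).values())
            precision, recall = matched / sum(p.values()), matched / sum(g.values())
            return 2 * precision * recall / (precision + recall) if matched else 0.0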

  • Structure Improves Exact Match

    Exact parse match accuracies (English):

                                     Penn Treebank   Brown Corpus   Genia   English Web
    Unstructured (Chart + BERT)           55.1            49.2       17.5      41.8
    Structured (In-Order + BERT)          57.1            52.0       18.0      44.0

    Relative error reductions from adding structure: roughly -1% to -6%.

  • Fact or Myth?

    Neural and non-neural parsers transfer similarly.

    Pre-training helps across domains.

    Structure helps in generalization?

  • Fact or Myth?

    Neural and non-neural parsers transfer similarly.

    Pre-training helps across domains.

    Structure helps in domain transfer, longer spans, and whole parses.

  • Conclusions

    Neural and non-neural parsers transfer similarly.

    Pre-training helps across domains.

    Structure helps in domain transfer, longer spans, and whole parses.

  • Thank you!

    Code and models:

    Chart + BERT: parser.kitaev.io
    In-Order RNNG + BERT: github.com/dpfried/rnng-bert
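    A minimal usage sketch for the released chart parser (assumptions: the benepar package is installed via pip and an English model named "benepar_en3" is available; check parser.kitaev.io for current model names and install instructions, and note that parsing raw strings may require NLTK tokenizer data):

        import benepar

        benepar.download("benepar_en3")              # one-time model download
        parser = benepar.Parser("benepar_en3")
        tree = parser.parse("The parser generalizes across domains.")
        print(tree)                                  # an nltk.Tree constituency parse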

  • High Accuracy Parsers Benefit NLP Systems

    Syntactic parses can improve system performance, even for neural models [Roth and Lapata 2016; Andreas et al. 2016; Aharoni and Goldberg 2017; Strubell et al. 2018; Swayamdipta et al. 2018; Hale et al. 2018; Kuncoro et al. 2018; Kim et al. 2019; He et al. 2019].

    [Diagram: a pipeline from Input Text through Parsed Text to the system Output.]

  • Past Evaluations of Generalization

    [Chart: the Penn Treebank F1-by-year plot from earlier, with callouts marking past out-of-domain evaluations: Gildea, 2001; McClosky+, 2006; Petrov & Klein, 2007; Joshi+, 2018.]

  • Structure Helps with Larger Spans

    F1 by minimum span length, on Genia corpus

  • Effects of Pre-Trained Representations

    F1 on Chinese corpora (in-domain: CTB v5, newswire; out-of-domain: Broadcast News, Web Forums, Blogs):

                              CTB v5 (newswire)   Broadcast News   Web Forums   Blogs
    In-Order RNNG                    83.7               77.8           75.7      74.7
    + Word Embeddings                85.7               81.6           79.4      78.2
    + BERT                           91.8               88.4           87.0      84.3

    Relative error reductions (shown on the slide): roughly -12% to -17% from word embeddings and -38% to -50% from BERT, both in-domain and out-of-domain.