Modeling garden paths in a statistical dependency parser...

1
Modeling garden paths in a statistical dependency parser: Chinese, German, English Marisa Ferrara Boston, Zhong Chen, and John T. Hale Department of Linguistics and Languages, Michigan State University I NTRODUCTION This study differentiates between probability models that lead to garden-pathing and those that fail to do so in an incremental dependency parser. We use Dependency Grammar (Tesni` ere 1959) to describe sentence structure in terms of word-to-word connections called dependencies. We apply two sets of statistical features, state-based and non state-based, and examine each one’s usefulness for targeting garden-path analyses that ensnare human readers in three languages. Chinese, German, and English GARDEN PATHS ... These sentences are ambiguous between two interpretations: a human-preferred locally-correct analysis, indicated by the dashed arc, and a globally-correct interpretation, indicated by a solid blue arc. Main Verb-Reduced Relative Subject-object Prepositional Phrase Attachment Conjunction Propositional Relative The data were aggregated from a variety of Chinese (Hsiao and Gibson 2003), English (Bever 1970), and German (Sailer 2004; Agricola 1968) psycholinguistic and linguistic studies. ...are run through a PARSER ... Our parser is built to the specifications of Nivre 2004 with an added k-best search to implement garden-pathing (Frazier 1979). States encapsulate incremental analyses, and four possible actions can be taken to transition between states. ...that is informed by state-based and non state-based FEATURES ... The features, or statistical models, are trained on converted sentences from language-specific corpora. State-based features Non state-based features Stack3 Top 3 stack elements and in- put word. Distance Surface distance between top stack element and input word. Stack1 Top stack element and input word. Next Next input word. Top Top stack element. Position Position of input word. State-based features take into account internal parser states, while non state-based features take into account string information. ...state-based features RESULT in human-like performance Features that counsel for the human-preferred action are marked with a , while those that counsel for the globally-correct action are marked with a . State-based features Non state-based features Sentence Type Stack3 Stack1 Top Distance Next Position Chinese Main Verb-Reduced Relative Subject-Object Prepositional Phrase Attachment German Propositional Relative Clause Conjunction Prepositional Phrase Attachment English Main Verb-Reduced Relative Subject-Object Prepositional Phrase Attachment Total 6 9 6 1 3 1 This leads to a feature hierarchy that defines a distributional basis for human parsing preferences. Stack1 Stack3 Top Next Distance Position C ONCLUSION The results reveal that garden-pathing models are best implemented by parsers that attend more to parser-state information than non state-based information. R EFERENCES Agricola, E. 1968. Syntaktische Mehrdeutigkeit (Polysyntaktizit¨ at) bei der Analyse des Deutschen und des Englischen. Schriften zur Phonetik, Sprachwissenschaft und Kommunikationsforschung. Akademie Verlag, Berlin. Bever, T.G. 1970. The cognitive basis for linguistic structures. In R. Hayes (ed.), Cognition and Language Development. New York: Wiley and Sons. 277-360. Frazier, L. 1979. On Comprehending Sentences: Syntactic parsing strategies. Ph.D. Dissertation, University of Connecticut. Hsiao, F. and Gibson E. 2003. Processing relative clauses in Chinese. Cognition 90: 3-27. Nivre, J. 2004. Incrementality in Deterministic Dependency Parsing. Proceedings of the Workshop on Incremental Parsing (ACL): 50-57. Sailer, M. 2004. Propositional relative clauses in German. In Stefan M¨ uller (ed.), Proceedings of the 11th International Conference on HPSG. Stanford:CSLI. 223-243. Tesni` ere, L. 1959. ´ Elements de syntaxe structurale. Editions Klincksiek. {mferrara,chenzho3,jthale}@msu.edu http://www.msu.edu/˜mferrara/

Transcript of Modeling garden paths in a statistical dependency parser...

Page 1: Modeling garden paths in a statistical dependency parser ...conf.ling.cornell.edu/Marisa/BostonChenHale2008.pdf · Modeling garden paths in a statistical dependency parser: Chinese,

Modeling garden paths in a statistical dependency parser: Chinese, German, EnglishMarisa Ferrara Boston, Zhong Chen, and John T. Hale

Department of Linguistics and Languages, Michigan State University

INTRODUCTION

This study differentiates between probability models that lead togarden-pathing and those that fail to do so in an incrementaldependency parser. We use Dependency Grammar (Tesniere 1959)to describe sentence structure in terms of word-to-wordconnections called dependencies.

We apply two sets of statistical features, state-based and nonstate-based, and examine each one’s usefulness for targetinggarden-path analyses that ensnare human readers in threelanguages.

Chinese, German, and English GARDEN PATHS...

These sentences are ambiguous between two interpretations: ahuman-preferred locally-correct analysis, indicated by the dashedarc, and a globally-correct interpretation, indicated by a solid bluearc.

Main Verb-Reduced Relative

Subject-object

Prepositional Phrase Attachment

Conjunction

Propositional Relative

The data were aggregated from a variety of Chinese (Hsiao andGibson 2003), English (Bever 1970), and German (Sailer 2004; Agricola1968) psycholinguistic and linguistic studies.

...are run through a PARSER...

Our parser is built to the specifications of Nivre 2004 with anadded k-best search to implement garden-pathing (Frazier 1979).States encapsulate incremental analyses, and four possibleactions can be taken to transition between states.

...that is informed by state-based and nonstate-based FEATURES...The features, or statistical models, are trained on convertedsentences from language-specific corpora.

State-based features Non state-based features

Stack3Top 3 stack elements and in-put word. Distance

Surface distance between topstack element and input word.

Stack1Top stack element and inputword. Next

Next input word.

TopTop stack element.

PositionPosition of input word.

State-based features take into account internal parser states,while non state-based features take into account stringinformation.

...state-based features RESULT in human-likeperformanceFeatures that counsel for the human-preferred action are markedwith a", while those that counsel for the globally-correct actionare marked with a%.

State-based features Non state-based featuresSentence Type Stack3 Stack1 Top Distance Next Position

ChineseMain Verb-Reduced Relative " " % %

Subject-Object " " " % % %

Prepositional Phrase Attachment " " " %

GermanPropositional Relative Clause " " " " " "

Conjunction % " % % "

Prepositional Phrase Attachment " " % %

EnglishMain Verb-Reduced Relative " " " % %

Subject-Object " " " % %

Prepositional Phrase Attachment " " % % %

Total 6 9 6 1 3 1

This leads to a feature hierarchy that defines a distributional basisfor human parsing preferences.

Stack1� Stack3� Top� Next� Distance� Position

CONCLUSION

The results reveal that garden-pathing models are bestimplemented by parsers that attend more to parser-stateinformation than non state-based information.

REFERENCESAgricola, E. 1968. Syntaktische Mehrdeutigkeit (Polysyntaktizitat) bei der Analyse des

Deutschen und des Englischen. Schriften zur Phonetik, Sprachwissenschaftund Kommunikationsforschung. Akademie Verlag, Berlin.

Bever, T.G. 1970. The cognitive basis for linguistic structures. In R. Hayes (ed.), Cognition andLanguage Development. New York: Wiley and Sons. 277-360.

Frazier, L. 1979. On Comprehending Sentences: Syntactic parsing strategies. Ph.D.Dissertation, University of Connecticut.Hsiao, F. and Gibson E. 2003. Processing relative clauses in Chinese. Cognition 90: 3-27.Nivre, J. 2004. Incrementality in Deterministic Dependency Parsing. Proceedings of the

Workshop on Incremental Parsing (ACL): 50-57.Sailer, M. 2004. Propositional relative clauses in German. In Stefan Muller (ed.), Proceedings

of the 11th International Conference on HPSG. Stanford:CSLI. 223-243.Tesniere, L. 1959. Elements de syntaxe structurale. Editions Klincksiek.

{mferrara,chenzho3,jthale}@msu.edu http://www.msu.edu/˜mferrara/