Predicting Fine-Grained Syntactic Typology from Surface Features

Abstract of Wang and Eisner (2017), previously published in the TACL journal and presented at ACL 2017.

Dingquan Wang and Jason Eisner

{wdd,eisner}@jhu.edu

We are motivated by the longstanding challenge of determining the structure of a language from its superficial features. Principles & Parameters theory (Chomsky, 1981) hypothesized that human babies are born with an evolutionarily tuned system that is specifically adapted to natural language, which can predict typological properties (“parameters”) by spotting telltale configurations in purely linguistic input (Gibson and Wexler, 1994). Here we investigate whether such configurations even exist, by asking an artificial system to find them.

Fine-Grained Syntactic Typology

The Galactic Dependencies Treebanks

• More than 50,000 synthetic languages
• Resemble real languages, but not found on Earth

• Each has a corpus of dependency parses
• In the Universal Dependencies format

• Vertices are words labeled with POS tags

• Edges are labeled syntactic relationships

• Provide train/dev/test splits, alignments, tools
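The treebank encoding above can be illustrated with a minimal sketch. The sentence and the `parse_conllu` helper below are hypothetical, not the authors' tooling; only the 10-column CoNLL-U layout (ID, FORM, LEMMA, UPOS, …, HEAD, DEPREL, …) comes from the Universal Dependencies standard:

```python
# Sketch: reading a tiny Universal Dependencies parse.
# Each line has 10 tab-separated columns; we keep the word form
# (col 2), POS tag (col 4), head index (col 7), and relation (col 8).
# HEAD = 0 marks the root of the tree.

SAMPLE = """\
1\tthe\tthe\tDET\t_\t_\t2\tdet\t_\t_
2\tdog\tdog\tNOUN\t_\t_\t3\tnsubj\t_\t_
3\tbarks\tbark\tVERB\t_\t_\t0\troot\t_\t_"""

def parse_conllu(block):
    """Return a list of (index, form, pos, head, relation) tuples."""
    tokens = []
    for line in block.splitlines():
        cols = line.split("\t")
        tokens.append((int(cols[0]), cols[1], cols[3], int(cols[6]), cols[7]))
    return tokens

for tok in parse_conllu(SAMPLE):
    print(tok)
```

Each tuple is a labeled vertex (word + POS tag) together with its labeled edge to the head, matching the graph description in the bullets above.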

Results

Scatterplots by language of predicted (y-axis) vs. true (x-axis) directionality. The first 3 plots show predictions by our full system on some of the relations. The 4th shows how performance degrades without the use of synthetic data to illustrate the surface word order of postpositional languages.
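The true directionality on the x-axis can be read directly off a parsed corpus. A minimal sketch, under one common convention (fraction of a relation's edges whose dependent follows its head); the edge tuples and function name are illustrative, not the authors' code:

```python
# Sketch: directionality of a relation r = fraction of r-edges whose
# dependent appears to the RIGHT of its head in the sentence.
# Each edge is (head_position, dependent_position, relation).

def directionality(edges, relation):
    """Fraction of `relation` edges with the dependent after the head."""
    matching = [(h, d) for h, d, r in edges if r == relation]
    if not matching:
        return None  # relation absent from this corpus
    return sum(1 for h, d in matching if d > h) / len(matching)

# Toy corpus: an adposition precedes its noun twice and follows it once,
# so the "case" relation is mostly head-final (prepositional).
edges = [(2, 1, "case"), (5, 4, "case"), (7, 8, "case")]
print(directionality(edges, "case"))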

Surface Cues to Structure

Triggers for Principles & Parameters

Architecture

1. How often do NOUNs tend to appear shortly before or after VERBs?

2. How often do ADJs tend to appear shortly before or after NOUNs?

3. How often do ADPs tend to appear shortly before or after NOUNs?
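Questions like these can be answered by counting tag co-occurrences inside a short window over the POS sequence. A sketch under stated assumptions: the window size, function name, and toy sentence are illustrative choices, not the paper's exact feature definitions:

```python
from collections import Counter

def window_order_counts(tags, a, b, window=3):
    """Count how often tag `a` appears shortly before vs. after tag `b`,
    within `window` positions on either side of each `b` token."""
    counts = Counter()
    for i, t in enumerate(tags):
        if t != b:
            continue
        lo, hi = max(0, i - window), min(len(tags), i + window + 1)
        counts["before"] += tags[lo:i].count(a)   # a ... b
        counts["after"] += tags[i + 1:hi].count(a)  # b ... a
    return counts

# Toy POS sequence for "the dog barks at the cat":
tags = ["DET", "NOUN", "VERB", "ADP", "DET", "NOUN"]
print(window_order_counts(tags, "NOUN", "VERB"))
```

Normalizing these counts into before/after proportions gives exactly the kind of surface cue the three questions above describe.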

Hand-Engineered features:

Neural features:

Cross-validation loss broken down by relation. We plot each relation r with x coordinate = the proportion of r in the average training corpus, and with y coordinate = the weighted average ε-insensitive loss.

ε-insensitive loss

The y coordinate is the average loss of our model. The x coordinate is the average loss of a simple baseline model that ignores the input corpus.
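The ε-insensitive loss charges nothing when the predicted directionality is within ε of the truth, and grows linearly outside that band. A minimal sketch; the ε value here is chosen purely for illustration:

```python
def eps_insensitive_loss(predicted, true, eps=0.1):
    """Zero inside the tolerance band, linear outside it."""
    return max(0.0, abs(predicted - true) - eps)

print(eps_insensitive_loss(0.95, 0.90))  # within eps of the truth -> 0.0
print(eps_insensitive_loss(0.60, 0.90))  # misses by 0.3, charged 0.3 - eps
```

This tolerance means a model is not penalized for small disagreements about a relation's directionality, only for getting its overall direction substantially wrong.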

Our final comparison on the 15 test languages (boldfaced). We ask whether the average expected loss on these 15 real target languages is reduced by augmenting the training pool of 20 UD treebanks with +20*21*21 GD languages. For completeness, we extend the table with the cross-validation results on the training pool. The “Avg.” lines report the average of 15 test or 31 training+testing languages. We mark both “+GD” averages with “*” as they are significantly better than their “UD” counterparts (paired permutation test by language, p < 0.05).

Cross-validation average expected loss of the two grammar induction methods, MS13 (Mareček and Straka, 2013) and N10 (Naseem et al., 2010), compared to the “expected count” (EC) heuristic and our approach. In these experiments, the dependency relation types are ordered POS pairs. N10 harnesses prior linguistic knowledge, but its improvement upon MS13 is not statistically significant. Both grammar induction systems are significantly worse than the rest of the systems, including even our 2 baseline systems, namely EC and ∅ (the no-feature baseline system).