✬ ✫
✩ ✪
Is syntactic knowledge
probabilistic?
✬ ✫
✩ ✪
Experiments described in:
Anette Rosenbach (2002) Genitive Variation in English. Conceptual
Factors in Synchronic and Diachronic Studies. Mouton de Gruyter.
Anette Rosenbach (2003) Aspects of Iconicity and Economy in the Choice
between the ’s-genitive and the of-genitive in English. In Determinants of
Grammatical Variation in English, ed. by G. Rohdenburg and B. Mondorf,
Mouton de Gruyter, 379–411.
Joan Bresnan (2007) Is syntactic knowledge probabilistic? Experiments
with the English dative alternation. In Roots: Linguistics in Search of
Its Evidential Base. Series: Studies in Generative Grammar, ed. by S.
Featherston and W. Sternefeld. Berlin: Mouton de Gruyter, 77–96.
✬ ✫
✩ ✪
Rosenbach (2003) reports a forced choice study
which controls for the overlapping factors (animacy,
topicality, prototypicality of the possession relation)
that affect genitive choice:
Items and conditions:
[+animate, +topical, +proto]: the boy’s eyes ∼ the eyes of the boy
[+animate, +topical, −proto]: the mother’s future ∼ the future of the mother
[+animate, −topical, +proto]: a girl’s face ∼ the face of a girl
[+animate, −topical, −proto]: a woman’s shadow ∼ the shadow of a woman
[−animate, +topical, +proto]: the chair’s frame ∼ the frame of the chair
[−animate, +topical, −proto]: the bag’s contents ∼ the contents of the bag
[−animate, −topical, +proto]: a lorry’s wheels ∼ the wheels of a lorry
[−animate, −topical, −proto]: a car’s fumes ∼ the fumes of a car
✬ ✫
✩ ✪
–Operationalizes animacy as personal, common
nouns vs. concrete common nouns (excluding
geographical and temporal)
–Operationalizes topicality as second-mention,
definite expression vs. first-mention, indefinite
expression
–Operationalizes possessive relations as
for humans: body parts, kin terms, and per-
manent legal ownership vs. states and ab-
stract ‘possessions’
for inanimates: part/whole relations vs. non-
part/whole relations
✬ ✫
✩ ✪
A sample question from her questionnaire:
A helicopter waited on the nearby grass like
a sleeping insect, its pilot standing outside
with Marino. Whit, a perfect specimen of
male fitness in a black flight suit, opened
[the helicopter’s doors/ the doors of the
helicopter] to help us board.
(based on Patricia Cornwell, The Body Farm, 52)
✬ ✫
✩ ✪
’s and of genitives in English (Rosenbach 2002)
✬ ✫
✩ ✪
Other findings:
the ’s-genitive is spreading across time (older to
younger speakers) and space (younger American to
younger British speakers)
✬ ✫
✩ ✪
Note on design and analysis:
–univariable analysis (= ‘basic statistical tests’,
such as chi-square)
–controls (e.g. holds length of possessor and pos-
sessum constant; excludes proper nouns)
–stratificational analysis (e.g. age, pp. 396–7)
✬ ✫
✩ ✪
Compare a subsequent corpus study:
Lars Hinrichs & Benedikt Szmrecsányi. 2007. Recent
changes in the function and frequency of Standard
English genitive constructions: A multivariate
analysis of tagged corpora. English Language and
Linguistics 11(3): 437–74.
✬ ✫
✩ ✪
Bresnan (2007):
Hypothesis: If the dative corpus model sufficiently
characterizes language users’ implicit linguistic
knowledge of usage probabilities, then where the
model predicts higher- or lower-probability out-
comes, we would expect experiment participants to
do so as well in behavioral tasks.
✬ ✫
✩ ✪
• An indirect task: Rate the naturalness of the
alternatives according to your own judgments on
a numerical scale of 1 to 100.
• A direct task: Guess the choices made by the
original dialogue speakers and rate the likeli-
hood of your guess being correct on a numerical
scale of 1 to 100.
✬ ✫
✩ ✪
Bresnan 2007
Experiment:
Thirty instances of dative constructions were ran-
domly drawn from the centers of five probability
bins of the dative corpus model distribution. (Po-
tentially ambiguous items were replaced.)
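
The sampling step can be sketched roughly as follows. This is an illustration under stated assumptions, not the procedure actually used: the equal-width bins, the `half_width` window around each bin center, the six-items-per-bin split, and the name `sample_from_bins` are all hypothetical. The source states only that thirty items were randomly drawn from the centers of five probability bins.

```python
import random

def sample_from_bins(probs, n_bins=5, per_bin=6, half_width=0.05, seed=0):
    """Randomly draw `per_bin` item indices near the center of each bin.

    probs: model probabilities in [0, 1], one per corpus item.
    Bins are assumed to be equal-width; `half_width` (an assumption) sets
    how close to a bin center an item must lie to count as a candidate.
    """
    rng = random.Random(seed)
    sampled = {}
    for b in range(n_bins):
        center = (b + 0.5) / n_bins
        candidates = [i for i, p in enumerate(probs)
                      if abs(p - center) <= half_width]
        if len(candidates) < per_bin:
            raise ValueError(f"bin {b} has only {len(candidates)} candidates")
        sampled[b] = sorted(rng.sample(candidates, per_bin))
    return sampled

# Hypothetical probabilities standing in for the corpus model's output.
_toy_rng = random.Random(1)
toy_probs = [_toy_rng.random() for _ in range(2000)]

picked = sample_from_bins(toy_probs)
print({b: len(ix) for b, ix in picked.items()})
```

Fixing the seed makes the draw reproducible, which matters if ambiguous items later have to be replaced and the sample re-documented.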
✬ ✫
✩ ✪
[Figure: Sampled Constructions for Experiment 1. Vertical axis: Corpus Model Probabilities (0.0 to 1.0); horizontal axis ticks: 0 to 30; bin labels: vlow, low, med, hi, vhi.]
✬ ✫
✩ ✪
The contexts of the sampled instances were re-
trieved from the full Switchboard corpus tran-
scriptions and edited for readability by removing
disfluencies and backchannelings.
• The probability model was not conditioned on
speech features (disfluency, prosody, etc.)
• The experimental task required reading and not
audition.
✬ ✫
✩ ✪
• An alternative to each target construction was
constructed,
• the order of passages was randomized,
• and the order of target constructions alternated.
• A questionnaire was created containing the thirty
passages.
✬ ✫
✩ ✪
Sample passage
Speaker A:
I moved to Arkansas and Texas after living in Ohio and the
schools down here rate, you know, bottom ten percent across
the country and having been through grade school up there
and coming down here to high school I can understand why.
Because they’re so far behind and so poorly staffed, half the
time the teachers don’t know what’s going on.
Speaker B:
Well, that’s really too bad because
(1) it’s giving some people unfair advantage.
(2) it’s giving unfair advantage to some people.
✬ ✫
✩ ✪
Nineteen participants were recruited from Stanford summer
term undergraduates and paid. [a]
The participants were instructed to rate the relative
naturalness of the alternatives in the given context
passage, according to their own intuitions, on a
scale of 0 to 100; the ratings of the alternatives
must sum to 100.
[a] The results from participants who had taken a syntax course were excluded, as were
those from bilinguals and non-native speakers of English.
✬ ✫
✩ ✪
Finding: Both as a group and individually, partic-
ipants’ numerical ratings of the alternative dative
continuations showed a direct linear relation to the
corpus log odds of those constructions.
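
For readers unfamiliar with the log-odds scale, the relation described above can be illustrated with a small sketch. The probabilities and ratings below are hypothetical values chosen for illustration only, not data from the study, and the helper names `log_odds` and `fit_line` are likewise assumptions.

```python
import math

def log_odds(p):
    """Convert a probability (0 < p < 1) to log odds."""
    return math.log(p / (1.0 - p))

def fit_line(xs, ys):
    """Ordinary least-squares intercept and slope for y = a + b*x."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope

# Hypothetical values for illustration only (not data from the study).
model_probs = [0.05, 0.25, 0.50, 0.75, 0.95]
mean_ratings = [22.0, 38.0, 50.0, 62.0, 78.0]

xs = [log_odds(p) for p in model_probs]
intercept, slope = fit_line(xs, mean_ratings)
print(f"intercept={intercept:.1f}, slope={slope:.2f}")
```

A probability of 0.5 maps to log odds of 0, so on this scale the two alternatives are equally likely at the origin, and a positive slope means ratings rise with the corpus model's preference for a construction.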
✬ ✫
✩ ✪
[Figure: Ratings (vertical axis, ticks 20 to 80) plotted against Corpus Log Odds (horizontal axis, −5 to 5).]
✬ ✫
✩ ✪
[Figure: one panel per participant (s1.us through s19.us), each plotting Ratings (0 to 100) against Corpus Log Odds (−5 to 5).]
✬ ✫
✩ ✪
Analysis using multilevel multivariable regression
showed:
the corpus model probabilities are significant pre-
dictors of the ratings, after controlling for random
effects of subject and verb as well as item order,
order of constructions, and lemma frequency.
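
A model of this general kind can be written out as below. This is only a sketch: the symbols, the linear form, and the choice of random intercepts alone are assumptions made for illustration, and the specification actually reported in Bresnan (2007) may differ.

```latex
\[
\mathrm{rating}_{ij} = \beta_0
  + \beta_1\, x_i
  + \beta_2\, \mathrm{itemOrder}_{ij}
  + \beta_3\, \mathrm{constructionOrder}_{ij}
  + \beta_4\, \mathrm{lemmaFreq}_i
  + u_j + w_{v(i)} + \varepsilon_{ij}
\]
\[
u_j \sim \mathcal{N}(0, \sigma_u^2), \qquad
w_{v(i)} \sim \mathcal{N}(0, \sigma_w^2), \qquad
\varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
\]
```

Here $x_i$ stands for the corpus model's prediction for item $i$, $u_j$ is a random intercept for subject $j$, and $w_{v(i)}$ is a random intercept for the verb of item $i$. The finding above corresponds to $\beta_1$ differing reliably from zero.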
✬ ✫
✩ ✪
Bresnan (2007) also compared each subject’s rat-
ings with the actual choices by the speakers in the
original conversations. Baseline = 0.57.
Proportions of Participants’ Ratings
Favoring Actual Corpus Choices
0.63 0.83 0.80 0.70
0.80 0.80 0.67 0.77
0.73 0.83 0.80 0.77
0.80 0.77 0.77 0.73
0.73 0.87 0.67
✬ ✫
✩ ✪
Participants’ naturalness ratings are reliably
associated with the syntactic alternatives used by the
original speakers:
(Wilcoxon signed rank test with continuity correc-
tion, n = 19, V = 190, p < 0.001)
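
The reported statistic can be reproduced from the nineteen proportions tabulated above, assuming the test compares each participant's proportion against the 0.57 baseline (the source does not spell this out). The function `signed_rank_V` is a hand-written sketch of the signed rank statistic, not a library call.

```python
def signed_rank_V(values, baseline):
    """Wilcoxon signed rank statistic V: sum of ranks of positive differences.

    Zero differences are dropped; tied absolute differences share the
    average of the ranks they span.
    """
    diffs = [round(v - baseline, 10) for v in values]
    diffs = [d for d in diffs if d != 0]
    ordered = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    pos = 0
    while pos < len(ordered):
        end = pos
        while (end + 1 < len(ordered)
               and abs(diffs[ordered[end + 1]]) == abs(diffs[ordered[pos]])):
            end += 1
        avg_rank = (pos + end) / 2 + 1  # mean of 1-based ranks pos+1..end+1
        for k in range(pos, end + 1):
            ranks[ordered[k]] = avg_rank
        pos = end + 1
    return sum(r for r, d in zip(ranks, diffs) if d > 0)

# Proportions of each participant's ratings favoring the actual corpus choice.
proportions = [
    0.63, 0.83, 0.80, 0.70,
    0.80, 0.80, 0.67, 0.77,
    0.73, 0.83, 0.80, 0.77,
    0.80, 0.77, 0.77, 0.73,
    0.73, 0.87, 0.67,
]

V = signed_rank_V(proportions, baseline=0.57)
print(V)  # 190.0
```

V equals 19 × 20 / 2 = 190, the largest value possible for n = 19, because every participant's proportion exceeds the baseline.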
✬ ✫
✩ ✪
In a follow-up experiment, different participants
were asked to guess which choice the original
speaker made and to rate the likelihood that their
guess was correct. The results were highly
significant: participants could make reliable guesses
about which alternative the original dialogue
participant chose
(Wilcoxon signed rank test with continuity
correction, n = 20, V = 210, p < 0.0001)
✬ ✫
✩ ✪
Related work:
MacDonald, M.C. (1999). Distributional information in language and
acquisition: Three puzzles and a moral. The emergence of language, ed.
by Brian MacWhinney, 177–96. Mahwah, NJ: Lawrence Erlbaum.
Arnold, J., Wasow, T., Losongco, A., & Ginstrom, R. (2000). Heaviness
vs. newness: The effects of complexity and information structure on
constituent ordering. Language 76(1): 28–55.
Gries, S.T. (2003). Towards a corpus-based identification of prototypical
instances of constructions. Annual Review of Cognitive Linguistics 1:
1–27.
Rosenbach, A. (2003). Aspects of iconicity and economy in the choice
between the ’s-genitive and the of-genitive in English. Determinants of
grammatical variation in English, ed. by Günter Rohdenburg and Britta
Mondorf, 379–411. Berlin: Mouton de Gruyter.
Rosenbach, A. (2005). Animacy versus weight as determinants of
grammatical variation in English. Language 81(3): 613–44.
✬ ✫
✩ ✪
Gahl, S., & Garnsey, S. (2004). Knowledge of grammar, knowledge of
usage: Syntactic probabilities affect pronunciation variation. Language
80(4): 748–774.
Levy, R., & Jaeger, T. (2007). Speakers optimize information density
through syntactic reduction. Proceedings of the twentieth annual confer-
ence on neural information processing systems, pp. 29–37. Vancouver:
NIPS.
Bresnan, J. (2007) Is syntactic knowledge probabilistic? Experiments
with the English dative alternation. In Roots: Linguistics in Search of
Its Evidential Base. Series: Studies in Generative Grammar, ed. by S.
Featherston and W. Sternefeld. Berlin: Mouton de Gruyter, 77–96.
Tily, H., Gahl, S., Arnon, I., Snider, N., Kothari, A., & Bresnan, J. (2009).
Syntactic probabilities affect pronunciation variation in spontaneous
speech. Language and Cognition 1(2): 147–165.
Jaeger, T. F. (2010). Redundancy and reduction: Speakers manage
syntactic information density. Cognitive Psychology 61: 23–62.
✬ ✫
✩ ✪
Bresnan, J., & Ford, M. (2010). Predicting syntax: Processing dative
constructions in American and Australian varieties of English. Language
86(1): 168–213.
Kuperman, V., & Bresnan, J. (2012). The effects of construction
probability on word durations during spontaneous incremental sentence
production. Journal of Memory & Language 66: 588–611.
Theijssen, D. (2012) Making Choices. Modeling the English Dative
Alternation. Nijmegen: Radboud University Centre for Language Studies
Ph.D. dissertation.
Ford, M., & Bresnan, J. (2012). “They whispered me the answer” in
Australia and the US: A comparative experimental study. In From Quirky
Case to Representing Space: Papers in Honor of Annie Zaenen, ed. by
Tracy Holloway King and Valeria de Paiva. Stanford: CSLI Publications.
Ford, M. & J. Bresnan (2013). Generating data as a proxy for unavailable
corpus data: the contextualized sentence completion task. To appear in
Corpus Linguistics and Linguistic Theory.