Redundancy and reduction: Speakers manage syntactic information density

Redundancy and reduction: Speakers manage syntactic information density Torsten Jachmann 16.05.2014 T. Florian Jaeger (2010) Seminar „Information Theoretic Approaches to the Study of Language “



Page 1: Redundancy and reduction: Speakers manage syntactic information density

Redundancy and reduction: Speakers manage syntactic information density

Torsten Jachmann

16.05.2014

T. Florian Jaeger (2010)

Seminar „Information Theoretic Approaches to the Study of Language“

Page 2: Redundancy and reduction: Speakers manage syntactic information density

So far
• Frequent words have shorter linguistic forms (Zipf)
  o Orthographic; PHONOLOGICAL
• Word length (phonemes/syllables) correlated with predictability
• Information is context dependent
  o The more probable, the more redundant
• More predictable instances of the same word are produced shorter and with less phonological and phonetic detail
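The link between probability and redundancy can be made concrete with a minimal sketch, assuming information is measured as surprisal in bits (base-2 logarithm). The function name and the example probabilities are hypothetical and not taken from the talk.

```python
import math

def surprisal(p: float) -> float:
    """Information content (in bits) of an event with probability p."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    return -math.log2(p)

# Hypothetical probabilities of the same word in two different contexts.
predictable_context = surprisal(0.5)     # probable word: little information
surprising_context = surprisal(0.125)    # improbable word: more information

print(predictable_context, surprising_context)
```

The more probable the word is in its context, the lower its surprisal, which is the sense in which it is "more redundant".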

Page 3: Redundancy and reduction: Speakers manage syntactic information density

Idea
Speakers manage the amount of information per amount of linguistic signal (at choice points)
• Morphosyntactic:
  o E.g. auxiliary contractions
    “he is” vs. “he’s”

Page 4: Redundancy and reduction: Speakers manage syntactic information density

Idea
Speakers manage the amount of information per amount of linguistic signal (at choice points)
• Syntactic:
  o E.g. optional that-mentioning
    “This is the friend I told you about” vs.
    “This is the friend that I told you about”

Page 5: Redundancy and reduction: Speakers manage syntactic information density

Idea
Speakers manage the amount of information per amount of linguistic signal (at choice points)
• Elide constituents:
  o E.g. optional argument and adjunct omission
    “I already ate” vs.
    “I already ate dinner”

Page 6: Redundancy and reduction: Speakers manage syntactic information density

Idea
Speakers manage the amount of information per amount of linguistic signal (at choice points)
• Production planning:
  o E.g. one or more clauses
    “Next move the triangle over there” vs.
    “Next take the triangle and move it over there”

Page 7: Redundancy and reduction: Speakers manage syntactic information density

Idea
Speakers manage the amount of information per amount of linguistic signal (at choice points)
• Other languages?
  o German: “Er hat es verstanden” vs. “Er hat’s verstanden” (he understood it)
  o Japanese: “行ってはダメ” vs. “行っちゃダメ”
    Itte ha dame vs. Iccha dame (you can’t go)

Page 8: Redundancy and reduction: Speakers manage syntactic information density

Idea
Speakers manage the amount of information per amount of linguistic signal (at choice points)

The form with less linguistic signal should be less preferred whenever the reducible unit encodes a lot of information (in the context)

Page 9: Redundancy and reduction: Speakers manage syntactic information density

Uniform Information Density (UID)

Optimal:
• On average each word adds the same amount of information to what we already know
• The rate of information transfer is close to the channel capacity

❌ Many constraints (grammar; learnability)

Page 10: Redundancy and reduction: Speakers manage syntactic information density

Uniform Information Density (UID)

Efficient:
• Relatively uniform information distribution where possible
• No continuous under- or overutilization of the channel

?
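As a rough illustration of "relatively uniform where possible", the sketch below compares two encodings of the same message. It assumes information is counted per word rather than per unit of time, and all bit values are invented for illustration; the only point is that both profiles carry the same total, but one has a lower peak and lower variance.

```python
import statistics

def density_variance(bits_per_word):
    """Population variance of per-word information; lower means more uniform."""
    return statistics.pvariance(bits_per_word)

# Hypothetical per-word information in bits (illustrative values only).
without_that = [2.0, 3.0, 8.0, 3.0]        # clause onset carries one large peak
with_that = [2.0, 3.0, 4.0, 4.0, 3.0]      # "that" absorbs part of that peak

for label, profile in (("without that", without_that), ("with that", with_that)):
    print(label, "total:", sum(profile), "peak:", max(profile),
          "variance:", round(density_variance(profile), 2))
```

Under these assumed numbers, adding the optional word spreads the same 16 bits over more signal, so the channel is neither over- nor underused at any single word.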

Page 11: Redundancy and reduction: Speakers manage syntactic information density

UID
Definitions:
• Information density: information per time (articulatory detail is left out)
• Choice: subconscious (existence of different ways to encode the intended message)

Page 12: Redundancy and reduction: Speakers manage syntactic information density

UID
Example: [figure not preserved in the transcript]

Page 13: Redundancy and reduction: Speakers manage syntactic information density

UID
Example: [figure not preserved in the transcript]

Page 14: Redundancy and reduction: Speakers manage syntactic information density

UID
Goals:
• UID as a computational account of efficient sentence production
• Corpus-based studies are feasible and desirable
  o Corpus of spontaneous speech → naturally distributed data

Page 15: Redundancy and reduction: Speakers manage syntactic information density

Data
• 7369 automatically extracted complement clauses (CC) from the “Paraphrase Stanford-Edinburgh LINK Switchboard Corpus” (Penn Treebank)
• - 144 (2%) falsely extracted
• - 71 (1%) rare matrix verbs → extreme probabilities

Page 16: Redundancy and reduction: Speakers manage syntactic information density

Data
Focus
Actually: I(CC onset | context) = -log p(CC | context) - log p(onset | context, CC)
Here: I(CC | context) = -log p(CC | matrix verb lemma)
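The simplified estimate I(CC | context) = -log p(CC | matrix verb lemma) can be sketched as a relative-frequency computation. This is a minimal illustration, assuming a base-2 logarithm and a made-up list of (lemma, has_cc) observations; the counts below are hypothetical and are not the corpus figures.

```python
import math
from collections import Counter

def cc_information(observations):
    """Estimate -log2 p(CC | verb lemma) from (lemma, has_cc) pairs."""
    totals = Counter()
    with_cc = Counter()
    for lemma, has_cc in observations:
        totals[lemma] += 1
        if has_cc:
            with_cc[lemma] += 1
    # Lemmas never seen with a CC have undefined (infinite) information; skip them.
    return {
        lemma: -math.log2(with_cc[lemma] / totals[lemma])
        for lemma in totals
        if with_cc[lemma] > 0
    }

# Hypothetical counts: "guess" takes a CC half the time, "worry" one time in eight.
observations = (
    [("guess", True)] * 4 + [("guess", False)] * 4
    + [("worry", True)] * 1 + [("worry", False)] * 7
)
info = cc_information(observations)
print(info)
```

With these assumed counts, a verb that often takes a complement clause makes the clause less informative, and a verb that rarely does makes it more informative.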

Page 17: Redundancy and reduction: Speakers manage syntactic information density

Data

Page 18: Redundancy and reduction: Speakers manage syntactic information density

Multilevel logit model

• Various factors (might) influence the outcome
• Ability to include several (control) parameters in one model
• Contribution of each can be estimated

Why?
• Natural (uncontrolled) data
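The general shape of such a model can be sketched as follows. This only shows how a fitted multilevel logit model would turn predictors plus a per-speaker random intercept into a probability of producing "that"; it does not fit anything, and the predictor names and coefficient values are hypothetical rather than the estimates reported in the study.

```python
import math

def p_that(coefficients, predictors, speaker_intercept=0.0, intercept=0.0):
    """Probability of "that" from log-odds: fixed effects + speaker random intercept."""
    log_odds = intercept + speaker_intercept + sum(
        coefficients[name] * value for name, value in predictors.items()
    )
    return 1.0 / (1.0 + math.exp(-log_odds))

# Hypothetical coefficients (illustration only).
coefficients = {"information_density": 0.8, "preceding_pause": 0.5}

low_info = p_that(coefficients, {"information_density": 1.0, "preceding_pause": 0.0})
high_info = p_that(coefficients, {"information_density": 3.0, "preceding_pause": 0.0})
print(round(low_info, 3), round(high_info, 3))
```

Because each predictor has its own coefficient, the contribution of one factor can be assessed while the others are held constant, which is what makes the approach usable on uncontrolled data.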

Page 19: Redundancy and reduction: Speakers manage syntactic information density

Controls
Dependency
• Distance of matrix verb from CC onset → “THAT”
  o My boss thinks [I’m absolutely crazy.]
  o I agree with you [that, that a person’s heart can be changed.]
• Length of CC onset (including subject) → “THAT”
• Length of CC remainder

Page 20: Redundancy and reduction: Speakers manage syntactic information density

Short sidetrack
Length of CC remainder
• Language production is incremental (+ heuristic complexity estimates?)

Page 21: Redundancy and reduction: Speakers manage syntactic information density

Controls
Availability
• Lower speech rate → “THAT”
• Preceding pause → “THAT”
• Initial disfluency → “THAT”

Page 22: Redundancy and reduction: Speakers manage syntactic information density

Controls
Availability
• Type of CC subject
  o It vs. I
  o Other PRO vs. above
  o Other NP vs. above
• Frequency of CC subject head
• Subject identity
  o Identical subject in matrix and CC ≈ “NONE”

Page 23: Redundancy and reduction: Speakers manage syntactic information density

Controls
Availability
• Word form similarity
  o Demonstrative pronoun “that”
  o Demonstrative determiner “that” ≈ “NONE”
• Frequency of matrix verb
  o Higher frequency → “NONE”

Page 24: Redundancy and reduction: Speakers manage syntactic information density

Controls
Ambiguity avoidance
• Possible garden path sentence → “THAT” ❌
  o Even unlikely cases were included
    “I guess (this doesn’t really have to do with…)”

Page 25: Redundancy and reduction: Speakers manage syntactic information density

Controls
Matrix
• Position of matrix verb
  o Further away from sentence-initial position → “THAT”
• Matrix subject
  o You vs. I → “THAT”
  o Other PRO vs. above → “THAT”
  o Other NP vs. above → “THAT”

Page 26: Redundancy and reduction: Speakers manage syntactic information density

Controls
Others
• Random speaker intercept
• Persistence
  o Prime w/o that vs. no prime
  o Prime w/ that vs. above
• Gender
  o Male → “NONE”

Page 27: Redundancy and reduction: Speakers manage syntactic information density

Information density
• Clear significance (p < .0001)
• High information density of the CC onset → use of “that”
• Correlation with other predictors negligible
• Contribution to the model’s likelihood is high
• At least 15% of the model quality due to information density
• Single most important predictor

Page 28: Redundancy and reduction: Speakers manage syntactic information density

Information density

• Verbs’ subcategorization frequency as estimate for information density

• High CC-biases, low “that”-biases (e.g.: guess)
• Low CC-biases, high “that”-biases (e.g.: worry)

Syntactic reduction is affected by information density

Page 29: Redundancy and reduction: Speakers manage syntactic information density

Results

Page 30: Redundancy and reduction: Speakers manage syntactic information density

Information density
• Prediction: UID can account for any type of reduction

Phonetic and phonological reduction
• So far patterns align with this prediction
• Availability accounts do not predict this
  o But predict lengthening of words

Page 31: Redundancy and reduction: Speakers manage syntactic information density

Information density

Optional case markers (or copula)

• Languages with flexible word order
• Japanese
  ケーキが大好きだ vs. ケーキ大好き
  Keeki ga daisuki da vs. keeki daisuki
  I love cake

Page 32: Redundancy and reduction: Speakers manage syntactic information density

Information density

Reduced case markers

• Korean
  나는 독일 사람이야 vs. 난 독일 사람이야
  Na neun togil saram iya vs. Nan togil saram iya
  I am German

Page 33: Redundancy and reduction: Speakers manage syntactic information density

Information density

Optional object clitics and other argument marking morphology

• Direct object clitics in Bulgarian
  o Can’t be predicted by availability account
  o Could be predicted by ambiguity avoidance

Page 34: Redundancy and reduction: Speakers manage syntactic information density

Information density

Contracted auxiliaries

• English
  “he’s” vs. “he is”
  o Can be predicted by neither availability nor ambiguity avoidance

Page 35: Redundancy and reduction: Speakers manage syntactic information density

Information density

Ellipsis
• Japanese
  行きたいけど行けない vs. 行きたいけど
  ikitai kedo ikenai vs. ikitai kedo
  I want to go but (I can’t go)
  (¬ 行きたいけど (遅くなりそう), ikitai kedo (osoku nari sou): I want to go, but (I might be late))

Page 36: Redundancy and reduction: Speakers manage syntactic information density

Information density

Non-subject-extracted relative clause
• Indefinite noun phrases < definite noun phrases
• Light head nouns (e.g. the way) < heavy head nouns (e.g. the priest)

“I like the way (that) it vibrates”

Page 37: Redundancy and reduction: Speakers manage syntactic information density

Information density

Whiz-deletion (BE)
• Relativizer + auxiliary can be omitted

“The smell (that is) released by a pig or a chicken farm is indescribable”

Page 38: Redundancy and reduction: Speakers manage syntactic information density

Information density

Object drop
• Verbs with high selectional preference
  “Tom ate.” vs.
  “Tom saw …”

Page 39: Redundancy and reduction: Speakers manage syntactic information density

Information density
• Many novel predictions across
  o Different levels of linguistic production
  o Languages
  o Types of alternations
• Per-word entropy of sentences should stay constant throughout discourse
  o Words with high information density (in the context and discourse) should come later in the sentence
  o A priori per-word entropy should increase

Page 40: Redundancy and reduction: Speakers manage syntactic information density

Grammaticalization
• Might interfere with information density?
  o Matrix subject “I” or “you”
  o Matrix verb “guess”, “think”, “say”, “know”, “mean”
  o Matrix verb in present tense
  o Matrix clause was not embedded
  3033 cases remain
• Still highly significant (p < .0001)

UID may be a reason for grammaticalization

Page 41: Redundancy and reduction: Speakers manage syntactic information density

Noisy channel

• Base of UID
• Audience design
  o Speaker considers interlocutors’ knowledge and processor state to improve chance of successfully achieving their goal

• Modulating information density at choice points = rational strategy for efficient production

• UID minimizes processing difficulties

Page 42: Redundancy and reduction: Speakers manage syntactic information density

Corpus-based research
Claim: “Lack of balance and heterogeneity of data make findings unreliable”
• Multilevel models
• Avoidance of redundant predictors
• If redundant → residualization
• Inter-speaker variance

+ ecologically valid
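Residualization of a redundant predictor can be sketched as below. The sketch assumes the simplest case: one predictor is regressed on one other predictor by ordinary least squares, and only the residuals (the part not explained by the other predictor) are kept. The function name and the data are hypothetical, and the study may have used a different procedure.

```python
def residualize(target, control):
    """Return the part of `target` not linearly explained by `control`."""
    n = len(target)
    mean_t = sum(target) / n
    mean_c = sum(control) / n
    sxx = sum((c - mean_c) ** 2 for c in control)
    sxy = sum((c - mean_c) * (t - mean_t) for c, t in zip(control, target))
    slope = sxy / sxx
    intercept = mean_t - slope * mean_c
    return [t - (intercept + slope * c) for c, t in zip(control, target)]

# Hypothetical, correlated predictors (illustration only).
onset_length = [1.0, 2.0, 3.0, 4.0, 5.0]
remainder_length = [2.0, 4.5, 5.5, 8.5, 9.5]

residual_remainder = residualize(remainder_length, onset_length)
print([round(r, 2) for r in residual_remainder])
```

The residuals are uncorrelated with the control predictor by construction, so both can enter the same model without sharing explanatory credit.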

Page 43: Redundancy and reduction: Speakers manage syntactic information density

Corpus-based research
• Results extend more easily to all of English
• Many previous results replicate
• Provides evidence for so far relatively understudied effects (e.g. similarity avoidance)
• “Effect size” needs to be taken with caution
  o Not only strength of effects but also applicability
    Ambiguity avoidance (garden path sentences) relatively rare

Page 44: Redundancy and reduction: Speakers manage syntactic information density

Questions and discussion