Foundations of Linguistics & Foundations of Syntax: Basic Issues
Loes Koring
Iris Mulders
Eric Reuland
Eddy Ruys
Today
• Language as a computational system
• Models and interface conditions
• Learnability
• Starting point of all science: Wonder (especially about the obvious)
What is Language?
Language: a systematic mapping between
• Forms: events in an (external) medium (sound, gesture, ink on paper), and
• Meanings
– If you know under what conditions a sentence would be true, you know its meaning
– Compositionality: the meaning of a sentence is determined by the meanings of its parts plus the way these parts are combined
Big Questions
• How did language emerge? Did it further evolve?
• How is language related to thought?
• How is language represented in the brain?
• How special is the language faculty?
• How is language acquired?
• What is the range of variation among languages?
• Are there linguistic universals? If so, what are they, and what do they follow from?
Language as a computational system
How (un)controversial?
• An example of a computation:
– Interpretation: English null arguments
English null arguments
(1) a. I wonder who the men expected to see them
    b. The men expected to see them
(2) a. John ate an apple
    b. John ate
(3) a. John is too stubborn to talk to
    b. John is too stubborn to talk to Bill
(4) a. John is too clever to expect us to catch
    b. John is too clever to expect us to catch Bill
How to understand
Instruction:
• Fill the gap
• Interpret the result as usual
English null arguments
(1) a. I wonder who the men expected (who) to see them
    b. The men expected (the men) to see them
(2) a. John ate an apple
    b. John ate --
(3) a. John is too stubborn PRO=? to talk to (John)
    b. John is too stubborn PRO=? to talk to Bill
(4) a. John is too clever PRO=? to expect us to catch (John)
    b. John is too clever PRO=? to expect us to catch Bill
Knowledge and Use
Systematic ambiguity between:
• Grammar as an abstract system defining a mapping between form and meaning
• Grammar as a system implemented in the brain that is accessed and used in the actual computations the brain has to carry out in order to assign an interpretation to a form, or to find a form for an (intended) interpretive effect
→ This raises issues about the relation between the grammar and the processing system
The minimal language system (Chomsky 1995)

Sensori-motor system --(PF interface)-- CHL --(C-I interface)-- Interpretation system
                                         |
                                      Lexicon

   -dedicated                      +dedicated(?)                   -dedicated
Evolutionary fable
• Given a primate with the human mental architecture and sensori-motor apparatus in place, but not yet a language organ: what specifications does some language organ FL have to meet if, upon insertion, it is to work properly? (Chomsky 1998:6)
Components of the C-I Interface (Reinhart 2006)
The Computational System (CS) connects to:
• Context
• Inference
• Sensori-motor systems
• Concepts
A note on the inference system
• Propositional: concepts alone are not enough; the inference system must be fed by CHL
• John sings and dances ⇒ John dances
• Every boy sings and dances ⇒ Every boy sings
• No boy sings and dances ⇏ No boy sings
• We are near a gas station vs. We are far from a gas station
Tasks 1
Characterize:
• Computational system (CHL)
– Universal versus language-specific properties
• Lexicon
– Universal versus language-specific properties
• PF-interface (relation to medium)
• C-I interface (relation to thought)
Tasks 2: find the map between linguistic operations and neurocognitive processes

PF-interface
    |
Computational system of Human Language (CHL) (+ Lexicon)
    |
Conceptual-Intentional interface (C-I interface)

?
On the relation between linguistics and psycholinguistics
"The split between linguistics and psycholinguistics in the 1970’s has been interpreted as being a retreat by linguists from the notion that every operation of the grammar is a mental operation that a speaker must perform in speaking and understanding language.
But, putting history aside for the moment, we as linguists cannot take the position that there is another way to construct mental representations of sentences other than the machinery of grammar.
....There is no retreat from the strictest possible interpretation of grammatical operations as the only way to construct linguistic representations" (Alec Marantz, lecture notes 2000)
Tasks 3 and 4
• Determine how the grammar is put to use in reasoning, realizing communicative intentions, etc.
• Determine the nature and locus of cross-linguistic variation, and how the child is able to arrive at the correct grammar of the language she is exposed to.
Does function determine form?
• Reinhart 2006:
The language system (hence our theory of it) must be compatible with the functions it serves, but cannot be determined by them, since many conceivable systems could potentially serve the same functions.
Levels of adequacy
• Observational adequacy (bare facts)
• Descriptive adequacy / interface adequacy (output readable to the interface)
– Processing system
– Inference system
• Explanatory adequacy (acquisition)
• Neuro-cognitive adequacy (“beyond explanatory adequacy”, Chomsky 2005)
Conjecture
• CHL is the optimal solution to meeting interface conditions
• A system meeting observational adequacy in the form-meaning mapping will also meet
– interface adequacy
– explanatory adequacy (account for acquisition)
– conditions on implementation (provide a model for language processing: a transparent parser)
Tensions between requirements
• What do we need for easy description?
• What do we need for explanation?
• This tension is nothing special. Compare:
– Newtonian mechanics, for understanding planetary motion
– Quantum physics, for understanding why there are no white holes
An example: how do we compute dependencies?
• What did John see?
• What did John see –
• What did John [ [see - ] ]
One-step process or two-step process?
Required for explanation
• What do we minimally need to account for language structure?
• What do we minimally need to assume is dedicated to language?
• Behind these questions:
– What kind of elements and what type of properties can be plausibly represented in the brain?
Grammatical system: essential properties

Required:
• An inventory of vocabulary items / a lexicon (elementary form-meaning combinations)
• Combinatory principles
• Abstract away from: the ‘size’ of basic vocabulary items
Definitions
• Given a vocabulary V, a language LV is a subset of V* (the set of all strings over V)
• A grammar GL is a finite set of rules generating all the strings over V that are members of L and none of the strings over V that are not members of L
• For an infinite language, at least one recursive rule is required
• Recursion: calling an instruction (rule) while carrying out that instruction
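Recursion in this sense is easy to make concrete. The sketch below (a toy of my own, not from the slides) is a rule that calls itself while carrying itself out, building strings with n occurrences of a followed by n occurrences of b:

```python
def expand(n):
    """A recursive rule: while the rule is being carried out, it calls itself.

    expand(n) returns n+1 occurrences of 'a' followed by n+1 occurrences of 'b'.
    """
    if n == 0:
        return "ab"                    # base case: the recursion bottoms out
    return "a" + expand(n - 1) + "b"   # the rule is called while being carried out
```

Because the call sits in the middle of the rule's own output, this is embedded rather than peripheral recursion.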
A simple model
Intuitive procedure:
1. Identify the set L1 of possible first words of a sentence
2. Identify for each member i of L1 the set L2i of words by which it can be followed
3. Continue the procedure until you are done
Peripheral recursion
Peripheral: the calling of an instruction at the end of carrying out that instruction

[Diagram: a plan P is successively realized; the remainder P may be called again at the end: P(lan), realized P, realized P … P?]
Formalized as: a finite state grammar

Illustration (Chomsky 1957): “The (old …) man comes” / “The (old …) men come”

S  --The-->  S1
S1 --old-->  S1
S1 --man-->  S3
S1 --men-->  S4
S3 --comes--> end
S4 --come-->  end
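The finite state grammar of the illustration can be run directly as a transition table. The Python sketch below is my own rendering of the slide's state names; it accepts exactly the strings the grammar generates:

```python
# State transitions of the toy finite state grammar (Chomsky 1957):
# from a state, consuming a word, reach the next state.
TRANSITIONS = {
    ("S", "the"): "S1",
    ("S1", "old"): "S1",       # peripheral loop: any number of "old"
    ("S1", "man"): "S3",
    ("S1", "men"): "S4",
    ("S3", "comes"): "END",
    ("S4", "come"): "END",
}

def accepts(sentence):
    """True iff the word string is generated by the finite state grammar."""
    state = "S"
    for word in sentence.lower().split():
        state = TRANSITIONS.get((state, word))
        if state is None:      # no transition: the string is not in the language
            return False
    return state == "END"
```

Note that the loop on "old" is exactly the peripheral recursion a finite state grammar can handle.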
A hierarchy of grammar types

• i. Finite state grammars: peripheral recursion
  Rule schemas (with x, y strings over a given vocabulary):
  S → xS; S → y
• ii. Beyond finite state grammars (context free or higher): minimally allowing embedded recursion, as in:
  S → xSy; S → xy
  S → aSb; S → ab generates a^n b^n
• iii.-vi. More expressive power (full CFG, CS, URS)
Some formal languages
The following sets are formal languages:
(i) ab, aabb, aaabbb, … (i.e. n occurrences of a followed by n occurrences of b)
(ii) aa, bb, abba, baab, bbbb, aabbaa, abbbba, … (X followed by the mirror image of X)
(iii) aa, bb, abab, baba, aaaa, bbbb, aabaab, abbabb, … (X followed by X)

These are not finite state languages: it is impossible to construct a finite state grammar that would generate all and only the sentences of these languages.
Why: insufficient means to encode the relevant pattern.
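Recognizers for (i)-(iii) make the memory point concrete: each must compare one unbounded half of the string with the other, which no fixed number of states can do. A Python sketch (function names are mine):

```python
def is_anbn(s):
    """(i): n occurrences of 'a' followed by n occurrences of 'b', n >= 1."""
    half = len(s) // 2
    return (len(s) > 0 and len(s) % 2 == 0
            and s[:half] == "a" * half and s[half:] == "b" * half)

def is_mirror(s):
    """(ii): X followed by the mirror image of X, i.e. an even-length palindrome."""
    return len(s) > 0 and len(s) % 2 == 0 and s == s[::-1]

def is_copy(s):
    """(iii): X followed by X itself."""
    half = len(s) // 2
    return len(s) > 0 and len(s) % 2 == 0 and s[:half] == s[half:]
```

Languages (i) and (ii) are context free; the copy language (iii) is not even context free, which is why it sits still higher in the hierarchy.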
The “language of brackets”
In every well-formed expression the number of opening brackets must equal the number of closing brackets: [^n ]^n
There is NO finite state grammar for the language of brackets. Any attempt needs a separate state for each depth of nesting:

S --[--> O1, O1 --[--> O2, …, On-1 --[--> On
On --]--> On-1, …, O1 --]--> end

To know how many closing brackets you need, you must have remembered the number of opening brackets, but a finite state grammar puts a fixed limit on memory.
A simple bracketing grammar
S → [ S ]
S → [ ]
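The recognizer for this grammar mirrors its two rules one-for-one; the recursive call is the rule S → [ S ] calling S while carrying it out. A minimal Python sketch:

```python
def derives(s):
    """Does the grammar S -> [ S ] | [ ] derive the string s?"""
    if s == "[]":
        return True              # rule S -> [ ]
    if len(s) >= 4 and s[0] == "[" and s[-1] == "]":
        return derives(s[1:-1])  # rule S -> [ S ]: recurse on the inside
    return False
```

Two rules suffice where no finite state grammar of any size would do.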
Recursion:
• Calling an instruction while carrying out that instruction
What about human language?
S-within-S structure:

The birds1 [ that the man2 [ I3 was3 watching ] is2 listening to ] are1 arriving [ when … ]

Each indexed subject depends on its equally indexed verb across the intervening embedded clause.
An informal proof (Chomsky 1957)

I. Let S1, S2, S3, … be declarative sentences in English. Then the following are sentences too:
(i) If S1, then S2
(ii) Either S3 or S4
(iii) The man who said that S5 is arriving today

Interdependencies: if … then, either … or, man … is.
Interdependencies can be embedded: If, either (iii), or S4, then S5.
Hence: a + S1 + b, with S1 = c + S2 + d.
This pattern reflects the mirror properties of (ii).
Hence: no theory of linguistic structure based exclusively on Markov process models and the like will adequately reflect the competence of a speaker of English.
Result
• The finite state model is inadequate as a model of natural language.
• This does not just hold for FS grammars proposed so far, but extends to any possible grammar in the set of FS grammars
• This was a result of a new kind.
Interface adequacy
FS grammars and levels of adequacy
• Observational adequacy
• Descriptive/Interface adequacy
• Explanatory adequacy
• ….
Issues of language design
• The round square dog barked (at the lazy moon)
S1 --The--> S2
S2 --round--> S2
S2 --square--> S3
S3 --dog--> S4
S4 --barked--> end
What’s wrong?
What’s wrong with mere sequentiality?
• A grammar should encode expressions in such a way that the interface with the interpretive system can read them and use them for further processing
• A grammar should encode dependencies between expressions in such a way that the interface with the interpretive system can read them and use them for further processing
• [[The round square dog]SU [barked]Pred ]Sentence
A more powerful type of grammar

Types of phrase structure grammars are defined by different restrictions on rule types. A standard context free (CF) phrase structure grammar expresses hierarchical structure and uses rules of the general form:

A → BC (where B and C range over the categories of the grammar, including A)
A → a (where a ranges over the terminal vocabulary (lexicon) of the language)

(i) Sentence → NP + VP
(ii) NP → D + N
(iii) VP → Verb + NP
(iv) D → the
(v) N → man, ball, …
(vi) Verb → hit, took, …
Structure
Sentence
  NP
    D: the
    N: man
  VP
    Verb: hit
    NP
      D: the
      N: ball
The dependencies between the subparts of natural language expressions are best captured in terms of a hierarchical structure. Tree diagrams represent:
1. The hierarchical grouping of parts of the sentence into constituents
2. The grammatical type of each constituent
3. The left-to-right order of the constituents
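Rules (i)-(vi) can be turned into a toy recursive-descent parser that returns the constituent structure of the tree above. The sketch below is my own illustration, not part of the slides:

```python
# The lexicon from rules (iv)-(vi):
LEXICON = {"D": {"the"}, "N": {"man", "ball"}, "Verb": {"hit", "took"}}

def parse_np(words, i):
    """Rule (ii): NP -> D + N. Returns (subtree, next position) or (None, i)."""
    if i + 1 < len(words) and words[i] in LEXICON["D"] and words[i + 1] in LEXICON["N"]:
        return ("NP", ("D", words[i]), ("N", words[i + 1])), i + 2
    return None, i

def parse_sentence(words):
    """Rules (i) and (iii): Sentence -> NP + VP; VP -> Verb + NP."""
    subject, i = parse_np(words, 0)
    if subject and i < len(words) and words[i] in LEXICON["Verb"]:
        obj, j = parse_np(words, i + 1)
        if obj and j == len(words):
            return ("Sentence", subject, ("VP", ("Verb", words[i]), obj))
    return None    # the string is not generated by the grammar
```

Calling parse_sentence("the man hit the ball".split()) returns nested tuples mirroring the tree diagram; strings the grammar does not generate come back as None.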
Hierarchical structure
Constituent structure and interface adequacy
An analysis of the sentence into contiguous subparts such that:
• The subparts serve as the input for the computation of dependencies
• The subparts are readable to the interface(s)
Constituent structure guidelines
• Dislocation moves constituents
• Substitution observes constituency
• Dependencies obtain between constituents
– Semantic role assignment
– Case assignment
Issues in Learnability
Universal Grammar (UG)
A Hypothesis?
The acquisition schema
From the initial state to the final (adult) state, each data item Di may trigger a change of state:

S0 --D1--> S1 --D2--> S2 -- … --> Si -- … --> Sn --Dn+1--> Sn --Dn+2--> Sn -- …

• Adult state: convergence; input causes no more changes.
• Question: How is this possible?
A simplified version
Questions:
What does a person who knows a language know? Quite a lot…
What does a person who knows a language minimally know?
A person who knows a language minimally knows which strings of words are well-formed sentences and which are not.
Question: What is the size of the set of sentences of a language?
• Principled answer: infinite; there is no longest sentence.
• Practical answer: astronomical, even if one limits oneself to sentences that are not overly long.
• Conjecture: the number of well-formed English sentences of 20 words or fewer is 10^20 (Levelt 1967).
At six seconds per sentence, it would take 19,000,000,000,000 years to hear and say them all.
After six years of non-stop listening, the percentage of sentences heard is at most 0.000000000031%.
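Levelt's back-of-the-envelope figures can be reproduced in a few lines (a sketch assuming 365-day years; the slide's numbers come out at the same order of magnitude):

```python
SENTENCES = 10 ** 20            # conjectured number of English sentences of <= 20 words
SECONDS_EACH = 6                # time to hear or say one sentence
SECONDS_PER_YEAR = 60 * 60 * 24 * 365

# Time to hear and say them all: about 1.9e13 years, i.e. 19,000,000,000,000 years.
years_needed = SENTENCES * SECONDS_EACH / SECONDS_PER_YEAR

# Six years of non-stop listening covers only a vanishing fraction.
heard = 6 * SECONDS_PER_YEAR // SECONDS_EACH
percentage_heard = 100 * heard / SENTENCES   # about 3.2e-11 percent
```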
Size
1, 2, 3, 4, 5, 6, 7, …
2, 4, 6, 8, 10, …
1, 5, 25, 125, 625, 3125, …
0, 2, 6, 12, 20, 30, … (1×0, 2×1, 3×2, 4×3, …)
Abstract task: continue a series 1
1, 5, 11, 19, 29, …

Regularity 1: (1×2)-1, (2×3)-1, …, i.e. a(a+1)-1: (6×7)-1 = 41, (7×8)-1 = 55

Alternative (based on the prime number series): 1, (2,3), 5, (7), 11, (13,17), 19, (23), 29, (31,37), 41, (43), 47, (53,59), 61, …
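Regularity 1 is easy to check mechanically; the Python sketch below (naming is mine) computes the continuation under that rule. The prime-based alternative fits the same initial data, which is the point: a finite sample never fixes the rule.

```python
def regularity1(a):
    """The a-th member of the series under Regularity 1: a(a+1) - 1."""
    return a * (a + 1) - 1

first_five = [regularity1(a) for a in range(1, 6)]   # the given data: 1, 5, 11, 19, 29
continuation = [regularity1(6), regularity1(7)]      # 41 and 55 under this rule
```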
Continue a series 2
The abstract version
• I. Consider an infinite set of which a finite subset is given. Determine the full set on the basis of this subset.
• II. Consider an infinite series of elements (e.g., the natural numbers). Determine, given a finite beginning of the series, how it continues.
• III. Fundamental truth: such tasks have an infinity of solutions.
How comes it that human beings, whose contacts with the world are brief and personal and limited, are nevertheless able to know as much as we do know? (Chomsky 1986)
Plato’s Problem
Tasks of the sort “Complete the series” are impossible in their generality.
They may be possible if the type of regularity is given in advance.
For instance:
(i) There is a constant difference between a member of the series and its successor.
(ii) For each member in the series, its successor is computed by multiplying it by a constant factor.
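Both regularity types can be stated as executable checks (a Python sketch, names mine): once the type of regularity is given in advance, a finite sample does determine the rule.

```python
def constant_difference(series):
    """Regularity (i): one constant difference between successive members."""
    return len({b - a for a, b in zip(series, series[1:])}) == 1

def constant_factor(series):
    """Regularity (ii): each successor is the previous member times one constant.

    Assumes no zero members (division by a member of the series).
    """
    return len({b / a for a, b in zip(series, series[1:])}) == 1
```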
Restricting the hypothesis space
Complicating factors
• No negative evidence
– No systematic corrections
– Resistance to correction
• Non-homogeneity
– Errors
– Idiolectal variation
• Yet the child finds her way
Insufficient
• Analogy
• Motherese
• “The data is so much richer”
• Restricted window

These may be true, but they make the learning task harder instead of easier.
⇒ Substantive further restrictions on the hypothesis space are required (the focus of current research)
Classical example (McNeill 1966)

Child: Nobody don't like me
Mother: No, say “nobody likes me”
Child: Nobody don't like me
(this goes on 8 times)
Finally:
Mother: No, now listen carefully; say “nobody likes me”
Child: Oh, nobody don't likes me
Analogy – Negative evidence
(1) a. The members of the audience will stand
    b. Will the members of the audience stand?
(2) a. Mary has lived in Princeton
    b. Has Mary lived in Princeton?
(1') a. The members of the audience who have been enjoying themselves will stand
     b. *Have the members of the audience who - been enjoying themselves will stand?
     c. Will the members of the audience who have been enjoying themselves - stand?
Analogy versus structure
Minimal condition on operations in natural language
• Structure dependence
• Types of impossible operations:
- Mirroring a string
- Move the 25th word
- etc.
Puzzling observation
It is surprising that so many researchers of human learning have such a hard time accepting the following truth:
• Learning a recursive step from presentation alone, without restrictions on the hypothesis space, is as impossible as building a perpetuum mobile.