Context-Free Languageshs/teach/13w/intro/pdf/08syn.pdf · CIS, LMU, 2013-11-25 Slides based on RPI...
Transcript of Context-Free Languageshs/teach/13w/intro/pdf/08syn.pdf · CIS, LMU, 2013-11-25 Slides based on RPI...
Context-Free Languages Hinrich Schütze
CIS, LMU, 2013-11-25 Slides based on RPI CSCI 2400
Thanks to Costas Busch
Take-away Definition context-free grammar Definition context-free language Derivation, sentential form, sentence Derivation trees Ambiguity Context-free grammars for natural language
2
Terminology For our purposes in this class: Context free grammar = Constituency grammar = Phrase structure grammar
3
4
Grammars Grammars express languages Example: the English language
verbpredicate
nounarticlephrasenoun
predicatephrasenounsentence
→
→
→
_
_
5
walksverb
runsverb
dognoun
catnoun
thearticle
aarticle
→
→
→
→
→
→
6
A derivation of “the dog walks”:
walksdogtheverbdogthe
verbnounthe
verbnounarticle
verbphrasenoun
predicatephrasenounsentence
⇒
⇒
⇒
⇒
⇒
⇒
__
7
A derivation of “a cat runs”:
runscataverbcata
verbnouna
verbnounarticle
verbphrasenoun
predicatephrasenounsentence
⇒
⇒
⇒
⇒
⇒
⇒
__
8
Language of the grammar:
L = { “a cat runs”, “a cat walks”, “the cat runs”, “the cat walks”, “a dog runs”, “a dog walks”, “the dog runs”, “the dog walks” }
9
Notation
dognoun
catnoun
→
→
Variable Terminal
Production Rules
10
Another Example Grammar: Derivation of sentence :
λ→
→
SaSbS
abaSbS ⇒⇒
ab
aSbS→ λ→S
11
Language?
12
aabbaaSbbaSbS ⇒⇒⇒
aSbS→ λ→S
aabb
λ→
→
SaSbSGrammar:
Derivation of sentence :
13
Other derivations:
aaabbbaaaSbbbaaSbbaSbS ⇒⇒⇒⇒
aaaabbbbaaaaSbbbbaaaSbbbaaSbbaSbS
⇒⇒
⇒⇒⇒
14
Language of the grammar
λ→
→
SaSbS
}0:{ ≥= nbaL nn
15
More Notation Grammar ( )PSTVG ,,,=
:V
:T
:S
:P
Set of variables
Set of terminal symbols
Start variable
Set of Production rules
16
Example Grammar :
λ→
→
SaSbSG
( )PSTVG ,,,=
}{SV = },{ baT =
},{ λ→→= SaSbSP
17
More Notation Sentential Form: A sentence that contains variables and terminals Example: aaabbbaaaSbbbaaSbbaSbS ⇒⇒⇒⇒
Sentential Forms sentence
18
We write: Instead of:
aaabbbS*⇒
aaabbbaaaSbbbaaSbbaSbS ⇒⇒⇒⇒
19
In general we write: If:
nww*
1 ⇒
nwwww ⇒⇒⇒⇒ 321
20
By default:
ww*⇒
21
Example
λ→
→
SaSbS
aaabbbS
aabbS
abS
S
*
*
*
*
⇒
⇒
⇒
⇒λ
Grammar Derivations
22
baaaaaSbbbbaaSbb
aaSbbS
∗
∗
⇒
⇒λ→
→
SaSbS
Grammar
Example
Derivations
23
Another Grammar Example Grammar :
λ→
→
→
AaAbAAbSG
24
Language?
25
Grammar :
λ→
→
→
AaAbAAbS
Derivations:
€
S→ Ab→ bS→ Ab→ aAbb→ abbS→ Ab→ aAbb→ aaAbbb→ aabbb
G
26
More Derivations
aaaabbbbbaaaaAbbbbbaaaAbbbbaaAbbbaAbbAbS
⇒⇒
⇒⇒⇒⇒
bbaS
bbbaaaaaabbbbS
aaaabbbbbS
nn∗
∗
∗
⇒
⇒
⇒
27
Language of a Grammar For a grammar with start variable :
GS
}:{)( wSwGL∗⇒=
String of terminals
28
Example For grammar :
λ→
→
→
AaAbAAbS
}0:{)( ≥= nbbaGL nn
Since: bbaS nn∗⇒
G
29
A Convenient Notation
λ→
→
AaAbA
λ|aAbA→
thearticleaarticle
→
→ theaarticle |→
30
Revisit first grammar A context-free grammar :
λ→
→
SaSbS
aabbaaSbbaSbS ⇒⇒⇒
G
A derivation:
31
A context-free grammar :
λ→
→
SaSbS
aaabbbaaaSbbbaaSbbaSbS ⇒⇒⇒⇒
G
Another derivation:
32
λ→
→
SaSbS
=)(GL
(((( ))))
}0:{ ≥nba nn
Describes parentheses:
33
λ→
→
→
SbSbSaSaSA context-free grammar : G
Example
34
Language?
35
λ→
→
→
SbSbSaSaS
abaabaabaSabaabSbaaSaS ⇒⇒⇒⇒
A context-free grammar : G
Another derivation:
36
λ→
→
→
SbSbSaSaS
=)(GL }*},{:{ bawwwR ∈
37
λ→
→
→
SSSSaSbSA context-free grammar : G
Example
38
Language?
39
λ→
→
→
SSSSaSbS
abababaSbabSaSbSSSS ⇒⇒⇒⇒⇒
A context-free grammar : G
Two derivations:
ababSaSbSSSS ⇒⇒⇒⇒
40
λ→
→
→
SSSSaSbS
}prefixanyin )()( and
),()(:{
vvnvn
wnwnw
ba
ba≥
==)(GL
Interpretation?
41
λ→
→
→
SSSSaSbS
}prefixanyin )()( and
),()(:{
vvnvn
wnwnw
ba
ba≥
=
() ((( ))) (( ))
=)(GL
Describes matched parentheses:
42
Definition: Context-Free Grammars
Grammar
Productions of the form: xA→String of variables and terminals
),,,( PSTVG =
Variables Terminal symbols
Start variable
Variable
43
*},:{)(*
TwwSwGL ∈⇒=
),,,( PSTVG =
44
Definition: Context-Free Languages A language is context-free if and only if there is a context-free grammar with
L
G)(GLL =
45
Derivation Order ABS→.1
λ→
→
AaaAA
.3
.2λ→
→
BBbB
.5
.4
aabaaBbaaBaaABABS54321⇒⇒⇒⇒⇒
Leftmost derivation:
aabaaAbAbABbABS32541⇒⇒⇒⇒⇒
Rightmost derivation:
46
Language?
47
λ|ABbBbAaABS
→
→
→
Leftmost derivation:
abbbbabbbbBabbBbbBabAbBabBbBaABS
⇒⇒
⇒⇒⇒⇒
Rightmost derivation:
abbbbabbBbbabAbabBbaAaABS
⇒⇒
⇒⇒⇒⇒
48
Language?
49
Derivation Trees
50
ABS⇒
ABS→ λ|aaAA→ λ|BbB→
S
BA
51
ABS→ λ|aaAA→ λ|BbB→
aaABABS ⇒⇒
a a A
S
BA
52
ABS→ λ|aaAA→ λ|BbB→
aaABbaaABABS ⇒⇒⇒S
BA
a a A B b
53
ABS→ λ|aaAA→ λ|BbB→
aaBbaaABbaaABABS ⇒⇒⇒⇒S
BA
a a A B b
λ
54
ABS→ λ|aaAA→ λ|BbB→
aabaaBbaaABbaaABABS ⇒⇒⇒⇒⇒S
BA
a a A B b
λ λ
Derivation Tree
55
aabaaBbaaABbaaABABS ⇒⇒⇒⇒⇒
yield
aabbaa
=
λλ
S
BA
a a A B b
λ λ
Derivation Tree
ABS→ λ|aaAA→ λ|BbB→
56
Partial Derivation Trees
ABS⇒
S
BA
Partial derivation tree
ABS→ λ|aaAA→ λ|BbB→
57
aaABABS ⇒⇒
S
BA
a a A
Partial derivation tree
58
aaABABS ⇒⇒
S
BA
a a A
Partial derivation tree
sentential form
yield aaAB
59
aabaaBbaaBaaABABS ⇒⇒⇒⇒⇒
aabaaAbAbABbABS ⇒⇒⇒⇒⇒S
BA
a a A B b
λ λ
Same derivation tree
Sometimes, derivation order doesn’t matter Leftmost:
Rightmost:
60
Ambiguity
61
aEEEEEE |)(|| ∗+→
aaa ∗+
E
EE
EE
+
a
a a
∗
aaaEaaEEaEaEEE*+⇒∗+⇒
∗+⇒+⇒+⇒
leftmost derivation
62
aEEEEEE |)(|| ∗+→
aaa ∗+
E
EE
+
a a
∗
EE a
aaaEaaEEaEEEEEE
∗+⇒∗+⇒
∗+⇒∗+⇒∗⇒
leftmost derivation
63
aEEEEEE |)(|| ∗+→
aaa ∗+
E
EE
+
a a
∗
EE a
E
EE
EE
+
a
a a
∗
Two derivation trees
64
The grammar aEEEEEE |)(|| ∗+→
is ambiguous:
E
EE
+
a a
∗
EE a
E
EE
EE
+
a
a a
∗
string aaa ∗+ has two derivation trees
65
string aaa ∗+ has two leftmost derivations
aaaEaaEEaEEEEEE
∗+⇒∗+⇒
∗+⇒∗+⇒∗⇒
aaaEaaEEaEaEEE*+⇒∗+⇒
∗+⇒+⇒+⇒
The grammar aEEEEEE |)(|| ∗+→
is ambiguous:
66
Definition:
A context-free grammar is ambiguous if some string has: two or more derivation trees
G
)(GLw∈
67
In other words:
A context-free grammar is ambiguous if some string has: two or more leftmost derivations
G
)(GLw∈
(or rightmost)
68
Why do we care about ambiguity?
E
EE
+
a a
∗
EE a
E
EE
EE
+
a
a a
∗
aaa ∗+
take 2=a
69
E
EE
+
∗
EE
E
EE
EE
+
∗
222 ∗+
2
2 2 2 2
2
70
E
EE
+
∗
EE
E
EE
EE
+
∗
6222 =∗+
2
2 2 2 2
2
8222 =∗+
4
2 2
2
6
2 2
24
8
71
E
EE
EE
+
∗
6222 =∗+
2
2 2
4
2 2
2
6
Correct result:
72
• We want to remove ambiguity
• Ambiguity is bad for programming languages
73
We fix the ambiguous grammar:
aEEEEEE |)(|| ∗+→
New non-ambiguous grammar:
aFEFFTFTT
TETEE
→
→
→
∗→
→
+→
)(
74
aFEFFTFTT
TETEE
→
→
→
∗→
→
+→
)(
aaaFaaFFaFTaTaTFTTTEE
∗+⇒∗+⇒∗+⇒
∗+⇒+⇒+⇒+⇒+⇒
E
E T+
T ∗ F
F
a
T
F
a
a
aaa ∗+
75
E
E T+
T ∗ F
F
a
T
F
a
a
aaa ∗+
Unique derivation tree
76
The grammar :
aFEFFTFTT
TETEE
→
→
→
∗→
→
+→
)(
is non-ambiguous:
Every string has a unique derivation tree
G
)(GLw∈
77
Another Ambiguous Grammar
IF_STMT if EXPR then STMT →| if EXPR then STMT else STMT
Ambiguity?
78
If expr1 then if expr2 then stmt1 else stmt2
IF_STMT
expr1 then
else if expr2 then
STMT
stmt1
if
IF_STMT
expr1 then else
if expr2 then
STMT stmt2 if
stmt1
stmt2
79
Inherent Ambiguity Some context free languages have only ambiguous grammars
Example: }{}{ mmnmnn cbacbaL ∪=
λ||11
aAbAAcSS
→
→
λ||22
bBcBBaSS
→
→21 | SSS→
80
The string nnn cba
has two derivation trees
S
1S
S
2S
1S c 2Sa
81
Ambiguity in natural language?
Take-away Definition context-free grammar Definition context-free language Derivation, sentential form, sentence Derivation trees Ambiguity Context-free grammars for natural language
82