A shortened version from: Anastasia Berdnikova & Denis Miretskiy.
Transformational grammars
A shortened version from: Anastasia Berdnikova & Denis Miretskiy
Introduction
- ‘Colourless green ideas sleep furiously.’
- Chomsky constructed finite formal machines: ‘grammars’.
- The question ‘Does the language contain this sentence?’ (intractable) is replaced by ‘Can the grammar create this sentence?’ (can be answered).
- Transformational grammars (TG) are sometimes called generative grammars.
Definition
- TG = ( {symbols}, {rewriting rules α → β, called productions} )
- {symbols} = {nonterminals} ∪ {terminals}
- α contains at least one nonterminal; β consists of terminals and/or nonterminals.
- Example: S → aS, S → bS, S → e (abbreviated S → aS | bS | e), where e denotes the null string.
- Derivation: S ⇒ aS ⇒ abS ⇒ abbS ⇒ abb.
- Parse tree: the root is the start nonterminal S, the leaves are the terminal symbols of the sequence, and the internal nodes are nonterminals.
- The children of an internal node are the symbols on the right-hand side of the production applied to it.
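To make the rewriting process concrete, here is a minimal Python sketch that enumerates the terminal strings derivable from the example grammar S → aS | bS | e. The names `PRODUCTIONS` and `derive` are illustrative, the empty Python string stands in for e, and the length cut-off is an assumption added only so that the enumeration terminates.

```python
from collections import deque

# Example grammar from the slide: S -> aS | bS | e ("" stands for the null string e).
PRODUCTIONS = {"S": ["aS", "bS", ""]}


def derive(max_len):
    """Return every terminal string with at most max_len symbols derivable from S."""
    results = []
    queue = deque(["S"])
    while queue:
        form = queue.popleft()
        # Position of the leftmost nonterminal in the current sentential form, if any.
        index = next((k for k, ch in enumerate(form) if ch in PRODUCTIONS), None)
        if index is None:
            results.append(form)  # only terminals are left: derivation finished
            continue
        for rhs in PRODUCTIONS[form[index]]:
            new_form = form[:index] + rhs + form[index + 1:]
            terminals = sum(ch not in PRODUCTIONS for ch in new_form)
            if terminals <= max_len:  # assumed cut-off, not part of the grammar
                queue.append(new_form)
    return results


if __name__ == "__main__":
    print(derive(2))
```

With this grammar every string over {a, b} is reachable, including abb from the derivation on the slide.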
The Chomsky hierarchy
- Notation: W is a nonterminal, a is a terminal, α and γ are strings of nonterminals and/or terminals including the null string, and β is the same but not including the null string.
- regular grammars: W → aW or W → a
- context-free grammars: W → β
- context-sensitive grammars: α1Wα2 → α1βα2 (productions such as AB → BA are also allowed)
- unrestricted (phrase structure) grammars: α1Wα2 → γ
[Figure: The Chomsky hierarchy]
Automata
- Each grammar has a corresponding abstract computational device: an automaton.
- Grammars are generative models; automata are parsers that accept or reject a given sequence.
- Automata are often easier to describe and understand than their equivalent grammars.
- Automata give a more concrete idea of how we might recognise a sequence using a formal grammar.
Parser abstractions associated with the hierarchy of grammars

| Grammar | Parsing automaton |
|---|---|
| regular grammars | finite state automaton |
| context-free grammars | push-down automaton |
| context-sensitive grammars | linear bounded automaton |
| unrestricted grammars | Turing machine |
Regular grammars
- Productions: W → aW or W → a; sometimes W → e is also allowed.
- RG generate a sequence from left to right (or from right to left: W → Wa or W → a).
- RG cannot describe long-range correlations between the terminal symbols (‘primary sequence’).
An odd regular grammar
An example of a regular grammar that generates only strings of a's and b's that have an odd number of a's:
- start from S,
- S → aT | bS,
- T → aS | bT | e.
Finite state automata
- The automaton reads one symbol at a time from an input string.
- If the symbol is accepted, the automaton enters a new state.
- If the symbol is not accepted, the automaton halts and rejects the string.
- If the automaton reaches a final ‘accepting’ state, the input string has been successfully recognised and parsed by the automaton.
- {states, state transitions of the FSA} correspond to {nonterminals, productions of the corresponding grammar}.
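The correspondence between states and nonterminals can be sketched in Python using the regular grammar S → aT | bS, T → aS | bT | e (strings with an odd number of a's). The table `TRANSITIONS`, the set `ACCEPTING` and the function `fsa_accepts` are illustrative names; treating T as the accepting state is an assumption that follows from T being the only nonterminal with the production T → e.

```python
# One state per nonterminal, one transition per production W -> aW'.
TRANSITIONS = {
    ("S", "a"): "T",  # S -> aT
    ("S", "b"): "S",  # S -> bS
    ("T", "a"): "S",  # T -> aS
    ("T", "b"): "T",  # T -> bT
}
ACCEPTING = {"T"}  # T -> e lets the derivation stop in T


def fsa_accepts(string, start="S"):
    """Read one symbol at a time; halt and reject on a symbol with no transition."""
    state = start
    for symbol in string:
        key = (state, symbol)
        if key not in TRANSITIONS:
            return False  # symbol not accepted: the automaton halts
        state = TRANSITIONS[key]
    return state in ACCEPTING


if __name__ == "__main__":
    for candidate in ["a", "bab", "aa", "abc"]:
        print(candidate, fsa_accepts(candidate))
```

The automaton needs only two states because all it has to remember is whether the number of a's read so far is odd or even.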
What a regular grammar can’t do
RG cannot describe a language L when:
- L contains all the strings of the form aa, bb, abba, baab, abaaba, etc. (a palindrome language);
- L contains all the strings of the form aa, abab, aabaab, etc. (a copy language).
- Regular language: a b a a a b
- Palindrome language: a a b b a a
- Copy language: a a b a a b
Palindrome and copy languages have correlations between distant positions.
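The distant correlations can be written down explicitly. The following Python sketch lists which positions must carry the same symbol in each language; the function names are illustrative, and restricting both languages to even-length strings is an assumption based on the examples given (aa, abba, abab, aabaab).

```python
def palindrome_pairs(n):
    """Nested pairs: position i must match position n - 1 - i."""
    return [(i, n - 1 - i) for i in range(n // 2)]


def copy_pairs(n):
    """Crossing pairs: position i must match position i + n/2."""
    return [(i, i + n // 2) for i in range(n // 2)]


def satisfies(string, pairs):
    """Check that every paired position carries the same symbol."""
    return all(string[i] == string[j] for i, j in pairs)


def in_palindrome_language(string):
    n = len(string)
    return n > 0 and n % 2 == 0 and satisfies(string, palindrome_pairs(n))


def in_copy_language(string):
    n = len(string)
    return n > 0 and n % 2 == 0 and satisfies(string, copy_pairs(n))


if __name__ == "__main__":
    print(palindrome_pairs(6), in_palindrome_language("aabbaa"))
    print(copy_pairs(6), in_copy_language("aabaab"))
```

The palindrome pairs are nested inside one another, while the copy pairs cross; that difference is what separates the two languages in the hierarchy.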
Context-free grammars
- The reason for going beyond regular grammars: RNA secondary structure is a kind of palindrome language.
- Context-free grammars (CFG) permit additional rules that allow the grammar to create nested, long-distance pairwise correlations between terminal symbols.
- Example grammar: S → aSa | bSb | aa | bb
- Derivation: S ⇒ aSa ⇒ aaSaa ⇒ aabSbaa ⇒ aabaabaa
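As an illustration of how the palindrome grammar S → aSa | bSb | aa | bb builds a string from the outside in, the following Python sketch reconstructs the derivation of a given target string. The function name `derivation` is illustrative, and returning `None` for strings outside the language is an assumed convention.

```python
def derivation(target):
    """Return the sentential forms deriving target from S, or None if impossible."""
    steps = ["S"]
    left, right = "", ""  # terminals already generated on each side of S
    rest = target         # the part that S still has to generate
    while True:
        if rest in ("aa", "bb"):
            # Terminating productions S -> aa | bb.
            steps.append(left + rest + right)
            return steps
        if len(rest) >= 4 and rest[0] == rest[-1] and rest[0] in "ab":
            # Productions S -> aSa | bSb emit one matching pair of terminals.
            left += rest[0]
            right = rest[-1] + right
            rest = rest[1:-1]
            steps.append(left + "S" + right)
        else:
            return None


if __name__ == "__main__":
    print(" => ".join(derivation("aabaabaa")))
```

Each application of S → aSa or S → bSb fixes one pair of correlated positions, which is why the pairs end up nested.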
Push-down automata
- The parsing automaton for CFGs is called a push-down automaton.
- A limited number of symbols are kept in a push-down stack.
- A push-down automaton parses a sequence from left to right according to the algorithm below.
- The stack is initialised by pushing the start nonterminal onto it.
- The steps are iterated until no input symbols remain.
- If the stack is empty at the end, the sequence has been successfully parsed.
Algorithm: Parsing with a push-down automaton
Pop a symbol off the stack.
- If the popped symbol is a nonterminal:
  - Peek ahead in the input from the current position and choose a valid production for the nonterminal. If there is no valid production, terminate and reject the sequence.
  - Push the right side of the chosen production rule onto the stack, rightmost symbols first.
- If the popped symbol is a terminal:
  - Compare it to the current symbol of the input. If it matches, move the automaton to the right on the input (the input symbol is accepted). If it does not match, terminate and reject the sequence.
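The steps above can be sketched in Python for the palindrome grammar S → aSa | bSb | aa | bb. One point is an assumption of this sketch rather than part of the algorithm as stated: with this grammar a single symbol of look-ahead cannot tell S → aSa from S → aa, so the code tries each production whose first symbol matches the look-ahead and backtracks if the choice fails. The names `GRAMMAR` and `pda_accepts` are illustrative.

```python
# Palindrome grammar from the context-free example (hypothetical encoding).
GRAMMAR = {"S": ["aSa", "bSb", "aa", "bb"]}


def pda_accepts(sequence, grammar=GRAMMAR, start="S"):
    """Parse left to right with an explicit stack; True if the sequence is accepted."""

    def run(stack, pos):
        if not stack:
            # Empty stack: success only if no input symbols remain.
            return pos == len(sequence)
        symbol, rest = stack[-1], stack[:-1]  # pop a symbol off the stack
        if symbol in grammar:
            # Nonterminal: peek ahead and try the productions that could match.
            for rhs in grammar[symbol]:
                if pos < len(sequence) and rhs[0] == sequence[pos]:
                    # Push the right side, rightmost symbols first.
                    if run(rest + list(reversed(rhs)), pos):
                        return True
            return False  # no valid production: reject (or backtrack)
        # Terminal: it must match the current input symbol.
        if pos < len(sequence) and symbol == sequence[pos]:
            return run(rest, pos + 1)  # move right on the input
        return False

    return run([start], 0)


if __name__ == "__main__":
    for candidate in ["aabaabaa", "abba", "abab"]:
        print(candidate, pda_accepts(candidate))
```

For a grammar in which the look-ahead symbol always determines the production, the loop over candidate productions would never need to backtrack and the sketch would reduce to the deterministic procedure described above.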
Context-sensitive grammars
Copy language: cc, acca, agaccaga, etc.
- initialisation: S → CW
- nonterminal generation: W → AÂW | GĜW | C
- nonterminal reordering: ÂG → GÂ, ÂA → AÂ, ĜA → AĜ, ĜG → GĜ
- terminal generation: CA → aC, CG → gC, ÂC → Ca, ĜC → Cg
- termination: CC → cc
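A short Python sketch can replay one derivation in this grammar, here for the string acca. It assumes the productions read S → CW, W → AÂW | GĜW | C, CA → aC, ÂC → Ca and CC → cc; the step list `DERIVATION` and the function `replay` are illustrative names, and each step rewrites the leftmost occurrence of its left-hand side.

```python
# The hatted nonterminals are written as escapes so the code does not depend
# on how the accented characters are encoded in the source file.
A_HAT = "\u00c2"  # the nonterminal written as A with a circumflex

# One possible derivation of "acca" (chosen for illustration).
DERIVATION = [
    ("S", "CW"),               # initialisation
    ("W", "A" + A_HAT + "W"),  # nonterminal generation
    ("W", "C"),                # nonterminal generation (end marker)
    ("CA", "aC"),              # terminal generation, left copy
    (A_HAT + "C", "Ca"),       # terminal generation, right copy
    ("CC", "cc"),              # termination
]


def replay(start, steps):
    """Apply each production to the leftmost match and return all sentential forms."""
    forms = [start]
    for lhs, rhs in steps:
        current = forms[-1]
        if lhs not in current:
            raise ValueError(f"production {lhs} -> {rhs} does not apply to {current}")
        forms.append(current.replace(lhs, rhs, 1))
    return forms


if __name__ == "__main__":
    print(" => ".join(replay("S", DERIVATION)))
```

Unlike a context-free production, a rule such as ÂC → Ca can only fire when its left-hand symbols are adjacent, which is how the grammar carries information across the string.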
Linear bounded automaton
- A mechanism for working backwards through all possible derivations: either the start symbol is reached, or no valid derivation is found.
- There is a finite number of possible derivations to examine.
- Abstractly: a ‘tape’ of linear memory and a read/write head.
- However, the number of possible derivations is exponentially large.
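The idea of working backwards can be sketched as an exhaustive search in Python. The sketch assumes the copy-language productions S → CW, W → AÂW | GĜW | C, ÂG → GÂ, ÂA → AÂ, ĜA → AĜ, ĜG → GĜ, CA → aC, CG → gC, ÂC → Ca, ĜC → Cg and CC → cc. Because no right-hand side is shorter than its left-hand side, undoing a production never lengthens the string, so only finitely many forms can be visited. The names `RULES`, `predecessors` and `derivable` are illustrative, and this is a brute-force search, not an efficient parser.

```python
from collections import deque

A_HAT, G_HAT = "\u00c2", "\u011c"  # the hatted nonterminals, written as escapes

# (left-hand side, right-hand side) pairs of the assumed copy-language grammar.
RULES = [
    ("S", "CW"),
    ("W", "A" + A_HAT + "W"), ("W", "G" + G_HAT + "W"), ("W", "C"),
    (A_HAT + "G", "G" + A_HAT), (A_HAT + "A", "A" + A_HAT),
    (G_HAT + "A", "A" + G_HAT), (G_HAT + "G", "G" + G_HAT),
    ("CA", "aC"), ("CG", "gC"), (A_HAT + "C", "Ca"), (G_HAT + "C", "Cg"),
    ("CC", "cc"),
]


def predecessors(form):
    """Yield every form that turns into `form` by applying one production."""
    for lhs, rhs in RULES:
        start = form.find(rhs)
        while start != -1:
            yield form[:start] + lhs + form[start + len(rhs):]
            start = form.find(rhs, start + 1)


def derivable(target):
    """Search backwards from target; True if the start symbol S is reached."""
    seen = {target}
    queue = deque([target])
    while queue:
        form = queue.popleft()
        if form == "S":
            return True
        for previous in predecessors(form):
            if previous not in seen:
                seen.add(previous)
                queue.append(previous)
    return False  # every reachable form was examined without reaching S


if __name__ == "__main__":
    for candidate in ["cc", "acca", "acac"]:
        print(candidate, derivable(candidate))
```

The set of visited forms is bounded by the length of the input, which mirrors the linear memory of the automaton, but it can still grow exponentially with that length.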
NP problems and ‘intractability’
- Nondeterministic polynomial (NP) problems: a proposed solution can be checked for correctness in polynomial time, but for the hardest of these problems no polynomial-time algorithm for finding a solution is known.
- NP-complete problems are a subclass of NP problems: a polynomial-time algorithm that solves one NP-complete problem would solve all of them.
- Parsing with general context-sensitive grammars is intractable in this sense (the membership problem is at least as hard as any NP problem), whereas context-free grammar parsing can be done in polynomial time.
Unrestricted grammars and Turing machines
- The left and right sides of the production rules can be any combinations of symbols.
- The parsing automaton is a Turing machine.
- There is no general algorithm that is guaranteed to determine in finite time whether a string has a valid derivation.