Transformational grammars
Transcript of Transformational grammars
Anastasia Berdnikova
&
Denis Miretskiy
Overview
Transformational grammars – definition
Regular grammars
Context-free grammars
Context-sensitive grammars
Break
Stochastic grammars
Stochastic context-free grammars for sequence modelling
Why transformational grammars?
The 3-dimensional folding of proteins and nucleic acids
Extensive physical interactions between residues
Chomsky hierarchy of transformational grammars [Chomsky 1956; 1959]
Application to molecular biology [Searls 1992; Dong & Searls 1994; Rosenblueth et al. 1996]
Introduction
‘Colourless green ideas sleep furiously.’
Chomsky constructed finite formal machines – ‘grammars’.
‘Does the language contain this sentence?’ (intractable)
‘Can the grammar create this sentence?’ (can be answered)
Transformational grammars (TG) are sometimes called generative grammars.
Definition
TG = ( {symbols}, {rewriting rules α → β, called productions} )
{symbols} = {nonterminals} ∪ {terminals}
α contains at least one nonterminal; β contains terminals and/or nonterminals.
Example: S → aS, S → bS, S → e (written S → aS | bS | e)
Derivation: S ⇒ aS ⇒ abS ⇒ abbS ⇒ abb
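The derivation above can be mimicked mechanically. A minimal sketch that rewrites S until the e-production (here the empty string) terminates the derivation; the function name and seed are illustrative:

```python
import random

# Leftmost derivation in the regular grammar S -> aS | bS | e
# ('' plays the role of the empty string e).
def derive(rules, start="S", seed=0):
    """Generate one string, recording every derivation step."""
    rng = random.Random(seed)
    s = start
    steps = [s]
    while "S" in s:
        rhs = rng.choice(rules["S"])      # pick one production for S
        s = s.replace("S", rhs, 1)        # rewrite the (only) S
        steps.append(s)
    return s, steps

string, steps = derive({"S": ["aS", "bS", ""]})
# steps records a chain like S => aS => abS => abbS => abb
```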
The Chomsky hierarchy
W – a nonterminal; a – a terminal; α and γ – strings of nonterminals and/or terminals, including the null string; β – the same, not including the null string.
regular grammars: W → aW or W → a
context-free grammars: W → β
context-sensitive grammars: α1Wα2 → α1βα2 (e.g. AB → BA)
unrestricted (phrase structure) grammars: α1Wα2 → γ
Automata
Each grammar has a corresponding abstract computational device – an automaton.
Grammars are generative models; automata are parsers that accept or reject a given sequence.
- Automata are often easier to describe and understand than their equivalent grammars.
- Automata give a more concrete idea of how we might recognise a sequence using a formal grammar.
Parser abstractions associated with the hierarchy of grammars
----------------------------------------------------------------------
Grammar Parsing automaton
----------------------------------------------------------------------
regular grammars finite state automaton
context-free grammars push-down automaton
context-sensitive grammars linear bounded automaton
unrestricted grammars Turing machine
----------------------------------------------------------------------
Regular grammars
W → aW or W → a (sometimes allowed: W → e)
RG generate a sequence from left to right (or right to left: W → Wa or W → a).
RG cannot describe long-range correlations between the terminal symbols (the ‘primary sequence’).
An odd regular grammar
An example of a regular grammar that generates only strings of as and bs that have an odd number of as:
start from S,
S → aT | bS,
T → aS | bT | e.
Finite state automata
A finite state automaton reads one symbol at a time from an input string.
If the symbol is accepted, the automaton enters a new state.
If the symbol is not accepted, the automaton halts and rejects the string.
If the automaton reaches a final ‘accepting’ state, the input string has been successfully recognised and parsed by the automaton.
{states, state transitions of the FSA} ↔ {nonterminals, productions of the corresponding grammar}
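The odd regular grammar above corresponds directly to a two-state automaton (states for the nonterminals S and T, with T accepting). A minimal sketch:

```python
# FSA equivalent of the odd-a grammar S -> aT | bS, T -> aS | bT | e.
# States correspond to nonterminals; T is the accepting state.
TRANSITIONS = {("S", "a"): "T", ("S", "b"): "S",
               ("T", "a"): "S", ("T", "b"): "T"}

def accepts(string, start="S", accepting=("T",)):
    state = start
    for symbol in string:
        if (state, symbol) not in TRANSITIONS:
            return False                 # symbol not accepted: halt and reject
        state = TRANSITIONS[(state, symbol)]
    return state in accepting            # accept only in a final state
```

Strings with an odd number of a's (e.g. "bab") are accepted; others (e.g. "abba") are rejected.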
FMR-1 triplet repeat region
Human FMR-1 mRNA sequence, fragment
. . . GCG CGG CGG CGG CGG CGG CGG CGG CGG
CGG CGG AGG CGG CGG CGG CGG CGG CGG CGG
CGG CGG AGG CGG CGG CGG CGG CGG CGG CGG
CGG CGG CTG . . .
[Figure: finite state automaton for the repeat region, with states S and 1–8: it emits the initial g c g, loops over (c|a) g g triplets, and ends with c t g before an ε transition to the accepting state.]
Moore vs. Mealy machines
Finite automata that accept on transitions are called Mealy machines.
Finite automata that accept on states are called Moore machines (HMMs are of this type).
The two types of machines are interconvertible: S → gW1 in the Mealy machine corresponds to S → gŴ1, Ŵ1 → gW1 in the Moore machine.
Deterministic vs. nondeterministic automata
In a deterministic finite automaton, no more than one accepting transition is possible for any state and any input symbol.
An example of a nondeterministic finite automaton is the FMR-1 automaton above.
Parsing with a deterministic finite state automaton is extremely efficient (e.g. BLAST).
PROSITE patterns
RU1A_HUMAN  S R S L K M R G Q A F V I F K E V S S A T
SXLF_DROME  K L T G R P R G V A F V R Y N K R E E A Q
ROC_HUMAN   V G C S V H K G F A F V Q Y V N E R N A R
ELAV_DROME  G N D T Q T K G V G F I R F D K R E E A T
RNP-1 motif: [RK] – G – {EDRKHPCG} – [AGSCI] – [FY] – [LIVA] – x – [FYM]
A PROSITE pattern = pattern element – pattern element – ... – pattern element.
In a pattern element, a letter is the single-letter code for an amino acid; [ ] means any one of the enclosed residues can occur; { } means any residue except the enclosed ones can occur; x means any residue can occur at this position.
A regular grammar for PROSITE patterns
S → rW1 | kW1
W1 → gW2
W2 → [afilmnqstvwy]W3
W3 → [agsci]W4
W4 → fW5 | yW5
W5 → lW6 | iW6 | vW6 | aW6
W6 → [acdefghiklmnpqrstvwy]W7
W7 → f | y | m
[ac]W means aW | cW
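Since PROSITE patterns are regular, the same motif can be matched with an ordinary regular expression. A sketch: each [ ] element becomes a character class, each { } element a negated class, and x becomes a wildcard:

```python
import re

# The RNP-1 pattern [RK]-G-{EDRKHPCG}-[AGSCI]-[FY]-[LIVA]-x-[FYM]
# translated to a regular expression.
RNP1 = re.compile(r"[RK]G[^EDRKHPCG][AGSCI][FY][LIVA].[FYM]")

def find_motif(seq):
    """Return the first RNP-1 match in seq, or None."""
    m = RNP1.search(seq)
    return m.group(0) if m else None
```

On the RU1A_HUMAN fragment above, this finds the match RGQAFVIF.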
What a regular grammar can’t do
RG cannot describe a language L when:
L contains all the strings of the form aa, bb, abba, baab, abaaba, etc. (a palindrome language);
L contains all the strings of the form aa, abab, aabaab (a copy language).
Regular language: a b a a a b Palindrome language: a a b b a a
Copy language: a a b a a b
Palindrome and copy languages have correlations between distant positions.
Context-free grammars
The reason: RNA secondary structure is a kind of palindrome language.
The context-free grammars (CFG) permit additional rules that allow the grammar to create nested, long-distance pairwise correlations between terminal symbols.
S → aSa | bSb | aa | bb
S => aSa => aaSaa => aabSbaa => aabaabaa
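A recogniser can follow the palindrome grammar rules directly, with recursion playing the role of the stack introduced below. A minimal sketch:

```python
from functools import lru_cache

# Recogniser for the even palindrome grammar S -> aSa | bSb | aa | bb,
# by direct recursion on the productions.
@lru_cache(maxsize=None)
def derives(s):
    if s in ("aa", "bb"):
        return True                      # S -> aa | bb
    if len(s) >= 4 and s[0] == s[-1] and s[0] in "ab":
        return derives(s[1:-1])          # S -> aSa | bSb: strip matched ends
    return False
```

For instance, the derived string aabaabaa is accepted, while ab is not.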
A context-free grammar for an RNA stem loop
seq 1  C A G G A A A C U G
seq 2  G C U G C A A A G C
seq 3  G C U G C A A C U G
[Figure: the three sequences drawn as folded stem loops, with base-paired stems (G●C, A●U, C●G) and four-residue loops.]
S → aW1u | cW1g | gW1c | uW1a, W1 → aW2u | cW2g | gW2c | uW2a
W2 → aW3u | cW3g | gW3c | uW3a, W3 → gaaa | gcaa
Parse trees
Root – the start nonterminal S; leaves – the terminal symbols of the sequence; internal nodes – nonterminals.
The children of an internal node are given by one of its productions.
Any subtree derives a contiguous segment of the sequence.
[Figure: parse trees for caggaaacug and ggugcaaacc, rooted at S with nonterminals W1–W3 above the paired terminals, shown beside the folded 5’→3’ stem-loop structures.]
Parse tree for a PROSITE pattern
Parse tree for the RNP-1 motif RGQAFVIF.
Regular grammars are linear special cases of the context-free grammars. The parse tree for a regular grammar is a linear alignment of the grammar nonterminals to the sequence terminals.
S
W1
W2
W3
W4
W5
W6
W7
r g q a f v i f
Push-down automata
The parsing automaton for CFGs is called a push-down automaton.
A limited number of symbols are kept in a push-down stack.
A push-down automaton parses a sequence from left to right according to the algorithm on the next slide.
The stack is initialised by pushing the start nonterminal onto it.
The steps are iterated until no input symbols remain.
If the stack is empty at the end, the sequence has been successfully parsed.
Algorithm: Parsing with a push-down automaton
Pop a symbol off the stack.
If the popped symbol is a nonterminal:
- Peek ahead in the input from the current position and choose a valid production for the nonterminal. If there is no valid production, terminate and reject the sequence.
- Push the right side of the chosen production onto the stack, rightmost symbols first.
If the popped symbol is a terminal:
- Compare it to the current symbol of the input. If it matches, move the automaton to the right on the input (the input symbol is accepted). If it does not match, terminate and reject the sequence.
Parsing an RNA stem loop with a push-down automaton
Input string   Stack    Automaton operation on stack and input
GCCGCAAGGC     S        Pop S. Peek at input; produce S → gW1c.
GCCGCAAGGC     gW1c     Pop g. Accept g; move right on input.
GCCGCAAGGC     W1c      Pop W1. Peek at input; produce W1 → cW2g.
GCCGCAAGGC     cW2gc    Pop c. Accept c; move right on input.
GCCGCAAGGC     W2gc     Pop W2. Peek at input; produce W2 → cW3g.
GCCGCAAGGC     cW3ggc   Pop c. Accept c; move right on input.
(several acceptances)
GCCGCAAGGC     c        Pop c. Accept c; move right on input.
GCCGCAAGGC     –        Stack empty. Input string empty. Accept.
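The algorithm and trace above can be sketched as code. A minimal sketch specialised to the stem-loop grammar, choosing productions by peeking at the input (one symbol ahead for the stem nonterminals, two for W3); names and representation are illustrative:

```python
# Push-down parser for the stem-loop grammar:
#   S|W1|W2 -> aXu | cXg | gXc | uXa,  W3 -> gaaa | gcaa
PAIR = {"a": "u", "c": "g", "g": "c", "u": "a"}
NEXT = {"S": "W1", "W1": "W2", "W2": "W3"}

def parse(seq):
    stack, pos = ["S"], 0                    # initialise stack with S
    while stack:
        top = stack.pop()                    # pop a symbol off the stack
        if top in NEXT:                      # stem nonterminal: X -> aYu | ...
            if pos >= len(seq) or seq[pos] not in PAIR:
                return False
            a = seq[pos]
            stack += [PAIR[a], NEXT[top], a]  # push rightmost symbols first
        elif top == "W3":                    # loop: W3 -> gaaa | gcaa
            peek = seq[pos:pos + 2]
            if peek == "ga":
                stack += list("gaaa")[::-1]
            elif peek == "gc":
                stack += list("gcaa")[::-1]
            else:
                return False                 # no valid production: reject
        else:                                # terminal: must match the input
            if pos >= len(seq) or seq[pos] != top:
                return False
            pos += 1                         # accept; move right on input
    return pos == len(seq)                   # empty stack, empty input: accept
```

Running it on gccgcaaggc reproduces the trace above; the slide’s seq 1, caggaaacug, also parses.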
Context-sensitive grammars
Copy language: cc, acca, agaccaga, etc.
initialisation: S → CW
nonterminal generation: W → AÂW | GĜW | C
nonterminal reordering: ÂA → AÂ, ÂG → GÂ, ĜA → AĜ, ĜG → GĜ
terminal generation: CA → aC, CG → gC, ÂC → Ca, ĜC → Cg
termination: CC → cc
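This rewriting machinery can be tested mechanically. A minimal sketch that checks membership by breadth-first search over rewritings, writing X and Y for the hatted nonterminals Â and Ĝ; practical only for short strings:

```python
from collections import deque

# Copy-language grammar; X stands for A-hat, Y for G-hat.
RULES = [
    ("S", "CW"),                                  # initialisation
    ("W", "AXW"), ("W", "GYW"), ("W", "C"),       # nonterminal generation
    ("XA", "AX"), ("XG", "GX"),                   # nonterminal reordering
    ("YA", "AY"), ("YG", "GY"),
    ("CA", "aC"), ("CG", "gC"),                   # terminal generation
    ("XC", "Ca"), ("YC", "Cg"),
    ("CC", "cc"),                                 # termination
]

def derivable(target):
    """BFS over sentential forms; no rule shrinks the string,
    so forms longer than the target can be pruned."""
    seen, queue = {"S"}, deque(["S"])
    while queue:
        s = queue.popleft()
        if s == target:
            return True
        for lhs, rhs in RULES:
            for i in range(len(s) - len(lhs) + 1):
                if s[i:i + len(lhs)] == lhs:
                    new = s[:i] + rhs + s[i + len(lhs):]
                    if len(new) <= len(target) and new not in seen:
                        seen.add(new)
                        queue.append(new)
    return False

# e.g. S => CW => CAXW => CAXC => CACa => aCCa => acca
```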
Linear bounded automaton
A mechanism for working backwards through all possible derivations: either the start symbol is reached, or no valid derivation is found.
There is a finite number of possible derivations to examine, but the number is exponentially large.
Abstractly: a ‘tape’ of linear memory and a read/write head.
NP problems and ‘intractability’
Nondeterministic polynomial (NP) problems: there is no known polynomial-time algorithm for finding a solution, but a solution can be checked for correctness in polynomial time. [Context-sensitive grammar parsing is an example.]
A subclass of NP problems is the NP-complete problems: a polynomial-time algorithm that solves one NP-complete problem would solve all of them. [Context-free grammar parsing, in contrast, is solvable in polynomial time, e.g. by the CYK algorithm below.]
Unrestricted grammars and Turing machines
The left and right sides of the production rules can be any combinations of symbols.
The parsing automaton is a Turing machine.
There is no general algorithm for determining whether a string has a valid derivation in less than infinite time.
Stochastic grammars
A stochastic grammar generates different strings x, each with a probability P(x|θ).
A non-stochastic grammar either generates a string x or it does not.
For stochastic regular and context-free grammars, Σx P(x|θ) = 1.
Example of a stochastic grammar
For the production rule S → rW1 | kW1, a stochastic regular grammar might assign probabilities of 0.5 to each of the productions:
S → rW1 (0.5), S → kW1 (0.5).
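Sampling from such a grammar is straightforward. A minimal sketch; the production W1 → g (probability 1.0) is a made-up terminating rule, added only so that derivations can finish:

```python
import random

# Stochastic regular grammar S -> rW1 (0.5) | kW1 (0.5),
# plus an invented terminating production W1 -> g (1.0).
RULES = {"S": [("rW1", 0.5), ("kW1", 0.5)],
         "W1": [("g", 1.0)]}

def sample(rng=None):
    rng = rng or random.Random()
    s = "S"
    while True:
        nt = next((n for n in RULES if n in s), None)
        if nt is None:
            return s                     # only terminals left: done
        u, acc = rng.random(), 0.0
        for rhs, p in RULES[nt]:         # choose a production by probability
            acc += p
            if u < acc:
                s = s.replace(nt, rhs, 1)
                break

strings = {sample(random.Random(seed)) for seed in range(50)}
# every sampled string is rg or kg
```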
Other probabilities
Exceptions can be admitted without grossly degrading a grammar.
Exceptions should have low, but non-zero, probabilities:
S → rW1 (0.45), S → kW1 (0.45), S → nW1 (0.1).
Stochastic context-sensitive or unrestricted grammars
A context-sensitive grammar generating {aa, ab, ba, bb}:
S → aW (p1), S → bW (p2), bW → bb (p3), W → a (p4), W → b (p5).
In general the string probabilities are {p1p4, p1p5, p2p4, p2(p3 + p5)},
with p1 + p2 = 1 and p3 + p4 + p5 = 1.
Stochastic context-sensitive grammar
In fact, if p1 = p2 = 1/2 and p3 = p4 = p5 = 1/3, we have Σx P(x|θ) = 5/6.
The sum of the probabilities of all possible productions from any nonterminal is 1 if and only if p1 = 0 or p3 = 0.
Proper stochastic grammar
The previous grammar can be changed in this way:
S → aW (p1), S → bW (p2), bW → bb (p3), bW → ba (p4), aW → aa (p5), aW → ab (p6).
Now p1 + p2 = 1, p3 + p4 = 1 and p5 + p6 = 1.
Hidden Markov models and stochastic regular grammars
Any HMM state which makes N transitions to new states that each emit one of M symbols can also be modelled by a set of NM stochastic regular grammar productions.
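This correspondence can be sketched directly: each transition–emission pair becomes one production W → aW′ with probability t(W → W′) · eW′(a). The state names and numbers below are illustrative, not from the text:

```python
# HMM state -> stochastic regular grammar productions.
def hmm_state_to_productions(state, transitions, emissions):
    """transitions: {next_state: prob}; emissions: {next_state: {symbol: prob}}.
    Returns {(state, symbol, next_state): probability of W -> symbol W'}."""
    prods = {}
    for nxt, t in transitions.items():
        for sym, e in emissions[nxt].items():
            prods[(state, sym, nxt)] = t * e
    return prods

prods = hmm_state_to_productions(
    "W",
    {"W1": 0.9, "W2": 0.1},
    {"W1": {"a": 0.6, "b": 0.4}, "W2": {"a": 0.5, "b": 0.5}})
# N = 2 successor states, M = 2 symbols -> N*M = 4 productions summing to 1
```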
Stochastic context-free grammars for sequence modelling
We can use stochastic context-free grammars for sequence modelling. To do this, we must solve the following problems:
(i) calculate an optimal alignment of a sequence to a parameterised stochastic grammar (the alignment problem);
Other problems
(ii) calculate the probability of a sequence given a parameterised stochastic grammar (the scoring problem);
(iii) given a set of example sequences, estimate optimal probability parameters for an unparameterised stochastic grammar (the training problem).
Normal forms for stochastic context-free grammars
Chomsky normal form: production rules must be of the form
Wv → WyWz  or  Wv → a.
For example, the production rule S → aSa could be expanded to
S → W1W2, W1 → a, W2 → SW1
in Chomsky normal form.
The inside-outside algorithm for SCFGs
The inside-outside algorithm for SCFGs in Chomsky normal form is the natural counterpart of the forward-backward algorithm for HMMs.
The computational complexity of the inside-outside algorithm is substantially greater.
The inside algorithm
Suppose we have a Chomsky normal form SCFG with M nonterminals W1, W2, …, WM; we start from W1.
The production rules are Wv → WyWz and Wv → a, with probability parameters tv(y, z) and ev(a) respectively.
The inside algorithm
The algorithm calculates the probability α(i, j, v) of a parse subtree rooted at nonterminal Wv for the subsequence xi, …, xj, for all i, j and v.
The calculation requires an L × L × M three-dimensional dynamic programming matrix.
Algorithm: Inside
Initialisation: for i = 1 to L, v = 1 to M:
  α(i, i, v) = ev(xi)
Iteration: for i = 1 to L−1, j = i+1 to L, v = 1 to M:
  α(i, j, v) = Σy=1..M Σz=1..M Σk=i..j−1 α(i, k, y) α(k+1, j, z) tv(y, z)
Termination:
  P(x|θ) = α(1, L, 1)
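The recursion above can be sketched as a short dynamic program. A minimal sketch; the toy CNF grammar W1 → W1W1 (0.5) | a (0.5) at the end is invented for illustration:

```python
def inside(x, M, t, e):
    """alpha[i][j][v] = P(W_v derives x_i..x_j); i, j, v are 1-based.
    t[v] maps (y, z) -> prob of W_v -> W_y W_z;
    e maps (v, a)    -> prob of W_v -> a."""
    L = len(x)
    alpha = [[[0.0] * (M + 1) for _ in range(L + 1)] for _ in range(L + 1)]
    for i in range(1, L + 1):                    # initialisation
        for v in range(1, M + 1):
            alpha[i][i][v] = e.get((v, x[i - 1]), 0.0)
    for span in range(2, L + 1):                 # iteration: short spans first
        for i in range(1, L - span + 2):
            j = i + span - 1
            for v in range(1, M + 1):
                total = 0.0
                for (y, z), p in t.get(v, {}).items():
                    for k in range(i, j):        # split point k
                        total += alpha[i][k][y] * alpha[k + 1][j][z] * p
                alpha[i][j][v] = total
    return alpha[1][L][1]                        # termination: P(x|theta)

# Toy grammar: W1 -> W1 W1 (0.5) | a (0.5)
t = {1: {(1, 1): 0.5}}
e = {(1, "a"): 0.5}
# inside("aa", 1, t, e) = 0.5 * 0.5 * 0.5 = 0.125
```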
[Figure: iteration step of the inside algorithm.]
The outside algorithm
The algorithm calculates the probability β(i, j, v) of a complete parse tree rooted at the start nonterminal for the sequence x1, …, xL, excluding all parse subtrees for the subsequence xi, …, xj rooted at nonterminal Wv, for all i, j and v.
The outside algorithm
The calculation requires an L × L × M three-dimensional dynamic programming matrix (like the inside algorithm).
Calculating β(i, j, v) requires the results α(i, j, v) from a previous inside calculation.
Algorithm: Outside
Initialisation: β(1, L, 1) = 1; β(1, L, v) = 0 for v = 2 to M.
Iteration: for i = 1 to L, j = L to i, v = 1 to M:
  β(i, j, v) = Σy,z Σk=1..i−1 α(k, i−1, z) β(k, j, y) ty(z, v)
             + Σy,z Σk=j+1..L α(j+1, k, z) β(i, k, y) ty(v, z)
Termination:
  P(x|θ) = Σv=1..M β(i, i, v) ev(xi) for any i
[Figure: iteration step of the outside algorithm.]
Parameter re-estimation by expectation maximisation
The expected number of times that nonterminal Wv is used in a derivation is
  c(v) = (1 / P(x|θ)) Σi=1..L Σj=i..L α(i, j, v) β(i, j, v).
The expected number of times that production Wv → WyWz is used is
  c(v → yz) = (1 / P(x|θ)) Σi=1..L−1 Σj=i+1..L Σk=i..j−1 β(i, j, v) α(i, k, y) α(k+1, j, z) tv(y, z).
Parameter re-estimation by expectation maximisation
The re-estimation equation for the probabilities of the production rules Wv → WyWz is
  t̂v(y, z) = c(v → yz) / c(v).
For the production rule Wv → a:
  êv(a) = Σ{i | xi = a} β(i, i, v) ev(a) / Σi=1..L Σj=i..L α(i, j, v) β(i, j, v).
The CYK alignment algorithm
Initialisation: for i = 1 to L, v = 1 to M:
  γ(i, i, v) = log ev(xi); τ(i, i, v) = (0, 0, 0)
Iteration: for i = 1 to L−1, j = i+1 to L, v = 1 to M:
  γ(i, j, v) = maxy,z maxk=i..j−1 { γ(i, k, y) + γ(k+1, j, z) + log tv(y, z) }
  τ(i, j, v) = argmax(y,z,k) { γ(i, k, y) + γ(k+1, j, z) + log tv(y, z) }
Termination:
  log P(x, π̂ | θ) = γ(1, L, 1)
CYK traceback
Initialisation: push (1, L, 1) on the stack.
Iteration:
  Pop (i, j, v); set (y, z, k) = τ(i, j, v).
  If τ(i, j, v) = (0, 0, 0), attach xi as a child of v;
  else attach y, z to the parse tree as children of v, then push (k+1, j, z) and push (i, k, y).
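The CYK recursion and traceback can be sketched together. A minimal sketch, reusing the invented toy grammar W1 → W1W1 (0.5) | a (0.5):

```python
import math

def cyk(x, M, t, e):
    """gamma[i][j][v] = best log prob of a parse of x_i..x_j rooted at W_v;
    tau stores the argmax (y, z, k) for traceback; 1-based i, j, v."""
    L, NEG = len(x), float("-inf")
    gamma = [[[NEG] * (M + 1) for _ in range(L + 1)] for _ in range(L + 1)]
    tau = [[[None] * (M + 1) for _ in range(L + 1)] for _ in range(L + 1)]
    for i in range(1, L + 1):                        # initialisation
        for v in range(1, M + 1):
            if (v, x[i - 1]) in e:
                gamma[i][i][v] = math.log(e[(v, x[i - 1])])
                tau[i][i][v] = (0, 0, 0)
    for span in range(2, L + 1):                     # iteration
        for i in range(1, L - span + 2):
            j = i + span - 1
            for v in range(1, M + 1):
                for (y, z), p in t.get(v, {}).items():
                    for k in range(i, j):
                        s = gamma[i][k][y] + gamma[k + 1][j][z] + math.log(p)
                        if s > gamma[i][j][v]:
                            gamma[i][j][v], tau[i][j][v] = s, (y, z, k)
    return gamma[1][L][1], tau                       # log P(x, best parse)

def traceback(x, tau):
    """Recover the best parse by walking tau from (1, L, 1)."""
    stack, tree = [(1, len(x), 1)], []
    while stack:
        i, j, v = stack.pop()
        y, z, k = tau[i][j][v]
        if (y, z, k) == (0, 0, 0):
            tree.append((v, x[i - 1]))               # W_v -> x_i
        else:
            tree.append((v, y, z))                   # W_v -> W_y W_z
            stack.append((k + 1, j, z))              # push right child,
            stack.append((i, k, y))                  # then left child
    return tree

t = {1: {(1, 1): 0.5}}                               # W1 -> W1 W1
e = {(1, "a"): 0.5}                                  # W1 -> a
logp, tau = cyk("aa", 1, t, e)
tree = traceback("aa", tau)
```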
Summary
Goal                       HMM algorithm       SCFG algorithm
optimal alignment          Viterbi             CYK
P(x|θ)                     forward             inside
EM parameter estimation    forward-backward    inside-outside
memory complexity          O(LM)               O(L²M)
time complexity            O(LM²)              O(L³M³)