Grammar Induction: Techniques and Theory
Colin de la Higuera (Université de Saint-Etienne) and Tim Oates (University of Maryland)
ICML 2006 Tutorial on Grammatical Inference (transcript of slides)

Slide 1

Colin de la Higuera, Université de Saint-Etienne
Tim Oates, University of Maryland

Grammar Induction: Techniques and Theory

Slide 2

Acknowledgements

• Laurent Miclet, Tim Oates, Jose Oncina, Rafael Carrasco, Paco Casacuberta, Pedro Cruz, Rémi Eyraud, Philippe Ezequel, Henning Fernau, Jean-Christophe Janodet, Thierry Murgue, Frédéric Tantini, Franck Thollard, Enrique Vidal, ...

• … and a lot of other people to whom we are grateful

Slide 3

Outline

1 An introductory example

2 About grammatical inference

3 Some specificities of the task

4 Some techniques and algorithms

5 Open issues and questions

Slide 4

1 How do we learn languages?

A very simple example

Carmel and Markovitch 98 & 99
http://www.cs.technion.ac.il/~carmel/papers.html

Slide 5

The problem:

• An agent must take cooperative decisions in a multi-agent world.

• Its decisions will depend:

– on the actions of the other agents;

– on what it hopes to win or lose.

Slide 6

Hypothesis: the opponent follows a rational strategy (given by a DFA/Moore machine):

[Diagram: the opponent's Moore machine, with outputs e/p and transitions on l/d]

You: listen or doze
Me: equations or pictures


Slide 7

Example: the prisoner's dilemma

• Each prisoner can admit (a) or stay silent (s)

• If both admit: 3 years each;

• If A admits but not B, A=0 years, B=5 years;

• If B admits but not A, B=0 years, A=5 years;

• If neither admits: 1 year each.

Slide 8

Payoff matrix, written (A's gain, B's gain):

         B: a       B: s
A: a   (-3, -3)   (0, -5)
A: s   (-5, 0)    (-1, -1)

Slide 9

• Here an iterated version against an opponent that follows a rational strategy.

• Gain Function: limit of means.

• A game is a word in (His_moves × My_moves)*.
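As a concrete illustration, a game word can be scored by averaging per-round payoffs, which is the finite approximation of the limit-of-means gain. A minimal Python sketch (the dictionary encoding and function name are ours, not from the tutorial):

```python
# A sketch of scoring a game by the limit of means: average the per-round
# payoffs. Payoffs are the (negated) prison years from the matrix on
# slide 8, seen from my side.
PAYOFF = {("a", "a"): -3, ("a", "s"): 0,
          ("s", "a"): -5, ("s", "s"): -1}

def mean_gain(game):
    """game: a word over (My_moves x His_moves), as (my, his) pairs."""
    return sum(PAYOFF[rnd] for rnd in game) / len(game)

print(mean_gain([("s", "s"), ("s", "s"), ("a", "s"), ("s", "a")]))  # -1.75
```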

Slide 10

The general problem

• We suppose that the strategy of the opponent is given by a deterministic finite automaton.

• Can we imagine an optimal strategy?

Slide 11

Suppose we know the opponent’s strategy:

• Then (game theory):

• Consider the opponent’s graph in which we value the edges by our own gain.

Slide 12

[Diagram: the opponent's automaton with edges labelled a/s and valued by our gains: -3, 0, -5, -1]


Slide 13

Then:

1 Find the cycle of maximum mean weight.

2 Find the best path leading to this cycle of maximum mean weight.

3 Follow the path and stay in the cycle.

All that is needed is to find the opponent's automaton!
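A brute-force sketch of the first two steps for a toy graph: enumerate the simple cycles and keep the one with maximum mean weight. (A real implementation would use Karp's minimum/maximum mean cycle algorithm; the graph encoding and all names here are ours.)

```python
from fractions import Fraction

def max_mean_cycle(graph):
    """graph[u] is a list of (v, weight) edges; returns (mean, cycle)."""
    best = None  # (mean weight, cycle as a node list)

    def dfs(start, node, path, weight):
        nonlocal best
        for nxt, w in graph[node]:
            if nxt == start:  # cycle closed: len(path) edges used
                mean = Fraction(weight + w, len(path))
                if best is None or mean > best[0]:
                    best = (mean, path + [start])
            elif nxt not in path:
                dfs(start, nxt, path + [nxt], weight + w)

    for v in graph:
        dfs(v, v, [v], 0)
    return best

# Toy graph: a self-loop of weight -1 beats the 2-cycle of mean -3/2
g = {0: [(0, -1), (1, 0)], 1: [(0, -3)]}
mean, cycle = max_mean_cycle(g)
print(mean, cycle)  # -1 [0, 0]
```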

Slide 14

[Diagram: the valued automaton again; the best path leads into a cycle of mean weight -0.5]

Slide 15

Question

• Having seen a game of this opponent…

• Can we reconstruct his strategy?

Slide 16

Data (him, me): {aa, as, sa, aa, as, ss, ss, sa}

HIM: a a s a a s s s s
ME:  a s a a s s s a

If I play asa, his next move is a.

Slide 17

λ → a
a → a
as → s
asa → a
asaa → a
asaas → s
asaass → s
asaasss → s
asaasssa → s

Slides 18-31

[The same observation table is repeated on each slide while the target automaton is reconstructed step by step: states and transitions (a, s, sa, ssa, ...) are added one at a time until the machine is consistent with every observed pair.]

Slide 32

How do we get hold of the learning data?

a) through observation

b) through exploration

Slide 33

An open problem: the strategy is probabilistic.

[Diagram: a probabilistic strategy automaton whose states output a: 70% / s: 30%, a: 50% / s: 50%, and a: 20% / s: 80%]

Slide 34

Tit for Tat

[Diagram: the Tit for Tat strategy as a two-state automaton over a, s]

Slide 35

2 Specificities of grammatical inference

Grammatical inference consists (roughly) of finding the (or a) grammar or automaton that has produced a given set of strings (sequences, trees, terms, graphs).

Slide 36

The goal/idea

• The ancient Greeks:

A whole is more than the sum of its parts.

• Gestalt theory:

A whole is different from the sum of its parts.


Slide 37

Better said

• There are cases where the data cannot be analyzed by considering it in bits

• There are cases where intelligibility of the pattern is important

Slide 38

What do people know about formal language theory?

[Scale: from Nothing to Lots]

Slide 39

A small reminder on formal language theory

• Chomsky hierarchy

• pros and cons of grammars

Slide 40

A crash course in Formal language theory

• Symbols

• Strings

• Languages

• Chomsky hierarchy

• Stochastic languages

Slide 41

Symbols are taken from some alphabet Σ.

Strings are sequences of symbols from Σ.

Slide 42

Languages

are sets of strings over Σ

Languages are subsets of Σ*.


Slide 43

Special languages

• Are recognised by finite state automata

• Are generated by grammars

Slide 44

[Diagram: a DFA over {a, b}]

DFA: Deterministic Finite State Automaton
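A DFA is just a transition table plus a start state and a set of final states; a minimal Python sketch (the particular automaton, which accepts strings with an even number of a's, is our own toy example, not the one drawn on the slide):

```python
# A minimal DFA over {a, b} encoded as a dict from (state, symbol) to state.
DELTA = {(0, "a"): 1, (0, "b"): 0,
         (1, "a"): 0, (1, "b"): 1}
START, FINALS = 0, {0}  # accepts strings with an even number of a's

def accepts(word):
    state = START
    for symbol in word:
        state = DELTA[(state, symbol)]
    return state in FINALS

print(accepts("abab"))  # True: two a's
print(accepts("a"))     # False
```

Membership testing is linear in the length of the string, which is exactly the level-3 entry in the membership-problem table a few slides below.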

Slide 45

[Diagram: the same DFA accepting abab: abab ∈ L]

Slide 46

What is a context-free grammar?

A 4-tuple (Σ, S, V, P) such that:

– Σ is the alphabet;

– V is a finite set of non-terminals;

– S ∈ V is the start symbol;

– P ⊆ V × (V∪Σ)* is a finite set of rules.

Slide 47

Example of a grammar: the Dyck1 grammar

– (Σ, S, V, P)

– Σ = {a, b}

– V = {S}

– P = {S → aSbS, S → λ }
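Membership in the Dyck1 language has a well-known direct test: a counter over the string that never goes negative and ends at zero. A small sketch, with a as the opening symbol and b as the closing one (the function name is ours):

```python
# Membership for the Dyck1 grammar S -> aSbS | λ: a string over {a, b}
# is generated iff it is "balanced" -- the running count of a's never
# drops below the count of b's, and the two counts end equal.

def in_dyck1(word):
    depth = 0
    for symbol in word:
        depth += 1 if symbol == "a" else -1
        if depth < 0:
            return False
    return depth == 0

print(in_dyck1("aabb"))  # True
print(in_dyck1("abba"))  # False
```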

Slide 48

Derivations and derivation trees

S → aSbS

→ aaSbSbS

→ aabSbS

→ aabbS

→ aabb

[Derivation tree for aabb]


Slide 49

Chomsky Hierarchy

• Level 0: no restriction

• Level 1: context-sensitive

• Level 2: context-free

• Level 3: regular

Slide 50

Chomsky Hierarchy

• Level 0: whatever Turing machines can do

• Level 1: context-sensitive
– {a^n b^n c^n : n ∈ ℕ}
– {a^n b^m c^n d^m : n, m ∈ ℕ}
– {uu : u ∈ Σ*}

• Level 2: context-free
– {a^n b^n : n ∈ ℕ}
– brackets

• Level 3: regular
– regular expressions (grep)

Slide 51

The membership problem

• Level 0: undecidable

• Level 1: decidable

• Level 2: polynomial

• Level 3: linear

Slide 52

The equivalence problem

• Level 0: undecidable

• Level 1: undecidable

• Level 2: undecidable

• Level 3: polynomial, but only when the representation is a DFA.

Slide 53

[Diagram: a PFA over {a, b} with transition and stopping probabilities 1/4, 1/3, 1/2, 2/3, 3/4]

PFA: Probabilistic Finite (state) Automaton

Slide 54

[Diagram: a DPFA over {a, b} with probabilities 0.1, 0.3, 0.35, 0.65, 0.7, 0.9]

DPFA: Deterministic Probabilistic Finite (state) Automaton


Slide 55

What is nice with grammars?

• Compact representation

• Recursivity

• Says how a string belongs, not just if it belongs

• Graphical representations (automata, parse trees)

Slide 56

What is not so nice with grammars?

• Even the easiest class (level 3) contains SAT, Boolean functions, parity functions…

• Noise is very harmful:

– Think about adding edit noise to the language {w : |w|_a ≡ 0 (mod 2) ∧ |w|_b ≡ 0 (mod 2)}

Slide 57

The field

[Diagram: Grammatical Inference at the intersection of Inductive Inference, Pattern Recognition, and Machine Learning, with applications in computational linguistics, computational biology, and web technologies]

Slide 58

The data

• Strings, trees, terms, graphs

• Structural objects

• Basically the same gap of information as in programming between tables/arrays and data structures

Slide 59

Alternatives to grammatical inference

• 2 steps:

– Extract features from the strings

– Use a very good method over ℝ^n.
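The first of the two steps can be sketched concretely: map each string to a vector in ℝⁿ, here via bigram counts over Σ = {a, b}, and then hand the vectors to any off-the-shelf learner. The feature choice is ours and purely illustrative:

```python
from itertools import product

SIGMA = "ab"
BIGRAMS = ["".join(p) for p in product(SIGMA, repeat=2)]  # aa ab ba bb

def features(word):
    """Map a string to a vector of bigram counts (a point in R^4)."""
    return [sum(1 for i in range(len(word) - 1) if word[i:i + 2] == bg)
            for bg in BIGRAMS]

print(features("abab"))  # [0, 2, 1, 0]
```

The cost of this alternative is precisely what the previous slides emphasize: the vector keeps counts but loses the recursive structure of the string.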

Slide 60

Examples of strings

A string in Gaelic and its translation to English:

• Tha thu cho duaichnidh ri èarràirde de a’ coisich deas damh

• You are as ugly as the north end of a southward traveling ox


Slides 61-62 [figures]

Slide 63

>A BAC=41M14 LIBRARY=CITB_978_SKBAAGCTTATTCAATAGTTTATTAAACAGCTTCTTAAATAGGATATAAGGCAGTGCCATGTAGTGGATAAAAGTAATAATCATTATAATATTAAGAACTAATACATACTGAACACTTTCAATGGCACTTTACATGCACGGTCCCTTTAATCCTGAAAAAATGCTATTGCCATCTTTATTTCAGAGACCAGGGTGCTAAGGCTTGAGAGTGAAGCCACTTTCCCCAAGCTCACACAGCAAAGACACGGGGACACCAGGACTCCATCTACTGCAGGTTGTCTGACTGGGAACCCCCATGCACCTGGCAGGTGACAGAAATAGGAGGCATGTGCTGGGTTTGGAAGAGACACCTGGTGGGAGAGGGCCCTGTGGAGCCAGATGGGGCTGAAAACAAATGTTGAATGCAAGAAAAGTCGAGTTCCAGGGGCATTACATGCAGCAGGATATGCTTTTTAGAAAAAGTCCAAAAACACTAAACTTCAACAATATGTTCTTTTGGCTTGCATTTGTGTATAACCGTAATTAAAAAGCAAGGGGACAACACACAGTAGATTCAGGATAGGGGTCCCCTCTAGAAAGAAGGAGAAGGGGCAGGAGACAGGATGGGGAGGAGCACATAAGTAGATGTAAATTGCTGCTAATTTTTCTAGTCCTTGGTTTGAATGATAGGTTCATCAAGGGTCCATTACAAAAACATGTGTTAAGTTTTTTAAAAATATAATAAAGGAGCCAGGTGTAGTTTGTCTTGAACCACAGTTATGAAAAAAATTCCAACTTTGTGCATCCAAGGACCAGATTTTTTTTAAAATAAAGGATAAAAGGAATAAGAAATGAACAGCCAAGTATTCACTATCAAATTTGAGGAATAATAGCCTGGCCAACATGGTGAAACTCCATCTCTACTAAAAATACAAAAATTAGCCAGGTGTGGTGGCTCATGCCTGTAGTCCCAGCTACTTGCGAGGCTGAGGCAGGCTGAGAATCTCTTGAACCCAGGAAGTAGAGGTTGCAGTAGGCCAAGATGGCGCCACTGCACTCCAGCCTGGGTGACAGAGCAAGACCCTATGTCCAAAAAAAAAAAAAAAAAAAAGGAAAAGAAAAAGAAAGAAAACAGTGTATATATAGTATATAGCTGAAGCTCCCTGTGTACCCATCCCCAATTCCATTTCCCTTTTTTGTCCCAGAGAACACCCCATTCCTGACTAGTGTTTTATGTTCCTTTGCTTCTCTTTTTAAAAACTTCAATGCACACATATGCATCCATGAACAACAGATAGTGGTTTTTGCATGACCTGAAACATTAATGAAATTGTATGATTCTAT

Slide 64

[Slide content not recoverable: mis-encoded non-Latin characters]

Slides 65-66 [figures]


Slide 67

<book>
  <part>
    <chapter>
      <sect1/>
      <sect1>
        <orderedlist numeration="arabic">
          <listitem/>
          <f:fragbody/>
        </orderedlist>
      </sect1>
    </chapter>
  </part>
</book>

Slide 68

<?xml version="1.0"?>
<?xml-stylesheet href="carmen.xsl" type="text/xsl"?>
<?cocoon-process type="xslt"?>
<!DOCTYPE pagina [
<!ELEMENT pagina (titulus?, poema)>
<!ELEMENT titulus (#PCDATA)>
<!ELEMENT auctor (praenomen, cognomen, nomen)>
<!ELEMENT praenomen (#PCDATA)>
<!ELEMENT nomen (#PCDATA)>
<!ELEMENT cognomen (#PCDATA)>
<!ELEMENT poema (versus+)>
<!ELEMENT versus (#PCDATA)>
]>
<pagina>
<titulus>Catullus II</titulus>
<auctor>
<praenomen>Gaius</praenomen>
<nomen>Valerius</nomen>
<cognomen>Catullus</cognomen>
</auctor>

Slide 69 [figure]

Slide 70

A logic program learned by GIFT:

color_blind(Arg1) :- start(Arg1,X), p11(Arg1,X).
start(X,X).
p11(Arg1,P) :- mother(M,P), p4(Arg1,M).
p4(Arg1,X) :- woman(X), father(F,X), p3(Arg1,F).
p4(Arg1,X) :- woman(X), mother(M,X), p4(Arg1,M).
p3(Arg1,X) :- man(X), color_blind(X).

Slide 71

3 Hardness of the task

– One thing is to build algorithms; another is to be able to state that they work.

– Some questions:

– Does this algorithm work?

– Do I have enough learning data?

– Do I need some extra bias?

– Is this algorithm better than the other?

– Is this problem easier than the other?

Slide 72

Alternatives to answer these questions:

– Use benchmarks

– Solve a real problem

– Prove things


Slide 73

Theory

• Because you may want to be able to say something more than "seems to work in practice".

Slide 74

Convergence

• Does my algorithm converge, in some sense, to a best solution?

• To be able to answer, we have to admit the existence of a best solution.

Slide 75

Issues

• Get close to the best?

– Metrics

– Distributions over strings

• PAC-related models and similar: very negative results

Slide 76

Identification in the limit

– L: a class of languages
– G: a class of grammars
– Pres ⊆ ℕ → X: the presentations
– yields: the naming function, with f(ℕ) = g(ℕ) ⇒ yields(f) = yields(g)
– ϕ: the learner; success means L(ϕ(f)) = yields(f)

Slide 77

Given a presentation f1 f2 ... fn ..., the learner outputs hypotheses h1 h2 ... hn ...

L is identifiable in the limit in terms of G from Pres iff
∀L ∈ L, ∀f ∈ Pres(L), ∃n: ∀i ≥ n, hi ≡ hn and L(hi) = L.

Slide 78

He did not want to compose another Quixote (which would be easy) but the Quixote. Needless to add, he never contemplated a mechanical transcription of the original; he did not propose to copy it. His admirable ambition was to produce pages which would coincide, word for word and line for line, with those of Miguel de Cervantes.

[…]

"My undertaking is not essentially difficult," I read elsewhere in his letter. "I should only have to be immortal to carry it out."

Jorge Luis Borges (1899–1986), Pierre Menard, autor del Quijote (El jardín de senderos que se bifurcan), Ficciones


Slide 79

4 Algorithmic ideas

Slide 80

The space of GI problems

• Type of input (strings)

• Presentation of input (batch)

• Hypothesis space (subset of the regular grammars)

• Success criteria (identification in the limit)

Slide 81

Types of input

Strings: the cat hates the dog

Structural examples: [parse tree over "the cat hates the dog"]

Graphs: [positive (+) and negative (-) graph examples]

Slide 82

Types of input: oracles

• Membership queries
– Is string S in the target language?

• Equivalence queries
– Is my hypothesis correct?
– If not, provide a counterexample

• Subset queries
– Is the language of my hypothesis a subset of the target language?
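The oracle setting can be sketched as an interface: a teacher holds a target DFA and answers queries. The bounded-length search standing in for a true equivalence test is our simplification, as is every name here:

```python
from itertools import product

class Teacher:
    def __init__(self, delta, start, finals, sigma, max_len=8):
        self.delta, self.start, self.finals = delta, start, finals
        self.sigma, self.max_len = sigma, max_len

    def member(self, word):  # membership query
        q = self.start
        for a in word:
            q = self.delta[(q, a)]
        return q in self.finals

    def subset(self, words):  # crude stand-in for a subset query
        return all(self.member(w) for w in words)

    def equivalent(self, hypothesis):  # equivalence query, bounded search
        for n in range(self.max_len + 1):
            for w in map("".join, product(self.sigma, repeat=n)):
                if self.member(w) != hypothesis(w):
                    return w  # counterexample
        return None

# Target: strings with an even number of a's
t = Teacher({(0, "a"): 1, (0, "b"): 0, (1, "a"): 0, (1, "b"): 1},
            0, {0}, "ab")
print(t.member("aa"))                # True
print(t.equivalent(lambda w: True))  # a
```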

Slide 83

Presentation of input

• Arbitrary order

• Shortest to longest

• All positive and negative examples up to some length

• Sampled according to some probability distribution

Slide 84

Presentation of input

• Text presentation

– A presentation of all strings in the target language

• Complete presentation (informant)

– A presentation of all strings over the alphabet of the target language labeled as + or -


Slide 85

Hypothesis space

• Regular grammars

– A welter of subclasses

• Context free grammars

– Fewer subclasses

• Hyper-edge replacement graph grammars

Slide 86

Success criteria

• Identification in the limit

– Text or informant presentation

– After each example, learner guesses language

– At some point, guess is correct and never changes

• PAC learning

Slide 87

Theorems due to Gold

• The good news
– Any recursively enumerable class of languages can be learned in the limit from an informant (Gold, 1967)

• The bad news
– A language class is superfinite if it includes all finite languages and at least one infinite language
– No superfinite class of languages can be learned in the limit from text (Gold, 1967)
– That includes the regular and context-free classes

Slide 88

A picture

[Diagram: from poor to rich languages, and from little to a lot of information:
– sub-classes of the regular languages, from positive data
– context-free languages, from positive data
– DFA, from positive and negative data
– DFA, from queries
– mildly context-sensitive languages, from queries]

Slide 89

Algorithms

RPNI

K-Reversible

GRIDS

SEQUITUR

L*

Slide 90

4.1 RPNI

• Regular Positive and Negative Grammatical Inference

Identifying regular languages in polynomial time

Jose Oncina & Pedro García 1992


Slide 91

• It is a state merging algorithm;

• It identifies any regular language in the limit;

• It works in polynomial time;

• It admits polynomial characteristic sets.

Slide 92

The algorithm

function rmerge(A, p, q):
    A = merge(A, p, q)
    while ∃a ∈ Σ, ∃r: p, q ∈ δA(r, a), p ≠ q do
        rmerge(A, p, q)

Slide 93

A = PTA(X+); Fr = {δ(q0, a) : a ∈ Σ}; K = {q0}

while Fr ≠ ∅ do
    choose q from Fr
    if ∃p ∈ K: L(rmerge(A, p, q)) ∩ X- = ∅
    then A = rmerge(A, p, q)
    else K = K ∪ {q}
    Fr = {δ(q, a) : q ∈ K, a ∈ Σ} - K
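The pseudocode above can be turned into a compact, illustrative Python sketch: states are the prefixes of X+, the partition of merged states is kept as a union-find forest, and a merge is kept only if no negative string becomes accepted. All identifiers are our own choices, not the tutorial's:

```python
def rpni(xpos, xneg, sigma):
    states = sorted({w[:i] for w in xpos for i in range(len(w) + 1)},
                    key=lambda u: (len(u), u))
    state_set, finals = set(states), set(xpos)

    def find(parent, q):
        while parent[q] != q:
            q = parent[q]
        return q

    def rmerge(parent, p, q):
        # merge the blocks of p and q, then keep folding blocks together
        # until the quotient automaton is deterministic again
        parent[find(parent, q)] = find(parent, p)
        while True:
            seen, clash = {}, None
            for u in states:
                for a in sigma:
                    if u + a in state_set:
                        key, tgt = (find(parent, u), a), find(parent, u + a)
                        if key in seen and seen[key] != tgt:
                            clash = (seen[key], tgt)
                            break
                        seen[key] = tgt
                if clash:
                    break
            if clash is None:
                return
            parent[clash[1]] = clash[0]

    def accepts(parent, word):
        trans = {(find(parent, u), a): find(parent, u + a)
                 for u in states for a in sigma if u + a in state_set}
        q = find(parent, "")
        for a in word:
            if (q, a) not in trans:
                return False
            q = trans[(q, a)]
        return q in {find(parent, f) for f in finals}

    parent, red = {u: u for u in states}, []
    for q in states:
        if any(find(parent, r) == find(parent, q) for r in red):
            continue            # q already absorbed into an earlier block
        for r in red:
            trial = dict(parent)
            rmerge(trial, r, q)
            if not any(accepts(trial, w) for w in xneg):
                parent = trial  # merge kept: no negative string accepted
                break
        else:
            red.append(q)       # no compatible merge: promote q
    return parent, accepts

xpos = ["", "aaa", "aaba", "ababa", "bb", "bbaaa"]
xneg = ["aa", "ab", "aaaa", "ba"]
parent, accepts = rpni(xpos, xneg, "ab")
print(all(accepts(parent, w) for w in xpos))  # True
print(any(accepts(parent, w) for w in xneg))  # False
```

Running it on the sample from the following slides keeps every positive string accepted and every negative string rejected, by construction; the merge order follows the length-lex order over prefixes.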

Slide 94

X+ = {λ, aaa, aaba, ababa, bb, bbaaa}
X- = {aa, ab, aaaa, ba}

[Diagram: the prefix tree acceptor PTA(X+), states numbered 1-15]

Slide 95

Try to merge 2 and 1

[Diagram: the PTA, states 1-15; X- = {aa, ab, aaaa, ba}]

Slide 96

Needs more merging for determinization

[Diagram: states {1,2} merged; the other states unchanged]


Slide 97

But now the string aaaa is accepted, so the merge must be rejected

[Diagram: blocks {1,2,4,7}, {3,5,8}, {9,11} after determinization]

Slide 98

Try to merge 3 and 1

[Diagram: back to the PTA, states 1-15]

Slide 99

Requires merging 6 with {1,3}

[Diagram: states {1,3} merged]

Slide 100

And now to merge 2 with 10

[Diagram: states {1,3,6} merged]

Slide 101

And now to merge 4 with 13

[Diagram: blocks {1,3,6} and {2,10}]

Slide 102

And finally to merge 7 with 15

[Diagram: blocks {1,3,6}, {2,10}, {4,13}]


Slide 103

No counterexample is accepted, so the merges are kept

[Diagram: blocks {1,3,6}, {2,10}, {4,13}, {7,15}]

Slide 104

Next possible merge to be checked is {4,13} with {1,3,6}

[Diagram: same automaton as on the previous slide]

Slide 105

More merging for determinization is needed

[Diagram: block {1,3,4,6,13}]

Slide 106

But now aa is accepted

[Diagram: blocks {1,3,4,6,8,13} and {2,7,10,11,15}]

Slide 107

So we try {4,13} with {2,10}

[Diagram: back to blocks {1,3,6}, {2,10}, {4,13}, {7,15}]

Slide 108

After determinizing, the negative string aa is again accepted

[Diagram: blocks {1,3,6}, {2,4,7,10,13,15}, {5,8}, {9,11}]


Slide 109

So we try 5 with {1,3,6}

[Diagram: blocks {1,3,6}, {2,10}, {4,13}, {7,15}]

Slide 110

But again we accept ab

[Diagram: blocks {1,3,5,6,12}, {2,9,10,14}, {4,13}, {7,15}]

Slide 111

So we try 5 with {2,10}

[Diagram: blocks {1,3,6}, {2,10}, {4,13}, {7,15}]

Slide 112

Which is OK. So the next possible merge is {7,15} with {1,3,6}

[Diagram: blocks {1,3,6}, {2,5,10}, {4,9,13}, {7,15}, {8,12}, {11,14}]

Slide 113

Which is OK. Now try to merge {8,12} with {1,3,6,7,15}

[Diagram: blocks {1,3,6,7,15}, {2,5,10}, {4,9,13}, {8,12}, {11,14}]

Slide 114

And ab is accepted

[Diagram: blocks {1,3,6,7,8,12,15}, {2,5,10,11,14}, {4,9,13}]


Slide 115

Now try to merge {8,12} with {4,9,13}

[Diagram: blocks {1,3,6,7,15}, {2,5,10}, {4,9,13}, {8,12}, {11,14}]

Slide 116

This is OK and no more merges are possible, so the algorithm halts.

[Diagram: the final automaton, with blocks {1,3,6,7,11,14,15}, {2,5,10}, {4,8,9,12,13}]

Slide 117

Definitions

• Let ≤ be the length-lex ordering over Σ*

• Let Pref(L) be the set of all prefixes of strings in some language L.

Slide 118

Short prefixes

Sp(L) = {u ∈ Pref(L) : ∀v ∈ Pref(L), δ(q0, u) = δ(q0, v) ⇒ u ≤ v}

• There is one short prefix per useful state

[Diagram: a three-state DFA with states 0, 1, 2 over {a, b}; Sp(L) = {λ, a}]

Slide 119

Kernel-sets

• N(L) = {ua ∈ Pref(L) : u ∈ Sp(L), a ∈ Σ} ∪ {λ}
• There is an element in the kernel-set for each useful transition

[Diagram: the same DFA; N(L) = {λ, a, b, ab}]

Slide 120

A characteristic sample

A sample is characteristic (for RPNI) if:

– ∀x ∈ Sp(L), ∃u: xu ∈ X+
– ∀x ∈ Sp(L), ∀y ∈ N(L): δ(q0, x) ≠ δ(q0, y) ⇒ ∃z ∈ Σ*: (xz ∈ X+ ∧ yz ∈ X-) ∨ (xz ∈ X- ∧ yz ∈ X+)


Slide 121

About characteristic samples

• If you add more strings to a characteristic sample, it remains characteristic;

• There can be many different characteristic samples;

• Change the ordering (or the exploring function in RPNI) and the characteristic sample will change.

Slide 122

Conclusion

• RPNI identifies any regular language in the limit;
• RPNI works in polynomial time: the complexity is in O(‖X+‖³ · ‖X-‖);
• There are many significant variants of RPNI;
• RPNI can be extended to other classes of grammars.

• RPNI can be extended to other classes of grammars.

Slide 123

Open problems

• RPNI’s complexity is not a tight upper bound. Find the correct complexity.

• The definition of the characteristic set is not tight either. Find a better definition.

Slide 124

Algorithms

RPNI

K-Reversible

GRIDS

SEQUITUR

L*

Slide 125

4.2 The k-reversible languages

• The class was proposed by Angluin (1982).

• The class is identifiable in the limit from text.

• The class consists of the regular languages that can be accepted by a DFA whose reversal is deterministic with a look-ahead of k.

Slide 126

Let A = (Σ, Q, δ, I, F) be an NFA. We denote by AT = (Σ, Q, δT, F, I) the reversal automaton, with:

δT(q,a)={q’∈Q: q∈δ(q’,a)}
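The construction is a one-liner once transitions are encoded explicitly; a sketch, with an NFA as a 5-tuple whose δ is a set of (state, symbol, state) triples (this flat encoding, rather than a function, is our choice):

```python
# The reversal A^T of an NFA A: swap the initial and final sets and
# flip every transition (q1 --a--> q2 becomes q2 --a--> q1).

def reverse_nfa(sigma, states, delta, initials, finals):
    delta_t = {(q2, a, q1) for (q1, a, q2) in delta}
    return (sigma, states, delta_t, finals, initials)

A = ("ab", {0, 1}, {(0, "a", 1), (1, "b", 0), (1, "a", 1)}, {0}, {1})
AT = reverse_nfa(*A)
print(sorted(AT[2]))  # [(0, 'b', 1), (1, 'a', 0), (1, 'a', 1)]
```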


Slide 127

[Diagram: an automaton A with states 0-4 over {a, b}, and its reversal AT]

Slide 128

Some definitions

• u is a k-successor of q if |u| = k and δ(q, u) ≠ ∅.

• u is a k-predecessor of q if |u| = k and δT(q, uT) ≠ ∅.

• λ is 0-successor and 0-predecessor of any state.

Slide 129

[Diagram: the automaton A]

• aa is a 2-successor of 0 and 1, but not of 3.
• a is a 1-successor of 3.
• aa is a 2-predecessor of 3, but not of 1.

Slide 130

An NFA is deterministic with look-ahead k iff ∀q, q' ∈ Q with q ≠ q':

if (q, q' ∈ I) ∨ (q, q' ∈ δ(q'', a)) then
(u is a k-successor of q) ∧ (v is a k-successor of q') ⇒ u ≠ v

Slide 131

Prohibited:

[Diagram: two distinct states 1 and 2 reached on the same symbol a and sharing a k-successor u, with |u| = k]

Slide 132

Example

This automaton is not deterministic with look-ahead 1, but is deterministic with look-ahead 2.

[Diagram: the five-state automaton from slide 127]


Slide 133

k-reversible automata

• A is k-reversible if A is deterministic and AT is deterministic with look-ahead k.

• Example: [diagram of a three-state automaton that is deterministic, whose reversal is deterministic with look-ahead 1]

Slide 134

Violation of k-reversibility

Two states q, q' violate the k-reversibility condition iff

– they violate the determinism condition: q, q' ∈ δ(q'', a); or

– they violate the look-ahead condition:
• q, q' ∈ F and ∃u ∈ Σ^k: u is a k-predecessor of both;
• ∃u ∈ Σ^k: δ(q, a) = δ(q', a) and u is a k-predecessor of both q and q'.

Slide 135

Learning k-reversible automata

• Key idea: the order in which the merges are performed does not matter!

• Just merge states that do not comply with the conditions for k-reversibility.

Slide 136

k-RL Algorithm (φk-RL)

Data: k ∈ ℕ, X a sample of a k-reversible language L
A = PTA(X)
while ∃q, q' violating k-reversibility do
    A = merge(A, q, q')
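For k = 0 this specializes to Angluin's zero-reversible learner, which is easy to write down concretely: build PTA(X), merge all final states into one block, and keep merging any two states that violate forward or backward determinism. A sketch with union-find over the prefix-tree states (all identifiers are ours):

```python
def zero_reversible(xpos, sigma):
    states = sorted({w[:i] for w in xpos for i in range(len(w) + 1)},
                    key=lambda u: (len(u), u))
    state_set = set(states)
    parent = {u: u for u in states}

    def find(q):
        while parent[q] != q:
            q = parent[q]
        return q

    def union(p, q):
        p, q = find(p), find(q)
        if p == q:
            return False
        parent[q] = p
        return True

    changed = True
    while changed:
        changed = False
        final_roots = list({find(w) for w in xpos})
        for r in final_roots[1:]:          # all final states in one block
            changed |= union(final_roots[0], r)
        fwd, bwd = {}, {}
        for u in states:
            for a in sigma:
                if u + a in state_set:
                    src, tgt = find(u), find(u + a)
                    if (src, a) in fwd and fwd[(src, a)] != tgt:
                        changed |= union(fwd[(src, a)], tgt)  # determinism
                    fwd[(src, a)] = tgt
                    if (tgt, a) in bwd and bwd[(tgt, a)] != src:
                        changed |= union(bwd[(tgt, a)], src)  # reverse det.
                    bwd[(tgt, a)] = src
    return {find(u) for u in states}

# {a, aa, aaa} generalizes to a*: a single state remains
print(len(zero_reversible(["a", "aa", "aaa"], "a")))  # 1
```

As the next slides illustrate for k = 2, the order of the merges does not matter: the result is the smallest k-reversible language containing the sample.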

Slide 137

Let X = {a, aa, abba, abbbba}, k = 2

[Diagram: PTA(X); two final states are violators, sharing the 2-predecessor u = ba]

Slide 138

[Diagram: after the first merge; two states are violators, sharing the 2-predecessor u = bb]


Slide 139

[Diagram: the resulting 2-reversible automaton, with no violators left]

Properties (1)

• ∀k≥0, ∀X, φk-RL(X) is a k-reversible language.

• L(φk-RL(X)) is the smallest k-reversible language that contains X.

• The class Lk-RL is identifiable in the limit from text.

Properties (2)

• A regular language L is k-reversible iff

(u1v)⁻¹L ∩ (u2v)⁻¹L ≠ ∅ and │v│ = k ⇒ (u1v)⁻¹L = (u2v)⁻¹L

(if two strings that end in the same suffix of length k share an extension in L, then they are Nerode-equivalent)

Properties (3)

• Lk-RL(X) ⊂ L(k+1)-RL(X)
• Lk-TSS(X) ⊂ L(k-1)-RL(X)


Properties (4)

The time complexity is O(k║X║3).

The space complexity is O(║X║).

The algorithm is not incremental.

Properties (4): polynomial aspects

• Polynomial characteristic sets

• Polynomial update time

• But not necessarily a polynomial number of mind changes


Extensions• Sakakibara built an extension for context-free grammars whose tree language is k-reversible

• Marion & Besombes propose an extension to tree languages.

• Different authors propose to learn these automata and then estimate the probabilities as an alternative to learning stochastic automata.


Exercises

• Construct a language L that is not k-reversible, ∀k≥0.

• Prove that the class of k-reversible languages is not in TxtEx.

• Run φk-RL on X={aa, aba, abb, abaaba, baaba} for k = 0, 1, 2, 3.

Solution (idea)

• Lk = {aⁱ : i ≤ k}
• Then for each k: Lk is k-reversible but not (k−1)-reversible.
• And ∪k Lk = a*

• So there is an accumulation point…

Algorithms: RPNI, K-Reversible, GRIDS, SEQUITUR, L*

4.3 Active learning: learning DFA from membership and equivalence queries (the L* algorithm)

The classes C and H

• sets of examples;
• representations of these sets;
• the computation of L(x) (and h(x)) must take place in time polynomial in │x│.


Correct learning

A class C is identifiable with a polynomial number of queries of type T if there exists an algorithm φ that:

1) ∀L∈C identifies L with a polynomial number of queries of type T;

2) does each update in time polynomial in │f│ and in Σ│xi│, where {xi} are the counter-examples seen so far.

Algorithm L*

• Angluin’s papers

• Some talks by Rivest

• Kearns and Vazirani

• Balcazar, Diaz, Gavaldà & Watanabe

Some references

• Learning regular sets from queries and counter-examples, D. Angluin, Information and Computation, 75, 87-106, 1987.

• Queries and Concept learning, D. Angluin, Machine Learning, 2, 319-342, 1988.

• Negative results for Equivalence Queries, D. Angluin, Machine Learning, 5, 121-150, 1990.


The Minimal Adequate Teacher

• You are allowed:

– strong equivalence queries;

– membership queries.

General idea of L*

• find a closed and consistent table (representing a DFA);

• submit it as an equivalence query;

• use counterexample to update the table;

• submit membership queries to make the table complete;

• Iterate.
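The two predicates that drive this loop, closedness and consistency, follow directly from their definitions. A small sketch (the names `row`, `mem`, etc. are ours; `mem` stands for the membership oracle):

```python
def row(u, E, mem):
    """The row of prefix u: membership of u·e for every experiment e in E."""
    return tuple(mem(u + e) for e in E)

def is_closed(S, T, E, mem):
    """Closed: every row of T already appears as a row of S."""
    s_rows = {row(s, E, mem) for s in S}
    return all(row(t, E, mem) in s_rows for t in T)

def is_consistent(S, E, mem, alphabet):
    """Consistent: equal rows of S stay equal after appending any symbol."""
    return all(row(s1 + a, E, mem) == row(s2 + a, E, mem)
               for s1 in S for s2 in S
               if row(s1, E, mem) == row(s2, E, mem)
               for a in alphabet)

# Toy oracle: strings over {a, b} that end in 'a'.
mem = lambda w: w.endswith("a")
S, E = ["", "a"], [""]
T = [s + a for s in S for a in "ab" if s + a not in S]
print(is_closed(S, T, E, mem), is_consistent(S, E, mem, "ab"))  # → True True
```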

An observation table

[Table: rows are indexed by prefixes (S above, T below), columns by experiments E; each cell holds 1 iff the concatenated string belongs to L.]


The same table, with its regions labelled: the states (S) or test set; the transitions (T); the experiments (E).

Meaning

δ(q0, λ·λ) ∈ F ⇔ λ ∈ L

[In the table, the cell at row λ, column λ holds 1, so λ ∈ L.]

δ(q0, ab·a) ∉ F ⇔ aba ∉ L

[In the table, the cell at row ab, column a holds 0, so aba ∉ L.]

Equivalent prefixes

These two rows are equal, hence δ(q0, λ) = δ(q0, ab).

[Table with the rows for λ and ab highlighted.]

Building a DFA from a table

[The table and the DFA being read off it: one state per distinct row of S.]

[The DFA read off the table: its states are the distinct rows of S, and its a- and b-transitions are given by the rows of T.]


Some rules

• the set S is prefix-closed;
• the set E is suffix-closed;
• SΣ \ S = T.

[Table annotated with these constraints.]

An incomplete table

[Table with missing cells: the corresponding DFA cannot yet be built.]

Good idea

We can complete the table by making membership queries: for a row u and an experiment v, ask the oracle whether uv ∈ L.

A table is closed if every row of T corresponds to some row in S.

[Table in which some row of T matches no row of S: not closed.]

And a table that is not closed

[The table and its partial DFA: the unmatched row of T leaves a transition with no target state.]

What do we do when we have a table that is not closed?

• Let s be the row (of T) that does not appear in S.
• Add s to S and, ∀a∈Σ, add sa to T.


An inconsistent table

[Table in which the rows for a and b are equal, yet some extensions of a and b have different rows: are a and b equivalent?]

A table is consistent if every pair of equivalent rows of S remains equivalent after appending any symbol:

row(s1) = row(s2) ⇒ ∀a∈Σ, row(s1a) = row(s2a)

What do we do when we have an inconsistent table?

• Let a∈Σ be such that row(s1) = row(s2) but row(s1a) ≠ row(s2a).
• If row(s1a) ≠ row(s2a), the two rows differ on some experiment e.
• Then add the experiment ae to the table.

What do we do when we have a closed and consistent table?

• We build the corresponding DFA

• We make an equivalence query!!!

What do we do if we get a counter-example?

• Let u be this counter-example.
• ∀w∈Pref(u):
– add w to S;
– ∀a∈Σ such that wa∉Pref(u), add wa to T.

Run of the algorithm

[Initial table: S = {λ}, T = {a, b}, E = {λ}, all entries 1. The table is closed and consistent; the conjectured DFA has a single accepting state with a- and b-loops.]


An equivalence query is made!

[The one-state DFA is submitted; the counter-example baa is returned.]

[The prefixes of baa are added to S and the table is completed with membership queries; the resulting table is not consistent.]

[Adding the distinguishing experiment makes the table closed and consistent; the final DFA has states λ, ba and baa.]
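Putting the pieces of the run together: below is our own compact rendering of the L* loop (a sketch, not Angluin's original pseudo-code). `mem` is the membership oracle; the equivalence oracle is simulated by brute-force comparison with the target on all strings up to a fixed length, which is only feasible for toy targets.

```python
from itertools import product

def lstar(mem, equiv, alphabet):
    """Sketch of L*: close the table, make it consistent, conjecture, refine."""
    S, E = [""], [""]

    def row(u):
        return tuple(mem(u + e) for e in E)

    while True:
        while True:
            T = [s + a for s in S for a in alphabet if s + a not in S]
            # Closedness: every row of T must appear among the rows of S.
            t = next((t for t in T if row(t) not in map(row, S)), None)
            if t is not None:
                S.append(t)
                continue
            # Consistency: equal rows of S must stay equal after any symbol.
            bad = next(((s1, s2, a) for s1 in S for s2 in S for a in alphabet
                        if row(s1) == row(s2) and row(s1 + a) != row(s2 + a)),
                       None)
            if bad is not None:
                s1, s2, a = bad
                e = next(e for e in E if mem(s1 + a + e) != mem(s2 + a + e))
                E.append(a + e)          # the distinguishing experiment
                continue
            break
        # Conjecture: one DFA state per distinct row.
        states = {row(s): s for s in S}
        delta = {(row(s), a): row(s + a) for s in S for a in alphabet}

        def accepts(w, q0=row("")):
            q = q0
            for a in w:
                q = delta[(q, a)]
            return mem(states[q])    # accepting iff its representative is in L

        cex = equiv(accepts)
        if cex is None:
            return accepts
        for i in range(1, len(cex) + 1):   # add every prefix of the cex to S
            if cex[:i] not in S:
                S.append(cex[:i])

# Toy target: even number of a's; brute-force equivalence up to length 6.
target = lambda w: w.count("a") % 2 == 0
def equiv(h):
    for n in range(7):
        for w in map("".join, product("ab", repeat=n)):
            if h(w) != target(w):
                return w
    return None

acc = lstar(target, equiv, "ab")
print(acc(""), acc("aba"), acc("ab"))   # → True True False
```

On this easy target the very first conjecture is already correct; richer targets exercise the counter-example branch, as in the run above.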

Proof of the algorithm

Sketch only. Understanding the proof is important for further algorithms; Balcazar et al. is a good place to start.

Termination / Correctness

• For every regular language there is a unique minimal DFA that recognizes it.

• Given a closed and consistent table, one can generate a consistent DFA.

• A DFA consistent with a table has at least as many states as different rows in S.

• If the algorithm has built a table with n different rows in S, then it is the target.


Finiteness

• Each closure failure adds one different row to S.

• Each inconsistency failure adds one experiment, which also creates a new row in S.

• Each counterexample adds one different row to S.


Polynomial

• |E| ≤ n

• at most n-1 equivalence queries

• the number of membership queries is at most n(n-1)m, where m is the length of the longest counter-example returned by the oracle

Conclusion

• With an MAT you can learn DFA
– but also a variety of other classes of grammars;
– it is difficult to see how powerful an MAT really is;
– probably as much as PAC learning;
– it is easy to find a class and a set of queries, and to provide an algorithm that learns with them;
– it is more difficult for the result to be meaningful.

• Discussion: why are these queries meaningful?


4.4 SEQUITUR

(http://sequence.rutgers.edu/sequitur/)

(Nevill-Manning & Witten, 97)

Idea: construct a CF grammar G from a very long string w, such that L(G) = {w}
– No generalization

– Linear time (+/-)

– Good compression rates

Principle

Invariants of the grammar with respect to the string:
– each rule has to be used at least twice;
– no sub-string of length 2 (digram) appears twice.

Examples

S → abcdbc   becomes   S → aAdA, A → bc

S → aabaaab   becomes   S → AbAab, A → aa   (or S → AaA, A → aab)


abcabdabcabd
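SEQUITUR enforces digram uniqueness online, symbol by symbol. As an offline illustration of the same invariant, one can repeatedly replace the most frequent repeated digram with a fresh non-terminal (this is the Re-Pair idea, not Nevill-Manning and Witten's actual algorithm). On the string above it produces a tiny grammar:

```python
from collections import Counter

def digram_grammar(w):
    """Offline sketch of digram uniqueness: replace the most frequent
    repeated digram by a fresh non-terminal until no digram repeats.
    (Counting overlapping digrams is good enough for this sketch.)"""
    rules, s = {}, list(w)
    fresh = iter("ABCDEFGHIJKLMNOPQRSTUVWXYZ")
    while True:
        counts = Counter(zip(s, s[1:]))
        if not counts:
            return s, rules
        pair, n = counts.most_common(1)[0]
        if n < 2:
            return s, rules
        nt = next(fresh)
        rules[nt] = list(pair)
        out, i = [], 0
        while i < len(s):                 # left-to-right replacement
            if i + 1 < len(s) and (s[i], s[i + 1]) == pair:
                out.append(nt)
                i += 2
            else:
                out.append(s[i])
                i += 1
        s = out

def expand(sym, rules):
    """Terminal string derived from one symbol."""
    if sym in rules:
        return "".join(expand(t, rules) for t in rules[sym])
    return sym

s, rules = digram_grammar("abcabdabcabd")
print("".join(s), {nt: "".join(rhs) for nt, rhs in rules.items()})
assert "".join(expand(t, rules) for t in s) == "abcabdabcabd"
```

Unlike this sketch, the real SEQUITUR also enforces rule utility: it inlines and deletes any rule that ends up used only once.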


In the beginning, God created the heavens and the earth.

And the earth was without form, and void; and darkness was upon the face of the deep. And the Spirit of God moved upon the face of the waters.

And God said, Let there be light: and there was light.

And God saw the light, that it was good: and God divided the light from the darkness.

And God called the light Day, and the darkness he called Night. And the evening and the morning were the first day.

And God said, Let there be a firmament in the midst of the waters, and let it divide the waters from the waters.

And God made the firmament, and divided the waters which were under the firmament from the waters which were above the firmament: and it was so.

And God called the firmament Heaven. And the evening and the morning were the second day.


Sequitur options

• appending a symbol to rule S;
• using an existing rule;
• creating a new rule;
• and deleting a rule.

Results

On text:
– SEQUITUR: 2.82 bpc
– compress: 3.46 bpc
– gzip: 3.25 bpc
– PPMC: 2.52 bpc



4.5 Using a simplicity bias (Langley & Stromsten, 00)

Based on algorithm GRIDS (Wolff, 82)

Main characteristics:
– MDL principle;

– Not characterizable;

– Not tested on large benchmarks.

Two learning operators

Creation of non-terminals and rules:

NP → ART ADJ NOUN
NP → ART ADJ ADJ NOUN

becomes

NP → ART AP1
NP → ART ADJ AP1
AP1 → ADJ NOUN

Merging two non-terminals:

NP → ART AP1
NP → ART AP2
AP1 → ADJ NOUN
AP2 → ADJ AP1

becomes

NP → ART AP1
AP1 → ADJ NOUN
AP1 → ADJ AP1

• Scoring function (MDL principle): │G│ + Σw∈T │d(w)│

• Algorithm:
– find the best merge that improves the current grammar;
– if no such merge exists, find the best creation;
– halt when there is no improvement.
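The scoring function can be made concrete with a toy encoding: one unit per grammar symbol for │G│ and a caller-supplied derivation length │d(w)│ per example. The grammars below are the hypothetical NP fragments from the preceding slides; the exact costs are our own arbitrary choices, and only the shape of the trade-off matters.

```python
def mdl_score(grammar, sample, derivation_len):
    """MDL score: |G| + sum of |d(w)| over the sample.

    grammar: list of (lhs, rhs) productions; |G| counts every symbol.
    derivation_len: |d(w)|, the cost of deriving w under the grammar.
    """
    size_g = sum(1 + len(rhs) for _, rhs in grammar)
    return size_g + sum(derivation_len(w) for w in sample)

# Before and after the creation operator (NP grammars from the slides).
g1 = [("NP", ["ART", "ADJ", "NOUN"]),
      ("NP", ["ART", "ADJ", "ADJ", "NOUN"])]
g2 = [("NP", ["ART", "AP1"]),
      ("NP", ["ART", "ADJ", "AP1"]),
      ("AP1", ["ADJ", "NOUN"])]
sample = ["the big dog", "the big red dog"]
# One derivation step per phrase under g1, two under g2 (toy costs).
print(mdl_score(g1, sample, lambda w: 1),
      mdl_score(g2, sample, lambda w: 2))   # → 11 14
```

An operator is applied only when it lowers this score; with a richer encoding (bits per symbol, probabilities) a shared rule such as AP1 can pay for itself on larger samples.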


Results

• On subsets of English grammars (15 rules, 8 non terminals, 9 terminals): 120 sentences to converge

• on (ab)*: all (15) strings of length ≤ 30

• on Dyck1: all (65) strings of length ≤ 12



5 Open questions and conclusions

• dealing with noise

• classes of languages that adequately mix Chomsky’s hierarchy with edit distance

• stochastic context-free grammars

• polynomial learning from text

• learning POMDPs

• fast algorithms…