Machine Learning Methods for Discovering ... - macs.hw.ac.ukek19/STP-Dundee.pdf · Katya (Dundee)...

51
Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work in Progress) Ekaterina Komendantskaya School of Computing, University of Dundee STP’11, 17 June 2011 Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work in Pr ARW’11 1 / 35

Transcript of Machine Learning Methods for Discovering ... - macs.hw.ac.ukek19/STP-Dundee.pdf · Katya (Dundee)...

Page 1: Machine Learning Methods for Discovering ... - macs.hw.ac.ukek19/STP-Dundee.pdf · Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work

Machine Learning Methods for Discovering Proof Tacticsin Automated Proofs (Work in Progress)

Ekaterina Komendantskaya

School of Computing, University of Dundee

STP’11,17 June 2011

Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work in Progress)ARW’11 1 / 35

Page 2: Machine Learning Methods for Discovering ... - macs.hw.ac.ukek19/STP-Dundee.pdf · Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work

Outline

1 Motivation and wide perspective

2 Machine Learning sequential proof tactics

3 Pattern-recognition in Coinductive Proof trees

4 Conclusions

Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work in Progress)ARW’11 2 / 35

Page 3: Machine Learning Methods for Discovering ... - macs.hw.ac.ukek19/STP-Dundee.pdf · Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work

Outline

1 Motivation and wide perspective

2 Machine Learning sequential proof tactics

3 Pattern-recognition in Coinductive Proof trees

4 Conclusions

Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work in Progress)ARW’11 2 / 35

Page 4: Machine Learning Methods for Discovering ... - macs.hw.ac.ukek19/STP-Dundee.pdf · Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work

Outline

1 Motivation and wide perspective

2 Machine Learning sequential proof tactics

3 Pattern-recognition in Coinductive Proof trees

4 Conclusions

Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work in Progress)ARW’11 2 / 35

Page 5: Machine Learning Methods for Discovering ... - macs.hw.ac.ukek19/STP-Dundee.pdf · Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work

Outline

1 Motivation and wide perspective

2 Machine Learning sequential proof tactics

3 Pattern-recognition in Coinductive Proof trees

4 Conclusions

Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work in Progress)ARW’11 2 / 35

Page 6: Machine Learning Methods for Discovering ... - macs.hw.ac.ukek19/STP-Dundee.pdf · Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work

Motivation and wide perspective

Proof in Coq - Demo.

Not all proofs can be done automatically. That is, automatic tacticsfail.

In bigger industrial proofs there might be thousands of lemmas andtheorems needing proofs, with about 5-20% needing programmer’sintervention.

Can statistical methods help us to analyse why these fail (The modelsof “Why”, Cliff Jones)? and perhaps even suggest combination oftactics to try instead?

Note on discovering “Why?s”: my experience with Coq in INRIA wasthat the experts often justify those “Why’s” statistically rather thanconceptually. (“Coq Art”)

Note on combinations of tactics, like intro, induction, simpl, etc.in Higher-order interactive provers. Even very complicated proofs useabout 50-100 tactics only; BUT their combinations can be veryclever. Again, is there room for statistical analysis?

Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work in Progress)ARW’11 3 / 35

Page 7: Machine Learning Methods for Discovering ... - macs.hw.ac.ukek19/STP-Dundee.pdf · Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work

Motivation and wide perspective

Overall feeling

These are hard questions to solve.

If solved, may change the way we look at proofs (automated andmanual)...

May enrich the methods of machine learning and proof theory.

Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work in Progress)ARW’11 4 / 35

Page 8: Machine Learning Methods for Discovering ... - macs.hw.ac.ukek19/STP-Dundee.pdf · Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work

Motivation and wide perspective

Lessons learnt:

I have tried to implement several logic algorithms in Neural nets

Semantic operators for first-order logic programs and many-valuedlogic programs;

First-order Unification algorithm;

First-order term-rewriting;

Inductive definitions akin inductive dependent types.

Overall conclusion

It is inefficient to apply statistical methods to logic algorithms “as theyare” — because statistical methods cannot compete with logical methodson the same grounds. Where can they compete?

Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work in Progress)ARW’11 5 / 35

Page 9: Machine Learning Methods for Discovering ... - macs.hw.ac.ukek19/STP-Dundee.pdf · Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work

Motivation and wide perspective

Why is machine learning UN-suitable for Formal methods:

Many logic algorithms have a precise, rather than statistical nature.

Example

Two formulae list(x) and list(nil) are unifiable: x/nil. We meanexactly this, and do not want it to be substituted by some approximatesuch as nol. (Although humans would tolerate this mis-spelling had itappeared in a written text...)

Many important logic algorithms are sequential, e.g. unification.

Example

If I have a goal: list(cons(x,y)) ∧ list(x), my proof will neversucceed — x will get substituted by some nat term, e.g. O or S(O), whichwill make the second formula invalid. Note that the proof would havesucceed had it been concurrent.

Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work in Progress)ARW’11 6 / 35

Page 10: Machine Learning Methods for Discovering ... - macs.hw.ac.ukek19/STP-Dundee.pdf · Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work

Motivation and wide perspective

Why is machine learning UN-suitable for Formal methods:

Many logic algorithms have a precise, rather than statistical nature.

Example

Two formulae list(x) and list(nil) are unifiable: x/nil. We meanexactly this, and do not want it to be substituted by some approximatesuch as nol. (Although humans would tolerate this mis-spelling had itappeared in a written text...)

Many important logic algorithms are sequential, e.g. unification.

Example

If I have a goal: list(cons(x,y)) ∧ list(x), my proof will neversucceed — x will get substituted by some nat term, e.g. O or S(O), whichwill make the second formula invalid. Note that the proof would havesucceed had it been concurrent.

Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work in Progress)ARW’11 6 / 35

Page 11: Machine Learning Methods for Discovering ... - macs.hw.ac.ukek19/STP-Dundee.pdf · Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work

Motivation and wide perspective

Machine learning may yet be useful in Logic...

...on several conditions

we either use concurrent algorithms, or re-formulate existingalgorithms into concurrent form,

we are clear about what sort of statistical features in automatedproofs we wish to learn.

Boom in research into coalgebraic methods

... by this I mean coalgebra in either category theory sense, logicsense (e.g modal logic), or computational sense (as in ITPs).

in most cases - considers concurrent models; e.g. Milner’s CCS andπ-calculus;

because it is devoted to infinite processes/proofs/objects/. . . — it isbased on the idea of repeating patterns (rather than final answers orterminating computations). E.g., the notion of productiveness. Thesehave good chances of yielding statistical analysis.

Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work in Progress)ARW’11 7 / 35

Page 12: Machine Learning Methods for Discovering ... - macs.hw.ac.ukek19/STP-Dundee.pdf · Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work

Motivation and wide perspective

Machine learning may yet be useful in Logic...

...on several conditions

we either use concurrent algorithms, or re-formulate existingalgorithms into concurrent form,

we are clear about what sort of statistical features in automatedproofs we wish to learn.

Boom in research into coalgebraic methods

... by this I mean coalgebra in either category theory sense, logicsense (e.g modal logic), or computational sense (as in ITPs).

in most cases - considers concurrent models; e.g. Milner’s CCS andπ-calculus;

because it is devoted to infinite processes/proofs/objects/. . . — it isbased on the idea of repeating patterns (rather than final answers orterminating computations). E.g., the notion of productiveness. Thesehave good chances of yielding statistical analysis.Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work in Progress)ARW’11 7 / 35

Page 13: Machine Learning Methods for Discovering ... - macs.hw.ac.ukek19/STP-Dundee.pdf · Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work

Motivation and wide perspective

Plan for the talk

In this talk, I will try to demonstrate this idea:

On example of logic programs:

I will show sequential proofs analysed by machine learning methods;

Show how they can be transformed into coinductive proofs;

Compare their efficiency

Hopefully — will persuade you that this example proofs the concept.

Logic programming is a simplified model compared to modern ITPs— but it is elegant, simple and yet we can learn a lot from it;

The coinductive proofs part of my talk is based on joint work withJ.Power, published in CALCO’11, CSL’11 - available on my webpage.

Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work in Progress)ARW’11 8 / 35

Page 14: Machine Learning Methods for Discovering ... - macs.hw.ac.ukek19/STP-Dundee.pdf · Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work

Motivation and wide perspective

Plan for the talk

In this talk, I will try to demonstrate this idea:

On example of logic programs:

I will show sequential proofs analysed by machine learning methods;

Show how they can be transformed into coinductive proofs;

Compare their efficiency

Hopefully — will persuade you that this example proofs the concept.

Logic programming is a simplified model compared to modern ITPs— but it is elegant, simple and yet we can learn a lot from it;

The coinductive proofs part of my talk is based on joint work withJ.Power, published in CALCO’11, CSL’11 - available on my webpage.

Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work in Progress)ARW’11 8 / 35

Page 15: Machine Learning Methods for Discovering ... - macs.hw.ac.ukek19/STP-Dundee.pdf · Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work

Machine Learning sequential proof tactics

First-order syntax

We fix the alphabet A to consist of

Signature Σ:

constant symbols a, b, c , possibly with finite subscripts;function symbols f , g , h, f1 . . ., with arities;

variables x , y , z , x1 . . .;

predicate symbols P,Q,R,P1,P2 . . ., with arities;

Term:t = a | x | f (t)

Atomic Formula: At = P(t1, . . . , tn), where P is a predicate symbol ofarity n and ti is a term.

Example

Signature: {O,S} - for natural numbers; {a,b,c} for a graph with 3 nodes.Atoms: nat(x), edge(a,b), etc...

Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work in Progress)ARW’11 9 / 35

Page 16: Machine Learning Methods for Discovering ... - macs.hw.ac.ukek19/STP-Dundee.pdf · Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work

Machine Learning sequential proof tactics

Logic programs

A first-order logic program consists of a finite set of clauses of the form

A← A1, . . . ,An

where A and the Ai ’s are atomic formulae, typically containing freevariables; and A1, . . . ,An is to mean the conjunction of the Ai ’s.

Definition

Let a goal G be ← A1, . . . ,Am, . . . ,Ak and a clause C beA← B1, . . . ,Bq. Then G ′ is derived from G and C using mgu θ if thefollowing conditions hold:

• Am is an atom, called the selected atom, in G .

• θ is an mgu of Am and A.

• G ′ is the goal ← (A1, . . . ,Am−1,B1, . . . ,Bq,Am+1, . . . ,Ak)θ.

Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work in Progress)ARW’11 10 / 35

Page 17: Machine Learning Methods for Discovering ... - macs.hw.ac.ukek19/STP-Dundee.pdf · Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work

Machine Learning sequential proof tactics

Logic program: Graph connectivity

Example

connected(x,x) ←connected(x,y) ← edge(x,z),

connected(z,y).

edge(a,b) ←edge(b,c) ←edge(c,a) ←

a

b c

Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work in Progress)ARW’11 11 / 35

Page 18: Machine Learning Methods for Discovering ... - macs.hw.ac.ukek19/STP-Dundee.pdf · Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work

Machine Learning sequential proof tactics

Example: Goal ← connected(a, c).

Example

connected(x,x) ←connected(x,y) ← edge(x,z),

connected(z,y)

edge(a,b) ←edge(b,c) ←edge(c,a) ←

← connected(a, c)

← edge(a, x), connected(x, c)

Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work in Progress)ARW’11 12 / 35

Page 19: Machine Learning Methods for Discovering ... - macs.hw.ac.ukek19/STP-Dundee.pdf · Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work

Machine Learning sequential proof tactics

Example: Goal ← connected(a, c).

Example

connected(x,x) ←connected(x,y) ← edge(x,z),

connected(z,y)

edge(a,b) ←edge(b,c) ←edge(c,a) ←

← connected(a, c)

← edge(a, x), connected(x, c)

← connected(b, c)

Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work in Progress)ARW’11 13 / 35

Page 20: Machine Learning Methods for Discovering ... - macs.hw.ac.ukek19/STP-Dundee.pdf · Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work

Machine Learning sequential proof tactics

Example: Goal ← connected(a, c).

Example

connected(x,x) ←connected(x,y) ← edge(x,z),

connected(z,y)

edge(a,b) ←edge(b,c) ←edge(c,a) ←

← connected(a, c)

← edge(a, b), connected(b, c)

← connected(b, c)

← edge(b, x), connected(x, c)

Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work in Progress)ARW’11 14 / 35

Page 21: Machine Learning Methods for Discovering ... - macs.hw.ac.ukek19/STP-Dundee.pdf · Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work

Machine Learning sequential proof tactics

Example: Goal ← connected(a, c).

Example

connected(x,x) ←connected(x,y) ← edge(x,z),

connected(z,y).

edge(a,b) ←edge(b,c) ←edge(c,a) ←

← connected(a, c)

← edge(a, b), connected(b, c)

← connected(b, c)

← edge(b, x), connected(x, c)

← connected(x, c)

Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work in Progress)ARW’11 15 / 35

Page 22: Machine Learning Methods for Discovering ... - macs.hw.ac.ukek19/STP-Dundee.pdf · Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work

Machine Learning sequential proof tactics

Example: Goal ← connected(a, c).

Example

connected(x,x) ←connected(x,y) ← edge(x,z),

connected(z,y).

edge(a,b) ←edge(b,c) ←edge(c,a) ←

← connected(a, c)

← edge(a, b), connected(b, c)

← connected(b, c)

← edge(b, x), connected(x, c)

← connected(x, c)

Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work in Progress)ARW’11 16 / 35

Page 23: Machine Learning Methods for Discovering ... - macs.hw.ac.ukek19/STP-Dundee.pdf · Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work

Machine Learning sequential proof tactics

Recursion and Corecursion in Logic Programming

Example

nat(0) ←nat(s(x)) ← nat(x)

list(nil) ←list(cons x y) ← nat(x), list(y)

Example

bit(0) ←bit(1) ←

stream(cons (x,y)) ← bit(x), stream(y)

Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work in Progress)ARW’11 17 / 35

Page 24: Machine Learning Methods for Discovering ... - macs.hw.ac.ukek19/STP-Dundee.pdf · Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work

Machine Learning sequential proof tactics

Machine-learn patterns of sequential proofs

This example mimics similar research done for ITPs in the PhD thesis ofHazel Duncan, University of Edinburgh.

Example

1. nat(0)←2. nat(s(x))← nat(x)

3. list(nil)←4. list(cons x y)← nat(x), list(y)

Given correct and incorrect sequences of the numbers 1, 2, 3, 4, 5, 6 howlikely is it that we can train a neural network to recognise correct andincorrect proofs?

Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work in Progress)ARW’11 18 / 35

Page 25: Machine Learning Methods for Discovering ... - macs.hw.ac.ukek19/STP-Dundee.pdf · Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work

Machine Learning sequential proof tactics

Example

Correct

4, 1, 4, 1, 5.

list(cons(x, cons(y, x)))

nat(x), list(cons(y, x))

nat(y), list(0)

list(0)

Incorrect

4, 1, 4, 1, 6.

list(cons(x, cons(y, x)))

nat(x), list(cons(y, x))

nat(y), list(0)

list(0)

Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work in Progress)ARW’11 19 / 35

Page 26: Machine Learning Methods for Discovering ... - macs.hw.ac.ukek19/STP-Dundee.pdf · Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work

Machine Learning sequential proof tactics

Example

Correct

4, 1, 4, 1, 5.

list(cons(x, cons(y, x)))

nat(x), list(cons(y, x))

nat(y), list(0)

list(0)

Incorrect

4, 1, 4, 1, 6.

list(cons(x, cons(y, x)))

nat(x), list(cons(y, x))

nat(y), list(0)

list(0)

Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work in Progress)ARW’11 19 / 35

Page 27: Machine Learning Methods for Discovering ... - macs.hw.ac.ukek19/STP-Dundee.pdf · Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work

Machine Learning sequential proof tactics

Example 2

Correct

4, 1, 4, 1, 5.

list(cons(x, cons(y, x)))

nat(x), list(cons(y, x))

nat(y), list(0)

list(0)

Incorrect

4, 1, 4, 1, 5.

list(cons(x, cons(y, z)))

nat(x), list(cons(y, x))

nat(y), list(z)

list(z)

Conclusion

If the size of the training data set (= set of the examples used fortraining) is big and representative, derivations with “tactics” 4, 1, 4, 1, 5are equally likely to be correct and incorrect.

Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work in Progress)ARW’11 20 / 35

Page 28: Machine Learning Methods for Discovering ... - macs.hw.ac.ukek19/STP-Dundee.pdf · Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work

Machine Learning sequential proof tactics

Example 2

Correct

4, 1, 4, 1, 5.

list(cons(x, cons(y, x)))

nat(x), list(cons(y, x))

nat(y), list(0)

list(0)

Incorrect

4, 1, 4, 1, 5.

list(cons(x, cons(y, z)))

nat(x), list(cons(y, x))

nat(y), list(z)

list(z)

Conclusion

If the size of the training data set (= set of the examples used fortraining) is big and representative, derivations with “tactics” 4, 1, 4, 1, 5are equally likely to be correct and incorrect.

Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work in Progress)ARW’11 20 / 35

Page 29: Machine Learning Methods for Discovering ... - macs.hw.ac.ukek19/STP-Dundee.pdf · Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work

Machine Learning sequential proof tactics

What are the coinductive trees?

Lesson learnt

The sequence of deductive rules alone does not help; and actuallyhides the structure of the proof.

We need more “structural” representation of proofs.

Coinductive trees:

They arose from coalgebraic semantics for derivations in logicprograms, [Komendantskaya,Power CALCO’2011].

They offer a proof method for recursive and corecursive logicprograms.

They also allow for concurrency.

They offer very structured approach to automated proofs, as we willsee shortly.

Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work in Progress)ARW’11 21 / 35

Page 30: Machine Learning Methods for Discovering ... - macs.hw.ac.ukek19/STP-Dundee.pdf · Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work

Pattern-recognition in Coinductive Proof trees

Algebraic and coalgebraic semantics for LP

FiniteSLD-derivations

Least fixedpoint of TP

Algebraicfibrationalsemantics

Finite and InfiniteSLD-derivations

Greatest fixedpoint of TP

CoalgebraicfibrationalsemanticsCC

��

[[

��

[[ CC

��

Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work in Progress)ARW’11 22 / 35

Page 31: Machine Learning Methods for Discovering ... - macs.hw.ac.ukek19/STP-Dundee.pdf · Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work

Pattern-recognition in Coinductive Proof trees

Definition of coinductive trees

Definition

Let P be a logic program and G =← A be an atomic goal. Thecoinductive derivation tree for A is a possibly infinite tree T satisfying thefollowing properties.

A is the root of T .

Each node in T is either an and-node or an or-node.

Each or-node is given by •.Each and-node is an atom.

For every and-node A′ occurring in T , there exist exactly m > 0distinct clauses C1, . . . ,Cm in P (a clause Ci has the formBi ← B i

1, . . . ,Bini

, for some ni ), such that A′ = B1θ1 = ... = Bmθm,for some substitutions θ1, . . . , θm, then A′ has exactly m childrengiven by or-nodes, such that, for every i ∈ m, the ith or-node has nchildren given by and-nodes B i

1θi , . . . ,Biniθi .

Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work in Progress)ARW’11 23 / 35

Page 32: Machine Learning Methods for Discovering ... - macs.hw.ac.ukek19/STP-Dundee.pdf · Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work

Pattern-recognition in Coinductive Proof trees

Examples of first-order coinductive trees:

Correct

list(cons(x, cons(y, x)))

nat(x) list(cons(y, x))

nat(y) list(x)

Incorrect

list(cons(x, cons(y, x)))

nat(x)

list(cons(y, x))

nat(y)

list(x)

Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work in Progress)ARW’11 24 / 35

Page 33: Machine Learning Methods for Discovering ... - macs.hw.ac.ukek19/STP-Dundee.pdf · Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work

Pattern-recognition in Coinductive Proof trees

Examples of first-order coinductive trees:

Correct

list(cons(x, cons(y, x)))

nat(x) list(cons(y, x))

nat(y) list(x)

Incorrect

list(cons(x, cons(y, x)))

nat(x)

list(cons(y, x))

nat(y)

list(x)

Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work in Progress)ARW’11 24 / 35

Page 34: Machine Learning Methods for Discovering ... - macs.hw.ac.ukek19/STP-Dundee.pdf · Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work

Pattern-recognition in Coinductive Proof trees

Representation in vectors

list nat • �cons(x, cons(y, x)) - |cons(x, cons(y, x))| 0 2 0

cons(y, x)) - |cons(y, x))| 0 2 0

x -1 -1 1 0

y -1 0 1 0

z 0 0 0 0

And then it is further flattened into a vector.

Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work in Progress)ARW’11 25 / 35

Page 35: Machine Learning Methods for Discovering ... - macs.hw.ac.ukek19/STP-Dundee.pdf · Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work

Pattern-recognition in Coinductive Proof trees

A matrix of such vectors forms an input to the neuralnetwork:

It’s a two-layer feed-forward network, with sigmoid hidden and outputneurons, that can classify vectors arbitrarily well, given enough neurons inits hidden layer.The network was trained with scaled conjugate gradient back-propagation.

Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work in Progress)ARW’11 26 / 35

Page 36: Machine Learning Methods for Discovering ... - macs.hw.ac.ukek19/STP-Dundee.pdf · Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work

Pattern-recognition in Coinductive Proof trees

More examples coinductive trees:

Correct

list(cons(s(z), cons(s(z), s(z))))

nat(s(z))

nat(z)

list(cons(s(z), s(z)))

nat(s(z))

nat(z)

list(s(z))

Incorrect

list(cons(s(z), cons(s(z), s(z))))

nat(s(z))

nat(z)

list(cons(s(z), s(z)))

nat(s(z))

nat(z)

list(s(z))

Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work in Progress)ARW’11 27 / 35

Page 37: Machine Learning Methods for Discovering ... - macs.hw.ac.ukek19/STP-Dundee.pdf · Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work

Pattern-recognition in Coinductive Proof trees

More examples coinductive trees:

Correct

list(cons(s(z), cons(s(z), s(z))))

nat(s(z))

nat(z)

list(cons(s(z), s(z)))

nat(s(z))

nat(z)

list(s(z))

Incorrect

list(cons(s(z), cons(s(z), s(z))))

nat(s(z))

nat(z)

list(cons(s(z), s(z)))

nat(s(z))

nat(z)

list(s(z))

Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work in Progress)ARW’11 27 / 35

Page 38: Machine Learning Methods for Discovering ... - macs.hw.ac.ukek19/STP-Dundee.pdf · Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work

Pattern-recognition in Coinductive Proof trees

More examples of coinductive trees:

Correct

list(cons(s(O), cons(s(O), nil)))

nat(s(O))

nat(O)

list(cons(s(O), nil))

nat(s(O))

nat(O)

list(nil)

Incorrect

list(cons(s(O), cons(s(O), nil)))

nat(s(0)) list(cons(s(O), nil))

nat(s(O))

nat(O)

list(nil)

Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work in Progress)ARW’11 28 / 35

Page 39: Machine Learning Methods for Discovering ... - macs.hw.ac.ukek19/STP-Dundee.pdf · Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work

Pattern-recognition in Coinductive Proof trees

More examples of coinductive trees:

Correct

list(cons(s(O), cons(s(O), nil)))

nat(s(O))

nat(O)

list(cons(s(O), nil))

nat(s(O))

nat(O)

list(nil)

Incorrect

list(cons(s(O), cons(s(O), nil)))

nat(s(0)) list(cons(s(O), nil))

nat(s(O))

nat(O)

list(nil)

Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work in Progress)ARW’11 28 / 35

Page 40: Machine Learning Methods for Discovering ... - macs.hw.ac.ukek19/STP-Dundee.pdf · Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work

Pattern-recognition in Coinductive Proof trees

Test examples: Is this a correct CPT?

nat(cons(s(O), cons(s(O), nil)))

Yes.

Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work in Progress)ARW’11 29 / 35

Page 41: Machine Learning Methods for Discovering ... - macs.hw.ac.ukek19/STP-Dundee.pdf · Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work

Pattern-recognition in Coinductive Proof trees

Test examples: Is this a correct CPT?

nat(cons(s(O), cons(s(O), nil)))

Yes.

Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work in Progress)ARW’11 29 / 35

Page 42: Machine Learning Methods for Discovering ... - macs.hw.ac.ukek19/STP-Dundee.pdf · Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work

Pattern-recognition in Coinductive Proof trees

Test examples: Is this a correct CPT?

list(cons(s(O), cons(s(O), nil)))

nat(s(0))

nat(O)

list(cons(s(O), nil))

nat(s(O))

nat(O)

list(nil)

No.

Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work in Progress)ARW’11 30 / 35

Page 43: Machine Learning Methods for Discovering ... - macs.hw.ac.ukek19/STP-Dundee.pdf · Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work

Pattern-recognition in Coinductive Proof trees

Test examples: Is this a correct CPT?

list(cons(s(O), cons(s(O), nil)))

nat(s(0))

nat(O)

list(cons(s(O), nil))

nat(s(O))

nat(O)

list(nil)

No.

Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work in Progress)ARW’11 30 / 35

Page 44: Machine Learning Methods for Discovering ... - macs.hw.ac.ukek19/STP-Dundee.pdf · Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work

Pattern-recognition in Coinductive Proof trees

Test examples: Is this a correct CPT?

list(cons(s(nil), cons(O, nil)))

nat(s(nil))

nat(nil)

list(cons(O, nil))

nat(O)

list(nil)

Yes.

Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work in Progress)ARW’11 31 / 35

Page 45: Machine Learning Methods for Discovering ... - macs.hw.ac.ukek19/STP-Dundee.pdf · Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work

Pattern-recognition in Coinductive Proof trees

Test examples: Is this a correct CPT?

list(cons(s(nil), cons(O, nil)))

nat(s(nil))

nat(nil)

list(cons(O, nil))

nat(O)

list(nil)

Yes.

Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work in Progress)ARW’11 31 / 35

Page 46: Machine Learning Methods for Discovering ... - macs.hw.ac.ukek19/STP-Dundee.pdf · Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work

Pattern-recognition in Coinductive Proof trees

Test examples: Is this a correct CPT?

list(cons(s(O), cons(s(O), x)))

nat(s(0))

nat(O)

list(cons(s(O), x))

nat(s(O))

nat(O)

list(x)

No.

Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work in Progress)ARW’11 32 / 35

Page 47: Machine Learning Methods for Discovering ... - macs.hw.ac.ukek19/STP-Dundee.pdf · Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work

Pattern-recognition in Coinductive Proof trees

Test examples: Is this a correct CPT?

list(cons(s(O), cons(s(O), x)))

nat(s(0))

nat(O)

list(cons(s(O), x))

nat(s(O))

nat(O)

list(x)

No.

Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work in Progress)ARW’11 32 / 35

Page 48: Machine Learning Methods for Discovering ... - macs.hw.ac.ukek19/STP-Dundee.pdf · Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work

Conclusions

Our Result Versus Neural Nets result

AI4FM results — audience ofcomputer scientists with goodbackground in logic

Training set: 30 %Testing examples: 50 %.

Neural Network

Training set: - roughly 82%.(in the demo - 94%)Testing examples: - 85 — 100%.

Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work in Progress)ARW’11 33 / 35

Page 49: Machine Learning Methods for Discovering ... - macs.hw.ac.ukek19/STP-Dundee.pdf · Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work

Conclusions

Our Result Versus Neural Nets result

AI4FM results — audience ofcomputer scientists with goodbackground in logic

Training set: 30 %Testing examples: 50 %.

Neural Network

Training set: - roughly 82%.(in the demo - 94%)Testing examples: - 85 — 100%.

Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work in Progress)ARW’11 33 / 35

Page 50: Machine Learning Methods for Discovering ... - macs.hw.ac.ukek19/STP-Dundee.pdf · Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work

Conclusions

Conclusions:

Many proofs, especially by (co-)induction, especially usingconstructors (like nil or cons), share some common structure (=follow some patterns in machine learning terms) that can be detectedusing statistical learning;

We have tried to machine learn concurrent (coalgebraic) algorithms— e.g. coinductive proof trees [Komendantskaya,Power CALCO’2011and CSL’11].

Coalgebraic derivations look more promising than traditionalsequential derivations.

We have learned *both* from positive and negative examples.(models of “Why?”)

Big goal: move on to tactics in ITPs.

Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work in Progress)ARW’11 34 / 35

Page 51: Machine Learning Methods for Discovering ... - macs.hw.ac.ukek19/STP-Dundee.pdf · Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work

Conclusions

Questions?(Please contact me [email protected] if they arise later!)

Katya (Dundee) Machine Learning Methods for Discovering Proof Tactics in Automated Proofs (Work in Progress)ARW’11 35 / 35