Automated Search of Functions and Synthesis of Code

Bruno Miguel Carrajola Patrício

Thesis to obtain the Master of Science Degree in

Mathematics and Applications

Supervisor: Prof. José Félix Gomes da Costa

Examination Committee

Chairperson: Prof. Maria Cristina De Sales Viana Serôdio Sernadas

Supervisor: Prof. José Félix Gomes da Costa

Member of the Committee: Prof. Maria Paula Antunes Abrantes Gouveia

December 2019

Acknowledgments

First and foremost, I must start by thanking my family, especially my parents. Without their support, patience and sacrifice, not one step of my academic path could have happened, let alone this dissertation.

I would also like to thank my supervisor, Prof. José Félix da Costa, whose patience and constant availability to help made this dissertation possible.

And to all my friends: whether you accompanied me directly along this academic journey and shared with me blood, sweat and tears throughout the last five years (or even just a small part of them), or you did not and had to deal with everything that I went through, most of the time without understanding a word of what I was saying, there are no words that can express how much I thank you.


Resumo

The process of scientific discovery can be explained as a cycle that begins with the observation of the facts that surround us, models those observations into theories, makes predictions from those theories and then confronts those predictions with other observations, reinforcing or refuting the theories. Most of the time, the weak link in this chain of events is the inductive step from concrete facts to general theories, because it is not always easy for scientists to find these correlations. In this work we propose a solution to that problem: what if it were possible to automate this step and let computers perform it? This can be achieved if the empirical laws that scientists try to find are not only computable but also structurally simple. We explore this claim by relating these laws to the set of primitive recursive functions (and later to one of its subsets, the elementary functions), which allows us to present relatively simple automated scientists that would be an initial answer to this problem and a starting point for a more complete and serious solution for automating the inductive inference process.

Keywords: Scientific Discovery, Empirical Laws, Automated Scientists, Primitive Recursive Functions, Elementary Functions, Code Generation.


Abstract

The process of scientific discovery can be explained as a cycle that starts with observing the facts that surround us, models those observations into theories, makes predictions from those theories and then confronts them with other observations, reinforcing or disproving those theories. Most of the time, the weak link in this chain of events is the inductive step from concrete observed facts to general theories, because it is not always easy for scientists to find these correlations. In this work, we propose a solution to that problem: what if we could automate that step and allow computers to perform it? This can be achieved if the empirical laws that scientists try to find are not only computable, but also structurally simple. We explore this statement by relating these laws to the set of primitive recursive functions (and later on to a subset of it, the elementary functions), which allows us to present relatively simple automated scientists that would be an early response to this problem and a starting point towards a more serious and complete solution for the automation of the inductive inference process.

Keywords: Scientific Discovery, Empirical Laws, Automated Scientists, Primitive Recursive Functions, Elementary Functions, Code Generation.


Contents

Acknowledgments
Resumo
Abstract
List of Tables
List of Figures
List of Algorithms
Notation

1 Introduction

2 Learning Theory
  2.1 Computability
  2.2 Scientific methods

3 The Search Procedure
  3.1 Primitive Recursive Functions
  3.2 Notation for identification
  3.3 The search algorithm
  3.4 A first enumeration
  3.5 An improved enumeration
  3.6 From description to code

4 A Restriction to E
  4.1 Elementary functions
  4.2 Notation for representation
  4.3 The search algorithm
  4.4 Enumeration
  4.5 From description to code

5 Results
  5.1 First algorithm
  5.2 Second algorithm
  5.3 Third algorithm
  5.4 Analysis

6 Conclusions
  6.1 Achievements
  6.2 Future Work

Bibliography

A Functions tested and summarized results
  A.1 List of functions tested with the scientists
  A.2 Summarized results

B Implementation of the Algorithms

List of Tables

3.1 Primitive recursive functions and their corresponding descriptions
4.1 Elementary functions and their corresponding descriptions
A.1 Summary of the experiments made by the scientist related to the first algorithm
A.2 Summary of the experiments made by the scientist related to the second algorithm
A.3 Summary of the experiments made by the scientist related to the third algorithm

List of Figures

1.1 Common mathematical puzzle seen several times over social media
1.2 Diagram representing the scientific discovery process
3.1 List of all the functions whose descriptions have the referred size
3.2 Lists of descriptions with the referred size and arities
3.3 Code for function with description Z()
3.4 Code for function with description S()
3.5 Code for function with description P(3,1)
3.6 Code for function with description C(P(3,2),[P(1,1),S(),S()])
3.7 Code for function with description R(Z(),P(2,1))
4.1 Lists of descriptions for elementary functions with the referred size and arities
4.2 Code for function with description EA()
4.3 Code for function with description EM()
4.4 Code for function with description ET()
4.5 Code for function with description ED()
4.6 Code for function with description EBS(ES())
4.7 Code for function with description EBP(ES())
5.1 Results for the identity function searched by the first scientist
5.2 Results for the successor function after the projection of the first argument searched by the first scientist
5.3 Results for the second attempt for the addition function searched by the first scientist
5.4 Results for the second attempt for the subtraction function searched by the first scientist
5.5 Results for the repetition of the second attempt for the subtraction function searched by the first scientist
5.6 Results for the second attempt for the product function searched by the first scientist
5.7 Results for the second attempt for the product function searched by the second scientist
5.8 Results for the third attempt for the function $f(x, y) = (x + y) \mathbin{\dot{-}} 1$ searched by the second scientist
5.9 Results of the fourth attempt for the function $\mathrm{pred}(x) = x \mathbin{\dot{-}} 1$ searched by the third scientist
5.10 Results for the second attempt for the function f(x, y) = (x + y)x searched by the third scientist
5.11 Results regarding Ohm’s Law searched by the third scientist

List of Algorithms

1 Procedure to construct a text for ψ ∈ SD out of a scientist for AEZ
2 Recursive operator Φ used to prove the separation between $Ex^\star$ and $Bc$
3 Recursive operator Φ used to prove the separation between $Bc^n$ and $Bc^{n+1}$
4 Function that for a given prefix σ for a function ψ and a value x ∈ N outputs the value that the scientist thinks the function ψ has when applied to x
5 Scientist that Ex-identifies PRIM
6 Search algorithm for a primitive recursive function given the input/output values
7 Construction of a function list composed of functions with a given description size
8 Search algorithm for a primitive recursive function given the input/output values, taking the arity into account
9 Construction of a function list composed of functions with a given description size and arity
10 Function that indicates if a description is already in a list of descriptions
11 Scientist that Ex-identifies E
12 Search algorithm for an elementary function given the input/output values, taking the arity into account
13 Construction of a function list composed of elementary functions with a given description size and arity

Notation

Notation for undefined

The concatenation operation.

$\dot{-}$ The subtraction operation defined over the natural numbers, i.e. $x \mathbin{\dot{-}} y = \max(x - y, 0)$.

ψ A recursive function.

$\chi_C$ The characteristic map of the set C.

$\chi^p_C$ The characteristic function of the set C.

P A computer program.

e The computer program with code e.

$W_e$ The recursively enumerable set with code e.

R The set of recursive functions.

PRIM The set of the primitive recursive functions.

E The set of the elementary functions.

$P_{n,i}$ The n-ary projection function that returns the i-th element of the input tuple.

φ An enumeration of the recursive functions.

φ(e), $\varphi_e$ The recursive function with code e.

ρ An enumeration of the primitive recursive functions.

ρ(e), $\rho_e$ The primitive recursive function with code e.

π An enumeration of the elementary functions.

π(e), $\pi_e$ The elementary function with code e.

T A text for a generic function ψ.

T(n), $T_n$ The (n + 1)-th element of text T.

T[n] The first n elements of text T, i.e. $T_0, \ldots, T_{n-1}$.

T The set of all texts T.

σ A prefix of a text T.

σ(k), $\sigma_k$ The (k + 1)-th element of the prefix σ.

content(σ) The set of pairs ⟨n, ψ(n)⟩ in σ.

σ A partial function that, for a given n, returns m such that ⟨n, m⟩ is in prefix σ.

SEG The set of all prefixes for all texts.

INIT The set of the prefixes for texts in the canonical order.

M A scientist for functions.

$=^n$ A relation between two functions such that they differ in n points.

$=^\star$ A relation between two functions such that they differ in finitely many points.

$Ex$ The class of Ex-identifiable sets of functions.

$Ex^n$ The class of $Ex^n$-identifiable sets of functions.

$Ex^\star$ The class of $Ex^\star$-identifiable sets of functions.

$Bc$ The class of Bc-identifiable sets of functions.

$Bc^n$ The class of $Bc^n$-identifiable sets of functions.

$Bc^\star$ The class of $Bc^\star$-identifiable sets of functions.

m(i) ≤x Output of applying the program with code m to input i when it halts within x steps of computation.

Chapter 1

Introduction

The process of scientific discovery has been present at every scientific breakthrough, big or small. Starting from observable facts, it is this process that allows mankind to express the laws that govern nature as theories that explain them. These theories, when worked upon, can produce predictions about what will happen in the future, which can then be used either to reinforce the theories with which we began or to refute them, making room for different and more accurate theories to appear. We thus have a scientific discovery method that is a never-ending cycle of inducing theories from data, deducing predictions from theories and validating or rejecting those theories by checking the veracity of the predictions.

Even though this process is cyclical, it clearly has a beginning: the inductive step, which consists in identifying patterns in information collected from observations. Generally speaking, this induction process is constantly present in our daily life. For example, if our shirt has a stain, we can only guess (that is, infer) what the stain might be from its colour, texture and/or smell, and with that theory predict the best way to clean it; and if our car has a dent in the morning, all we can do is imagine (i.e. infer), from the analysis of the dent, the possible scenarios that could have taken place over the night. Sometimes we are even presented with more “mathematical” situations. When browsing social media, it is common to come across a mathematical puzzle like the one in Figure 1.1. We observe the image and retain the following data: three apples sum to 30; one apple plus eight bananas gives 18; and four bananas minus two coconuts gives 2. This is the data we use to identify patterns and formulate our theory, which is the following: one apple equals 10, one banana equals 1 and one coconut equals 1. With this theory, we make our prediction for the operation in the fourth line: one coconut plus one apple plus three bananas sums to 14. In this case, since we have no more information, we cannot verify whether our prediction is correct. As a matter of fact, this problem reduces to a system of three linear equations in three variables, which here has exactly one solution. This does not mean that this is the end: for example, if a new operation were added with more fruits, i.e. more variables, we would need to change our theory to include possible values for them.
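The reduction of the puzzle to a system of three equations can be checked mechanically. The following is only an illustrative sketch: the function name `solve_fruit_puzzle` and the use of exact fractions are assumptions of ours, not part of the puzzle. It solves the three observations by substitution and then computes the prediction for the fourth line.

```python
from fractions import Fraction

def solve_fruit_puzzle():
    """Solve the three observations of the puzzle by substitution (hypothetical helper)."""
    # Observation 1: apple + apple + apple = 30
    apple = Fraction(30, 3)
    # Observation 2: apple + 8 * banana = 18
    banana = (18 - apple) / 8
    # Observation 3: 4 * banana - 2 * coconut = 2
    coconut = (4 * banana - 2) / 2
    return apple, banana, coconut

apple, banana, coconut = solve_fruit_puzzle()

# Prediction for the fourth line: coconut + apple + 3 * banana
prediction = coconut + apple + 3 * banana
print(apple, banana, coconut, prediction)
```

The theory (the three values) is induced from the observations, and the last line is the deduction step: a prediction that further evidence could confirm or refute.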


Figure 1.1: Common mathematical puzzle seen several times over social media

Figure 1.2: Diagram representing the scientific discovery process

This apparently simple process has been followed by scientists to discover theories that explain what surrounds us, at both the microscopic and the macroscopic level. To do so, they rely on a very useful tool that allows them to write what they find in a clear and objective way: Mathematics. Mathematics is a universal language in which scientific theories are formalized in order to be understood and worked upon. In fact, Mathematics and mathematical propositions do not need facts to be tested or validated; pure reason suffices (see [22]). That property is the greatest advantage of Mathematics from a scientific point of view: since it does not need facts, scientists can work with it beyond the empirical data observed and thus make predictions that will once again fall within the realm of the observable. Figure 1.2 shows a scheme from [22] that illustrates this process. The goal of a scientist is then to find a theory that will not change through the course of this cyclical process, i.e. a theory that will not be refuted by any upcoming observable facts, only reinforced by them (such a theory is often called a Theory of Everything or Unified Field Theory). This is obviously an extremely difficult, maybe even impossible, task, since it is probably not possible to observe every fact in the universe and, even for the facts that we can observe, it is not obvious how some of them are related. This means that usually the best a scientist can do is describe conjectural theories that explain at least a part of the observable universe, in the hope that more universal theories will be formalized in the future. A good example of this (seen in detail in [22]) is Newton’s Laws of Physics: they are extremely useful to explain the physical motion of macroscopic bodies; moreover, from them we can deduce other more specific theories, like Kepler’s Laws of planetary motion, Galileo’s Laws of Motion or the Law of Tides. However, they cannot predict or explain the motion of light rays. This means that Newton’s Laws can be considered an unchangeable theory for a scientific discovery process that is concerned only with the motion of macroscopic bodies, but they are not a universally valid theory for Science.


Until recently, the scientific discovery process was more a question of philosophical reflection than of scientific action: scientists did it without thinking about it, philosophers thought about it without putting it into action. However, that changed a few years ago when some scientists started to study this process with a scientific predisposition, trying to formalize it through a clear set of rules. They realized that, if this is possible, then scientific discovery can be algorithmically structured and we would be able to achieve scientific advances and find theories much faster while obtaining much more complex results. If this formalization is possible, then we need to understand whether machines can learn these processes and use them to discover scientific laws themselves. However, this will only be possible if natural laws are computable (as defended by Kelly in [21] and by Szudzik in [38] with the computable universe hypothesis) and, beyond that, simple enough to be discovered by this kind of process. This would mean that these laws would have to be algorithmic themselves and, consequently, the expected behaviour of the world, if not its exact behaviour, would also have to be algorithmic (even if infeasibly computable). A brief discussion of this problem can be found in [6]. One example of such a formalization was given by Gold in [16], where, through the analysis of children learning a language, he presented, in his own words, a construction of a “precise model for the intuitive notion ‘able to speak a language’ in order to be able to investigate theoretically how it can be achieved artificially”. A more practical and more recent example can be seen in [17], where the learning of arithmetic is studied, formalized and then taught to a neural network.

In [25] we can observe the construction of considerably general systems that are capable of achieving significant scientific discoveries, for example the BACON programs. In that book, we can see that mathematical expressions that translate some very important and groundbreaking scientific laws were learned using these protocols. Examples are the ideal-gas law, which relates the pressure $P$ and the volume $V$ of an ideal gas with $n$ moles at temperature $T$ in Kelvin, $PV = nRT$, where $R$ is the ideal gas constant, the same for all ideal gases; the law of gravitation, which states that the gravitational force between two objects is directly proportional to the product of their masses and inversely proportional to the square of the distance between them, $F = G \frac{m_1 m_2}{d^2}$, where $G$ is the gravitational constant; and Kepler’s law, which states that the cube of the distance of a planet to the sun is directly proportional to the square of the period of the planet’s orbit, $D^3/P^2 = k$. This means that these laws, presented to us as core laws of the theories that allow us to perceive the workings of the universe, are actually algebraically simple enough to be learned by a program with simple heuristics. In fact, these laws are of the algebraic form $x^a y^b z^c \cdots = \mathrm{const} \in \mathbb{R}$, with $a, b, c, \ldots \in \mathbb{Z}$, or some sort of linear combination of such terms. Even laws with trigonometric functions can be learned by these procedures by using the trigonometric relations in a right triangle. For example, Snell’s law of optics, which relates the sines of the incidence and refraction angles $\theta_1, \theta_2$ with the indices of refraction $n_1, n_2$ of the two media in question, $n_1 \sin\theta_1 = n_2 \sin\theta_2$, was learned using the definition of the sine of an angle in a right triangle as the length of the opposite leg over the length of the hypotenuse.
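To make the algebraic form with integer exponents concrete, the sketch below searches by brute force for small integer exponents that make a product of powers constant over a data set. It is a simplified illustration and not the heuristics used by the BACON programs; the function name `find_power_laws`, its parameters and the synthetic data (generated so that the period equals the distance raised to 3/2) are assumptions made for this example.

```python
from itertools import product
import math

def find_power_laws(columns, max_exp=3, rel_tol=1e-9):
    """Return exponent assignments that make the product of powers constant over all rows.

    Assumes strictly positive measurements, so negative exponents are safe.
    """
    names = list(columns)
    rows = list(zip(*(columns[name] for name in names)))
    found = []
    for exps in product(range(-max_exp, max_exp + 1), repeat=len(names)):
        if not any(exps):
            continue  # skip the trivial all-zero exponent tuple
        values = [math.prod(v ** e for v, e in zip(row, exps)) for row in rows]
        if all(math.isclose(v, values[0], rel_tol=rel_tol) for v in values):
            found.append(dict(zip(names, exps)))
    return found

# Synthetic data (not real measurements): P = D ** 1.5 for D = 1, 4, 9.
data = {"D": [1, 4, 9], "P": [1, 8, 27]}
laws = find_power_laws(data)
print(laws)
```

On this data the search finds the invariant with exponents 3 for D and -2 for P (and its reciprocal), which has the same shape as Kepler’s law. The search space grows exponentially with the number of variables, so this only works for very small exponent ranges.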

In more recent years there have been considerable advances in this field, one of the most important being the development of the robotic scientists ADAM and EVE. The first was able to perform a scientific discovery about genetic encoding in the yeast Saccharomyces cerevisiae entirely on its own, including the formulation of a hypothesis and its subsequent verification, while the second performs experiments in chemical genetics and drug design, having already found a possible relation between triclosan, a common toothpaste ingredient, and the fight against malaria (see [3], [23] and [37]). Another idea that shows how far scientific discovery has advanced is the dream of an automated scientist able to make, all by itself, discoveries so advanced that they would be worthy of a Nobel Prize (see [24]).

In this dissertation, we set out to develop a scientist that can identify the mathematical expressions that explain the natural laws that surround us and return each such expression as a computer program written in Python. At first glance, we would be tempted to think that the ideal way to develop such a scientist is to design an algorithm that works over the entire class of recursive functions, taken as real functions of a real variable. However, we only need to consider recursive functions over the natural numbers, since real numbers are not directly measurable in nature and it is possible to transpose rational data to natural data, at least in computable models (see [38]). Moreover, we acknowledge the simplicity of natural laws (as seen above), which we use to our advantage by developing a scientist only for the learning of primitive recursive functions. We do this because we believe that the functions outside the class of primitive recursive functions are too complex to explain natural laws, and also because it is not possible to explain the whole class of recursive functions by a brute-force algorithm, while it is possible for the primitive recursive ones, as seen in [9] and explored further ahead in our work; this simplifies the construction of our scientist.

To construct said scientist, we need to learn as much as we can about Learning Theory (using notions of computer science, since we will be working on the basis of the computable universe hypothesis), so that we can understand the correct way to carry out this construction and the possible learning capabilities of the scientist.

Next, we need to study the primitive recursive functions to learn how to enumerate them. To do so, we define them through their descriptions and use the number of symbols in each description to build the enumeration. This is a very complex problem; addressing it completely would require an extensive work devoted solely to this subject (for example, the doctoral thesis by Rogério Reis on the enumeration of automata [33]). We therefore dwell on this problem only as far as needed to carry out the work we are proposing. The last step regarding the primitive recursive functions is to generate the code of a function from its description. Since the functions with which we are working are only the primitive recursive ones, we know that each of them can be encoded by a program whose loops are restricted to nested and/or sequential for-loops (see [30]).
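The restriction to for-loops can be illustrated with a small sketch. It is not the code generator presented later in this work: the helper `primitive_recursion`, the argument order of the step function and the names `add` and `mul` are assumptions made for this example. The recursion scheme f(x, 0) = base(x), f(x, y + 1) = step(x, y, f(x, y)) is implemented with a single bounded loop, and composing two such definitions yields nested loops.

```python
def zero(*args):
    """Constant zero function of any arity."""
    return 0

def successor(x):
    return x + 1

def primitive_recursion(base, step):
    """Build f with f(xs, 0) = base(xs) and f(xs, y + 1) = step(xs, y, f(xs, y))."""
    def f(*args):
        *xs, y = args
        acc = base(*xs)
        for i in range(y):  # bounded loop: exactly y iterations, fixed before it starts
            acc = step(*xs, i, acc)
        return acc
    return f

# add(x, 0) = x;  add(x, y + 1) = successor(add(x, y))
add = primitive_recursion(lambda x: x, lambda x, i, acc: successor(acc))

# mul(x, 0) = 0;  mul(x, y + 1) = add(x, mul(x, y))
mul = primitive_recursion(zero, lambda x, i, acc: add(x, acc))

print(add(3, 4), mul(3, 4))
```

Because the number of iterations of every loop is known before the loop starts, each such program always halts, which is what distinguishes this class from general recursive functions that need unbounded search.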

Moreover, we repeat the work done with the primitive recursive functions for a subset of this class: the elementary functions. We do so because we also believe that this subset of functions is enough to express the natural laws we are trying to identify.

Regarding the outline of this dissertation, Chapter 2 presents the theoretical knowledge about Learning Theory that we need to have in mind to construct the scientist. In Chapter 3 we study the primitive recursive functions and present a form of identification, two ways of listing this set and, for each of them, a search procedure that will be the basis for the construction of a scientist; lastly, we present a way of transforming a description of a primitive recursive function into code written in Python. Chapter 4 is reserved for the study of the elementary functions: a method to describe them, their enumeration, a search procedure for finding these functions and a way of transforming a description into a program written in Python. Lastly, in Chapter 5 we present and discuss the results achieved by testing our scientist, while in Chapter 6 we draw the conclusions of our work.


Chapter 2

Learning Theory

We can define Learning Theory as “the study of systems that map evidence into hypotheses” [32]. The main goal of Learning Theory is to find the circumstances under which these hypotheses stabilize to an accurate representation of the environment from which the evidence is drawn, in which case learning is said to be successful.

The following concepts and definitions come from Osherson et al. [32], on whom the paper [16] had a great influence. It is assumed that learning involves the following four concepts:

1. A learner, or scientist;

2. A subject to be learned;

3. An environment, in which the thing to be learned is exhibited to the learner;

4. The conjecture that occurs to the learner about the subject to be learned on the basis of the

environment.

A learning paradigm is a specification of these four concepts. This means that Learning Theory can also be defined as the study of learning paradigms. One of these is the identification of recursive functions by a scientist; more concretely, the problem of understanding which recursive functions can be identified by which scientists and under which conditions that identification is made. It is through this learning paradigm that we will try to identify the empirical laws. We do so under the computationalist hypothesis (see Kelly in [21] and Case in [6]), where we assume that both the empirical laws and the scientific methods are recursive relations. It is then reasonable to accept that a law written in standard fashion (i.e. as an algebraic expression) and a computer program are interchangeable, and so we can conclude that the identification of computable functions is a way to identify those empirical laws. Thus, by understanding how to identify recursive functions we also understand how empirical laws can be discovered. This problem will be tackled by examining the computational limits of what is learnable by a scientist and the strictness of the learning criteria of said scientist within this paradigm.
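The four concepts of this paradigm can be made concrete with a toy sketch. It is not one of the scientists constructed later in this work: the hypothesis list, the target function and the name `scientist` are hypothetical choices for this example. The learner receives growing prefixes of the evidence (pairs of input and output) and conjectures the index of the first listed hypothesis that agrees with everything seen so far.

```python
def scientist(hypotheses, prefix):
    """Return the index of the first hypothesis consistent with every pair in prefix."""
    for index, h in enumerate(hypotheses):
        if all(h(n) == value for n, value in prefix):
            return index
    return None  # no listed hypothesis explains the evidence

# A tiny, hand-picked hypothesis space (illustrative only).
hypotheses = [
    lambda n: 0,
    lambda n: n,
    lambda n: n + 1,
    lambda n: 2 * n,
    lambda n: n * n,
]

def target(n):
    """The subject to be learned; the scientist only sees its input/output pairs."""
    return n * n

# The environment: the pairs (n, target(n)) presented in increasing order of n.
text = [(n, target(n)) for n in range(5)]

# Conjectures after seeing 1, 2, ..., 5 pairs of the text.
conjectures = [scientist(hypotheses, text[:k]) for k in range(1, len(text) + 1)]
print(conjectures)
```

The sequence of conjectures changes while the evidence refutes earlier hypotheses and then stabilizes on the index of a correct one, which is the informal meaning of successful learning described above.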


2.1 Computability

Since our study concerns the identification of (computable) functions, we first need to recall some notions of computability theory, which can be found in [10], [11] and [35].

In general, we can encode any abstract objects into natural numbers through the concept of Gödelization.

Definition 2.1.1. Gödelization

Let W be a set. A computable¹ one-to-one total function $g : W \to \mathbb{N}$ is called a Gödelization if:

a) $g(W)$ is a decidable set in $\mathbb{N}$.

b) $g^{-1} : g(W) \to W$ is also a computable total function.

A Gödelization can be defined for sets, lists of numbers, finite graphs, etc., which means that all of these structures can be provided as input to a program P that receives only natural numbers as input. It is even possible to provide to a program P a natural number n that is first decoded into a number m, the code of another program P′, and a number k, such that P with input n returns the output of P′ for input k. It is thus possible to encode as natural numbers a wide variety of objects, including tuples of numbers, which can then be provided as input to a program that receives natural numbers. This means that we can use the unary functions as notation for the entire set of functions. Furthermore, the programs that compute these functions, for a given input, can either halt at some point or run forever. However, there is no program H that receives a number n, decodes it into a program P of code m and a number k, and outputs 1 if P halts with input k and 0 otherwise (undecidability of the halting problem). This means that we cannot, in general, know whether a program, for a given input, will run forever or just needs more time to terminate the computation, which makes it difficult to determine whether a program computes a partial or a total function, concepts we define below.

Definition 2.1.2. Partial Recursive Function

A partial recursive function ψ can be defined by its graph, i.e. the set of input/output pairs (n, ψ(n))
such that, if P is a program that computes ψ, then for all values of n on which P halts, it returns ψ(n).
For the values on which P does not halt, we say ψ is undefined. If e is the code of the program P in
question, then ψ will be denoted as φe and e will denote P.

Definition 2.1.3. Domain of a partial recursive function

Let ψ be a partial recursive function, computable by a program P. Then the set of numbers n such
that there is a pair (n, ψ(n)) in the graph of ψ is called the domain of ψ. In other words, the domain of ψ
is the set of numbers on which P halts.

Definition 2.1.4. (Total) Recursive function

A function ψ is recursive if it has as domain the set of natural numbers N, i.e. is total; in other words,

the graph (n,ψ(n)) that defines ψ has an element for each value n ∈ N. If P is a program that computes

ψ then for all values of n, P halts and it returns ψ(n). The set of all recursive functions is denoted by R.

¹ In this definition, we understand the concept of computable function in the Church-Turing sense.


Definition 2.1.5. Characteristic map and characteristic function

Let C be a subset of N. Then the characteristic map of C, χC ∶ N → N, is defined as

\[ \chi_C(x) = \begin{cases} 1, & x \in C \\ 0, & x \notin C \end{cases} \]

The characteristic function of C, χpC ∶ N → N, is the partial function defined as

\[ \chi^{p}_{C}(x) = \begin{cases} 1, & x \in C \\ \text{undefined}, & x \notin C \end{cases} \]

Definition 2.1.6. Recursively Enumerable Set and Recursive Set

A set S is said to be recursively enumerable if there is a program P that computes its characteristic

function. A set S is said to be recursive if there exists a program P that computes its characteristic map.

In either case, if e is the code for P , then we can denote S as We.

There are a few situations where it will be useful to consider not only the unary programs for unary

functions but also the programs for functions ψ that explicitly receive two input numbers n,m ∈ N and

return the value for ψ(n,m). For those situations, the following theorems are of special importance.

Theorem 2.1.1. s-1-1 theorem for binary functions

For any binary partial recursive function ψ, there is a computable total function g such that, for all
m, n ∈ N, ψ(m,n) = φg(m)(n). This means that for any arbitrary m, g(m) is a code of the unary
function n ↦ ψ(m,n).

Theorem 2.1.2. Kleene’s Theorem

For any binary partial recursive function ψ there is a number e ∈ N such that {e}(x) = ψ(e, x). In other
words, φe(x) = ψ(e, x).

2.2 Scientific methods

We will now begin defining important concepts of Learning Theory, present in [9], [11], [16] and [32].

Definition 2.2.1. Text for a function

A text T for a function ψ is a total function T ∶ N → N² such that for every a, b ∈ N,
(a, b) ∈ range(T) ⇔ ψ(a) = b.

The set of all the texts T for functions is denoted as 𝒯. Tn denotes the pair T(n), while T[n] denotes
the sequence of pairs T0 . . . Tn−1. A text allows repetitions and is sensitive to the order of its pairs, which
means that there are uncountably many texts for a function ψ. A text is thus a function whose
domain serves to give an order to the pairs contained in its range.

Definition 2.2.2. Text in canonical form

Let T be a text for a function ψ. T is said to be in the canonical form if T (i) = (i, ψ(i)) for any i ∈ N.
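The notions of text, canonical text and prefix T[n] can be made concrete with a small sketch. It assumes that a text is modelled as a Python callable from indices to pairs; the names `canonical_text`, `repeating_text` and `prefix` are illustrative only.

```python
from typing import Callable, List, Tuple

Pair = Tuple[int, int]
Text = Callable[[int], Pair]


def canonical_text(psi: Callable[[int], int]) -> Text:
    """Canonical text for psi: T(i) = (i, psi(i))."""
    return lambda i: (i, psi(i))


def repeating_text(psi: Callable[[int], int]) -> Text:
    """Another text for psi: every pair of the graph is listed twice in a row."""
    return lambda i: (i // 2, psi(i // 2))


def prefix(text: Text, n: int) -> List[Pair]:
    """T[n]: the finite sequence T_0 ... T_{n-1}."""
    return [text(i) for i in range(n)]
```

Both texts have the graph of ψ as their range, but only the first one is in canonical form.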


Definition 2.2.3. SEG = {T[n] ∶ T ∈ 𝒯, n ∈ N} is called the set of prefixes of recursive functions.

INIT ⊂ SEG is the subset of prefixes of texts in canonical form.

Let σ be an element of SEG. Then, content(σ) provides the set of pairs in σ. This sequence can be

seen as a partial function, denoted by σ and defined as

\[ \sigma(m) = \begin{cases} n, & \text{if } (m,n) \in \mathrm{content}(\sigma) \\ \text{undefined}, & \text{otherwise} \end{cases} \]

Since σ is a prefix of a text for a function, content(σ) cannot contain two pairs (m, n1) and (m, n2)
with n1 ≠ n2, because no text for a function can have such pairs in its range, and so the
partial function σ is well defined.

Definition 2.2.4. Scientist

A scientist for functions M is itself a function M ∶ SEG → N.

The essential feature of a scientist is that it turns finite information into a hypothesis that covers

infinite values.

Definition 2.2.5. Convergence of scientist

A scientist for functions M converges to i ∈ N on a text T for a function if there exists p ∈ N such that,
for all t > p, M(T[t]) = i.

Definition 2.2.6. Ex-identification of functions

A scientist for functions M Ex-identifies a function ψ on a text T if, when provided text T, it converges
to a conjecture i ∈ N such that φi = ψ.

A scientist for functions M Ex-identifies a function ψ if, for every text T for ψ, it converges
to a conjecture that is a code for ψ, i.e. for any T for ψ the conjecture i ∈ N returned, which can differ
depending on T, is such that φi = ψ.

A scientist for functions M Ex-identifies a set of functions Ψ if M Ex-identifies every function ψ ∈ Ψ.

The class of all the sets of recursive functions Ex-identifiable by a scientist is denoted Ex.

Remark: Ex comes from Explaining

Let us see that the class Ex is not empty. We define the set AEZ (Almost Everywhere Zero) as the
set of total recursive functions that take the value 0 for all but finitely many values and the set SD
(Self-Describing) as the set of recursive functions ψ such that ψ(0) is the code of a program P for ψ.
Both these sets are Ex-identifiable:

• For AEZ we build a scientist that, given a prefix σ of a text for a function ψ, builds an
ordered list µ of the pairs in σ with non-zero values and outputs the code of the function that
receives as input a natural x and executes the instruction If x ∈ dom(µ) Then µ(x) Else 0. Since a
function in AEZ is zero at all but finitely many points, for a sufficiently large prefix σ all the values
of ψ that are not in µ will be 0, and so the code output by the scientist will be a code for ψ. Thus
AEZ ∈ Ex.

• For SD we build a scientist that searches for the pair (0, ψ(0)) and outputs the value ψ(0), which by
definition is a code for ψ. For a sufficiently large prefix, that value will be in the prefix, and so the
scientist will, from some point on, return a code for ψ. Thus SD ∈ Ex.
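The two scientists described above can be sketched as follows. This is only an illustration under stated assumptions: a prefix is a Python list of pairs, the conjecture of the AEZ scientist is represented by the table µ itself (standing in for the code of the program "If x ∈ dom(µ) Then µ(x) Else 0"), and the SD scientist outputs 0 until the pair (0, ψ(0)) shows up. The function names are hypothetical.

```python
from typing import Dict, List, Tuple

Pair = Tuple[int, int]


def aez_scientist(sigma: List[Pair]) -> Tuple[Pair, ...]:
    """Conjecture for AEZ: the ordered table of non-zero values seen so far."""
    mu: Dict[int, int] = {x: y for x, y in sigma if y != 0}
    return tuple(sorted(mu.items()))


def run_aez_conjecture(code: Tuple[Pair, ...], x: int) -> int:
    """Interpret a conjecture of aez_scientist: mu(x) if x is in dom(mu), else 0."""
    return dict(code).get(x, 0)


def sd_scientist(sigma: List[Pair]) -> int:
    """Conjecture for SD: psi(0) once (0, psi(0)) has appeared, 0 before that."""
    for x, y in sigma:
        if x == 0:
            return y
    return 0
```

On any text for a function in AEZ the table stops growing once every non-zero point has been seen, which is the convergence required by Ex-identification.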

Definition 2.2.7. Total scientist

A scientist M is total on a recursive function ψ if it provides conjectures for any prefix of any text

regarding ψ given to the scientist.

M is total on a set of functions Ψ if it is total on every function ψ ∈ Ψ.

M is total if it is total on the whole set of recursive functions R.

Proposition 2.2.1. (see [32]) For each scientist M for functions, there exists another scientist N for

functions, algorithmically obtainable from M, such that N is total and if M identifies any recursive

function ψ then so does N .

Proposition 2.2.2. (see [32]) Let M be a method that Ex-identifies the recursive function ψ. If M
converges to a conjecture e on the canonical text for ψ, then there exists a scientist M′ that converges
to e on all texts for ψ.

The last two propositions are of special relevance because they mean that it is possible to go from any

scientist to one that identifies the same set of functions and is canonical and total, i.e. the achievements

of a scientist that does not need to receive texts in canonical order are the same as the ones who do,

which allows us to introduce concepts and results constructed only by presenting texts in canonical

order. Furthermore, being able to only use texts in canonical order means that, from now on, every

time we write about a text T for ψ we can simplify notation and write only ψ, since the T in question

is canonical and the order of the elements of a text in canonical order coincides with the order of the

values of function ψ.

We have been presenting definitions and results regarding scientists for natural-valued functions of a
natural variable. Experimental measurements, however, are rarely of this nature. Still, by rescaling and
encoding the values of physical magnitudes, we can define physical laws as relations between natural
numbers. In a world of experimental error, convergence can be addressed with a different definition:

Definition 2.2.8. We say that the scientist M identifies ψ ∈ R if there exist an e ∈ N and numbers
p, l ∈ N such that, for t ≥ p, M(ψ[t]) = e and, for all t ∈ N, \( |\overline{\varphi_e}(t) - \overline{\psi}(t)| \le 2^{-l} \),
where the line over the functions means the decoding of natural numbers into rational numbers.²

Even though our motivation is the identification of empirical laws, that are mainly not natural functions,

we will develop our work by assuming natural values for empirical observations, for simplification.

Another important result comes from the Nonunion Theorem, explored below.

Theorem 2.2.1. (see [4]) Nonunion Theorem

If a scientist for functions M Ex-identifies a set of functions E1 and another scientist for functions
M′ Ex-identifies a set of functions E2, then it is possible but not certain that there exists a scientist that
Ex-identifies the set E1 ∪ E2. In other words, the class Ex is not closed under union.

² In the standard context of learning theory, we take l = +∞, and we have M converging to e on ψ[t] and, for all t ∈ N, φe(t) = ψ(t).


Let us see an example of two classes of functions that are Ex-identifiable but whose union
is not. We already saw that both AEZ and SD are in Ex. To see that AEZ ∪ SD ∉ Ex,
we will show that if a scientist Ex-identifies AEZ then it cannot Ex-identify SD. Let M be an arbitrary
scientist that Ex-identifies AEZ and ψ a function of SD such that ψ(n) is either 0 or 1, for n > 0. For
any sequence of (canonical) observations σ of ψ, there always exists a strictly longer
extension τ of σ such that M conjectures differently on τ and σ. We know this happens because
otherwise it would not be possible for the scientist M to distinguish between σ ⋄ (n,0) and σ ⋄ (n,1)³,
although they are prefixes of texts for different recursive functions. To construct a text for this function ψ,
we follow the procedure presented in Algorithm 1. By Kleene’s Theorem we know that there is a value
p ∈ N such that Γ(p, x) = φp(x), with p = ψ(0) by construction of σ. We know as well that Γ(e, x) = ψ(x),
since σ is a partial subfunction of ψ, also by construction. If we provide the value p to Γ, then we have
that Γ(p, x) = φp(x) = ψ(x), and so ψ belongs to SD and M cannot Ex-identify lim σ = ψ. Since M
is an arbitrary scientist for AEZ, there is no scientist for AEZ that Ex-identifies SD. Thus
AEZ ∪ SD ∉ Ex.

Algorithm 1 Procedure to construct a text for ψ ∈ SD out of a scientist for AEZ

function Γ(e, x ∶ N) ∶ N
    σ ∶= (0, e)
    for i ∶= 1 to +∞ do
        τ0 = σ ⋄ (i, 0)
        τ1 = σ ⋄ (i, 1)
        if M(σ) ≠ M(τ0) then
            σ ∶= τ0
        else
            σ ∶= τ1
        end if
        if x ≤ i then
            return σ(x)
        end if
    end for
end function
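The procedure Γ of Algorithm 1 can be traced with a short sketch. It assumes that the scientist M is passed in as an ordinary Python function on prefixes (lists of pairs) and that conjectures can be compared with `!=`; the name `gamma` and the toy scientists used in the usage note are hypothetical. The fixed-point step given by Kleene's Theorem (choosing e = p with φp = Γ(p, ·)) is not modelled here.

```python
from typing import Callable, List, Tuple

Pair = Tuple[int, int]
Scientist = Callable[[List[Pair]], object]


def gamma(M: Scientist, e: int, x: int) -> int:
    """Value at x of the function built by Algorithm 1 from seed e against M."""
    sigma: List[Pair] = [(0, e)]
    i = 1
    while True:
        tau0 = sigma + [(i, 0)]
        tau1 = sigma + [(i, 1)]
        # Take the extension by 0 if M changes its conjecture on it, else the one by 1.
        sigma = tau0 if M(sigma) != M(tau0) else tau1
        if x <= i:
            return dict(sigma)[x]
        i += 1
```

For instance, against a scientist that conjectures the table of non-zero values seen so far, `M = lambda s: tuple(p for p in s if p[1] != 0)`, appending a 0 never changes the conjecture, so the construction always appends a 1 and `gamma(M, 5, x)` is 5 at x = 0 and 1 elsewhere.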

The following proposition is a corollary of the previous result.

Proposition 2.2.3. R ∉ Ex.

Therefore, there is no scientist that Ex-identifies all the recursive functions. However, if we weaken
the identification criterion we will be able to capture larger collections of functions, eventually arriving at
R. We will start by allowing the occurrence of anomalies in the identification of functions.

Definition 2.2.9. n-variant

A partial recursive function ξ is an n-variant of a function ψ ∈ R if it coincides with ψ at all but
at most n points. We write ψ =n ξ.

Definition 2.2.10. ⋆-variant

A partial recursive function ξ is a ⋆-variant of a function ψ ∈ R if it coincides with ψ in all but finitely
many points. We write ψ =⋆ ξ.

³ ⋄ is the symbol for the concatenation operation.


Definition 2.2.11. Exn-identification

A scientist M Exn-identifies a function ψ ∈ R if there exists an order p ∈ N such that, for every t ≥ p,
M(ψ[t]) = e and we have φe =n ψ. A set of functions is said to be Exn-identifiable if there exists a scientist
that Exn-identifies every function in that set.

Definition 2.2.12. Ex⋆-identification

A scientist M Ex⋆-identifies a function ψ ∈ R if there exists an order p ∈ N such that, for every t ≥ p,
M(ψ[t]) = e and we have φe =⋆ ψ. A set of functions is said to be Ex⋆-identifiable if there exists a scientist
that Ex⋆-identifies every function in that set.

Definition 2.2.13. Exn and Ex⋆ are the corresponding classes of sets that are Exn- or Ex⋆-identifiable.

Remark: Ex0 = Ex

We will now see that the following sets of functions are in the previously defined classes Exn and

Ex⋆:

Definition 2.2.14. ASDn, for n ∈ N, is the set of all ψ ∈ R such that φψ(0) =n ψ, i.e. ψ(0) is an index of
an n-variant of ψ. Similarly, ASD⋆ is the set of all ψ ∈ R such that φψ(0) =⋆ ψ, i.e. ψ(0) is an index
of a ⋆-variant of ψ.

It is easily observed that for any n ∈ N, ASDn ∈ Exn: we construct a scientist that outputs 0 until

it finds the value ψ(0) at which point he outputs it from that moment on (in fact, since the scientist is

defined for texts in canonical order, this value will be the first in the text provided). Using the same

reasoning, it follows that ASD⋆ ∈ Ex⋆.

However, it is possible to show that ASDn+1 is not in Exn. Let us suppose that ASDn+1 is, in fact,
Exn-identifiable. Then there is a scientist that Exn-identifies all the functions in ASDn+1. Let M be an
arbitrary scientist in that condition. The proof is made by providing a function in ASDn+1 such that it
is possible to extend a prefix σ for that function by concatenating suitable segments τ ∈ SEG that
make the scientist change his mind every time we repeat this process, i.e., M(σ) ≠ M(σ ⋄ τ). Then, by
Kleene’s Theorem, we have that the scientist that supposedly Exn-identifies ASDn+1 does not converge
when presented the text lim σ, which is a text for a function in ASDn+1. If it is not possible
to extend the prefix in a way such that the scientist keeps changing his mind, then the construction is
made in a way that prevents M from distinguishing between n + 1 different functions in ASDn+1. This
proof can be seen in detail in [7, 8] and [10].

This means that ASDn+1 ∈ (Exn+1 ∖ Exn). It also allows us to infer that ASD⋆ ∈ (Ex⋆ ∖ ⋃n∈N Exn),
because if not there would be an n ∈ N such that ASD⋆ ∈ Exn. By definition, we know that for any k ∈ N,
ASDk ⊆ ASD⋆, and so we would have that ASDn+1 ∈ Exn, which we have already seen cannot happen. We
can then make the following statement regarding the hierarchy of the classes Exn:

Proposition 2.2.4. (Case and Smith [7, 8]) Ex ⊂ Ex1 ⊂ Ex2 ⊂ ⋅ ⋅ ⋅ ⊂ Exn ⊂ Exn+1 ⊂ ⋅ ⋅ ⋅ ⊂ Ex⋆.

This means that the cognitive power of a scientist increases with the number of errors it is
allowed to make, as long as that number is finite. Since, as we will see, R is not in Ex⋆, we need
to relax the identification criterion further by not demanding that the canonical scientist converge to
a single conjecture from a certain order on, but allowing it to change that conjecture, provided that the
scientist’s outputs are always appropriate ones. We will later see what appropriate means in this case.
This form of identification is called Bc-identification:

Definition 2.2.15. Bc-identification

A scientist for functions M Bc-identifies a recursive function ψ ∈ R if there exists an order p ∈ N such
that, for any t ≥ p, we have that φM(ψ[t]) = ψ.

A scientist for functions M Bc-identifies a set of functions Ψ if M Bc-identifies every function ψ ∈ Ψ.

The class of all the sets of functions that are Bc-identifiable is denoted by Bc.

Remark: Bc comes from Behaviourally Correct

The next step will be to show that the hierarchy of identification does not stop at Ex⋆ and continues
with the Bc class, i.e. every function that can be identified syntactically with finitely many errors can be
identified semantically without any error. To show that, let S ∈ Ex⋆, ψ ∈ S, M a generic scientist that
witnesses the inclusion of S in Ex⋆ and σ ∈ SEG a prefix for ψ. The idea is to build a scientist M′ that,
on input σ, simulates M to obtain a code e = M(σ) and constructs an ordered lookup list µ of the
distinct pairs (i, σ(i)) in σ. With these elements, it outputs the code
of a function that uses said list µ in the following way: given input x ∈ N, it checks if x is the first element
of a pair in µ. If it is, then it outputs the second element of that pair; if it is not, then it outputs the result
of the program e applied to x, i.e. {e}(x), where e is the code returned by scientist M with
input σ. The code of this function returned by M′ is not necessarily always the same, because it depends on
the constructed list µ, which depends on σ. Thus, as the scientist reads new information in σ, the code
returned by M′ will necessarily change, but it is always a code for ψ from some order on, and so M′
Bc-identifies ψ and consequently S ∈ Bc. So, we have the following proposition:

Proposition 2.2.5. Ex⋆ ⊆ Bc.
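The patching construction behind this proposition can be sketched in a few lines. The sketch assumes that a code e is represented directly by a Python callable standing in for φe, and that M returns such a callable; `patched_scientist` is a hypothetical name. If the conjecture of M is undefined at a point outside µ, the patched function is undefined there as well, which is why the argument needs the prefix to be long enough to cover all the anomalies.

```python
from typing import Callable, Dict, List, Tuple

Pair = Tuple[int, int]
Program = Callable[[int], int]                 # stands in for a code e and its function
Scientist = Callable[[List[Pair]], Program]


def patched_scientist(M: Scientist) -> Scientist:
    """Build M' from M: overwrite M's conjecture with the values already observed."""
    def M_prime(sigma: List[Pair]) -> Program:
        e = M(sigma)                           # conjecture of the simulated scientist
        mu: Dict[int, int] = dict(sigma)       # lookup list of the pairs in sigma

        def conjecture(x: int) -> int:
            return mu[x] if x in mu else e(x)

        return conjecture
    return M_prime
```

Each longer prefix yields a different patched program, but once the prefix covers every point where φe and ψ disagree, all of these programs compute ψ.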

The next step is to show that Bc ∖ Ex⋆ ≠ ∅. To understand the separation proof of
the classes Ex⋆ and Bc we need to introduce the concept of operator. Let F be the class of all partial

functions such that f ∈ F is of the type f ∶ N → N. Then an operator is a total function Φ ∶ F → F . From

this point on, σ ∈ F represents a finite function (i.e. a function with finite domain) and let σ be the natural

number encoding of the function σ. Then we say that an operator Φ is recursive if there is a binary partial

recursive function δ such that for every function ψ ∈ F , for all x, y ∈ N we have Φ(ψ)(x) = y if and only if

there is a finite function σ such that σ is a subfunction of ψ (i.e. the graph of σ is included in the graph of

ψ) and δ(σ, x) = y. With these in mind, we can state the following proposition:

Proposition 2.2.6. If Φ ∶ R → R is a recursive operator, then there is a recursive monotone increasing

function h ∶ N→ N such that for all n,x ∈ N, we have φh(n)(x) = Φ(h)((n,x)).

For the separation proof we will also need to define the following set of functions:

Definition 2.2.16. S is the set of all ψ ∈ R such that for all but finitely many i ∈ N, φψ(i) = ψ, i.e. for all

but finitely many i ∈ N, ψ(i) is an index of ψ.


It is easy to see that S ∈ Bc. We just need to consider the scientist that outputs the last value that
it has seen so far while reading the input. By definition, from a certain point on that value will be a code
of the function of which the input is a prefix, and thus any function in S can be Bc-identified by this scientist.

To show that the inclusion between the classes is strict, let us consider the recursive
operator defined in Algorithm 2. Let M be a scientist that witnesses the Ex⋆-identification of S. Also, let
hk denote h(k). We will use the operator in question to show that there are functions in S that cannot
be Ex⋆-identified by M.

Algorithm 2 Recursive operator Φ used to prove the separation between Ex⋆ and Bc

function Φ(h ∶ N → N; (k, x) ∶ N) ∶ N
var σ ∶ N → SEG; m, y, s ∶ N
    y ∶= 0
    σ0 ∶= (0, h0);
    if (k, x) = 0 then return h0;
    end if
    for m ∶= 0 to +∞ do
        σ2m+1 ∶= σ0;
        σ2m+2 ∶= σ0;
        while M(σ2m+1) = M(σ0) and M(σ2m+2) = M(σ0) do
            y ∶= y + 1;
            σ2m+1 ∶= σ2m+1 ⋄ (y, h2m+1);
            σ2m+2 ∶= σ2m+2 ⋄ (y, h2m+2);
            if k ∈ {2m + 1, 2m + 2} and (k, x) = y then return hk
            end if
        end while
        if M(σ2m+1) ≠ M(σ0) then σ0 ∶= σ2m+1
        else σ0 ∶= σ2m+2;
        end if
        for s ∶= 1 to 2m do
            σs ∶= σs ⋄ (∣σs∣, σ0(∣σs∣)) ⋄ ⋯ ⋄ (y, σ0(y));
        end for
        if k ≤ 2m and (k, x) ≤ y then return σk((k, x))
        end if
    end for
end function

Let us consider the function h whose existence is stated by Proposition 2.2.6. By applying the
algorithm to this function h we can draw some conclusions.

First, all the sequences σ0, σ1, . . . , σ2m, σ2m+1, σ2m+2 are prefixes of potential graphs of total functions.

This is true due to the fact that, if the while loop halts for every m, then the domain of each one of the

functions σk,1 ≤ k ≤ 2m is updated in the internal for-loop in order to follow the values of σ0; in the limit,

these functions would be total. Then, we have to consider two cases:

1. The while guard fails to be true once for every m. This means that, for every k ∈ N, lim σk is a total

function. Moreover, and since φhk(x) = Φ(h)((k, x)) for every k ∈ N, hk is an index of h in all but

finitely many points. This happens because for i, j ∈ N such that i ≠ j either φhi and φhj coincide

in all points or differ only in a finite number of points. The first case happens when σ0 follows some

σk, for some k even or odd, until some order l, and then σk follows σ0; thus, σk and σ0 coincide in

every point. The second case happens because when σ0 follows a σk for a certain k, let us say


an odd k without loss of generality, until an order l, it does not follow σk+1, and so both σ0 and σk

differ from σk+1 up to order l; however, from order l onwards both σk and σk+1 follow σ0, and so the

difference between σ0 and σk+1 and σk and σk+1 will be happening only in finitely many points. In

conclusion, we have that φhk ∈ S for every k ∈ N. This means that the scientist M should be able
to converge to a single code on φh0(x) for sufficiently large x, which would mean that the scientist
should only change its mind finitely many times. However, this does not happen, since the guard
of the while loop keeps failing and the prefixes are updated in the following if clauses. In
summary, M fails to Ex⋆-identify a function in S.

2. The while loop does not terminate for a certain value m. Then both lim σ2m+1 and lim σ2m+2

are total functions. According to Proposition 2.2.6, we have that φh(2m+1)(x) = h(2m + 1) and

φh(2m+2)(x) = h(2m + 2), for every x ≥ x0 and for some order x0, and so by definition we have

that φh(2m+1), φh(2m+2) ∈ S. However, they will not be distinguished by the scientist M, since

M(σ2m+1) = M(σ0) = M(σ2m+2). This equality can be achieved for a value of m as big as we

want it to be, and thus M is a scientist that does not distinguish between two functions of S.

We then conclude that M cannot Ex⋆-identify the set S, and so S ∉ Ex⋆ and consequently R ∉ Ex⋆.

This means that we still haven’t reached an identification class that contains R, which obliges us to

develop the Bc hierarchy even further to reach R. This can be achieved by joining the two ways the

identification criterion is weakened:

Definition 2.2.17. Bcn-identification

A scientist M Bcn-identifies a recursive function ψ ∈ R if there exists an order p ∈ N such that, for any
t ≥ p, we have that φM(ψ[t]) =n ψ, i.e. from a certain order the scientist outputs a code for an n-variant
of ψ, but not necessarily the same one.

Definition 2.2.18. Bc⋆-identification

A scientist M Bc⋆-identifies a recursive function ψ ∈ R if there exists an order p ∈ N such that, for any
t ≥ p, we have that φM(ψ[t]) =⋆ ψ, i.e. from a certain order the scientist outputs a code for a ⋆-variant of
ψ, but not necessarily the same one.

Definition 2.2.19. Bc,Bcn and Bc⋆ are the classes of sets that are Bc-, Bcn- or Bc⋆-identifiable, re-

spectively.

To show that there exists a hierarchy between these classes of sets just like in the Exn classes, we

will first define the following sets of functions:

Definition 2.2.20. The set Sn with n ∈ N is the set of functions ψ ∈ R such that for all but finitely many
i ∈ N, φψ(i) =n ψ, i.e. for all but finitely many i ∈ N, ψ(i) is the code of an n-variant of ψ.

The set S⋆ is the set of functions ψ ∈ R such that for all but finitely many i ∈ N, φψ(i) =⋆ ψ,
i.e. for all but finitely many i ∈ N, ψ(i) is the code of a ⋆-variant of ψ.

Just as in the observation that S ∈ Bc, the same reasoning can be applied to show that Sn ∈ Bcn; we
just need to consider the scientist that outputs the last value read in the input. To show that the set
Sn+1 is not Bcn-identifiable, we will make use of the operator defined in Algorithm 3 to show that there
are functions in Sn+1 that a (total) scientist M that witnesses the Bcn-identification of functions cannot
identify. Let Ln+3(y) = {(q, ℓ0, . . . , ℓn, z) ∈ Nn+3 ∶ y < q < ℓ0 < ⋯ < ℓn < z} be a set of ordered
(n + 3)-tuples of positive integers. The value of q refers to a step extension of the domain of the functions
in construction; the values ℓ0, ..., ℓn are tentative points of convergence of some function of code M(σj),
for σj ∈ SEG; finally the value z is the number of steps of computation of the program of code M(σj)
allowed in the current attempt at convergence on the inputs ℓ0, ..., ℓn.

Algorithm 3 Recursive operator Φ used to prove the separation between Bcn and Bcn+1

function Φ(h ∶ N → N; (k, x) ∶ N) ∶ N
var σ ∶ N → SEG; i, j, q, y, y′, z, s, ℓ0, . . . , ℓn ∶ N;
    y ∶= 0;
    σ0 ∶= ε;
    for[1] j ∶= 0 to +∞ do
        for[2] (q, ℓ0, . . . , ℓn, z) ∈ Ln+3(y) in lexicographical order do
            σj ∶= σ0 ⋄ (y + 1, hj) ⋄ ⋯ ⋄ (y + q, hj);
            if[1] j = k and y + 1 ≤ (k, x) ≤ y + q then return hj;
                if[2] {M(σj)}(ℓ0) halts within z steps and ... and {M(σj)}(ℓn) halts within z steps then exit for[2]
                end if[2]
            end if[1]
        end for[2]
        y′ ∶= max{y + q, ℓn};
        σj ∶= σj ⋄ (y + q + 1, hj) ⋄ ⋯ ⋄ (y′, hj);
        for[3] i ∶ y < i ≤ y′ do
            if[3] i ∉ {ℓ0, . . . , ℓn} then σ0(i) ∶= σj(i)
            else if φM(σj)(i) ≠ h0 then σ0(i) ∶= h0
            else σ0(i) ∶= σj(i);
            end if[3]
        end for[3]
        for[4] s ∶= 1 to j − 1 do
            σs ∶= σs ⋄ (y + 1, σ0(y + 1)) ⋄ ⋯ ⋄ (y′, σ0(y′));
        end for[4]
        if[4] (k, x) ≤ y′ and k < j then return σk((k, x));
        end if[4]
        y ∶= y′;
    end for[1]
end function

By providing as input to the operator the function h whose existence is guaranteed by Proposition

2.2.6, the recursive operator has some characteristics that allow us to conclude some results. In the

first place, σ0, σ1, ..., σj, ... ∈ SEG are all prefixes of graphs of functions. Whenever the for[2] loop is
interrupted, each prefix σk (from k = 0 to k = j) is extended in the external for[1] loop in the following way:
the program of code M(σj) is executed for z steps on inputs ℓ0, ..., ℓn inside the for[2] loop; eventually, for
some tuple (q, ℓ0, . . . , ℓn, z), the search is successful and the loop is interrupted. After all the successive
iterations of the for[2] loop are computed, the domain of the function σ0 is extended in the for[3] loop,
from {0, . . . , y} to {0, . . . , max{y + q, ℓn}}. The for[4] loop makes all the functions σk defined so far (for
k = 1 to k = j − 1) follow the values of σ0. Note that, whenever the for[2] loop is non-terminating at
the final step j, σ0, σ1, ..., σj−1 are graphs of functions with finite domain, but lim σj will always be a text
for a total function. With this in mind we have two cases to consider:


1. If there is a value j ∈ N for which the for[2] loop does not halt, the function
φhk = h0 h1 . . . hj−1 hj hj hj ⋅ ⋅ ⋅ = lim σj will be a total function and φhk ∈ S ⊂ Sn+1. Moreover, the
scientist M is not able to Bcn-identify φhk, since the conjecture it provides fails to converge on at
least n + 1 input values. Thus the scientist is not able to Bcn-identify a function in Sn+1.

2. If the for[2] loop in the algorithm terminates for all j ∈ N, then for every j ∈ N,
lim σj is a total function. Furthermore, lim σ0, which is constructed over the successive for[3]
loops computed, is a text for the function Φ(h)((k, x)). In this case, all the values of the total
increasing function h described in Proposition 2.2.6 are codes of (n + 1)-variants of h. This means
that for some order x0, for every x ≥ x0, M(lim σ0[x]) provides codes hj of the function lim σ0
such that φhj = lim σj differs from h in n + 1 values, and thus the scientist M cannot Bcn-identify
lim σ0 ∈ Sn+1, identifying instead lim σj.

We thus conclude that for any n ∈ N, Sn+1 ∉ Bcn. It follows that S⋆ ∈ (Bc⋆ ∖ ⋃n∈N Bcn): if S⋆ ∈ ⋃n∈N Bcn
then there would exist a value of n such that S⋆ ∈ Bcn and in particular we would have Sn ∈ Bcn and
Sn+1 ∈ Bcn, which is not possible. This allows us to conclude that R ∉ Bcn for all n ∈ N. We can also
make the following statement:

Proposition 2.2.7. (Case and Smith [7, 8]) Bc ⊂ Bc1 ⊂ Bc2 ⊂ ⋅ ⋅ ⋅ ⊂ Bcn ⊂ Bcn+1 ⊂ ⋅ ⋅ ⋅ ⊂ Bc⋆.

In the same way as in Proposition 2.2.4, we see that as we increase the number of errors allowed
(keeping them finite) we enhance the learning capability of the scientists. The question
remaining to be answered is whether the hierarchy ends at Bc⋆ and R ∈ Bc⋆. Let us consider a binary function f
that receives a prefix σ ∈ SEG for a function ψ and a value x ∈ N and outputs a conjecture for the value
of ψ(x). For that function, let {m}(i)≤x denote the output resulting from applying the program with code
m to input i when it halts within x steps of computation; if it doesn’t halt within x steps, then it outputs ⊥ (undefined).

Algorithm 4 Function that for a given prefix σ for a function ψ and a value x ∈ N outputs the value that
the scientist thinks the function ψ has when applied to x.

function f(σ ∶ SEG; x ∶ N) ∶ N
    for m ∶= 0 to ∣σ∣ do
        τ = ∅
        for i ∶= 0 to ∣σ∣ − 1 do
            τ ∶= τ ⋄ (i, {m}(i)≤x)
        end for
        if τ = σ then return {m}(x)
        end if
    end for
    return 0
end function
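The search performed by Algorithm 4 can be imitated on a toy numbering of programs. The sketch below rests on several assumptions that are not in the text: only three programs are enumerated in the hypothetical table `PROGRAMS`, each one is given as a pair (function, number of steps it needs on input i) so that step-bounded execution can be simulated without a real interpreter, and the prefix σ is in canonical form.

```python
from typing import Callable, List, Optional, Tuple

Pair = Tuple[int, Optional[int]]
Program = Tuple[Callable[[int], int], Callable[[int], int]]   # (value, steps needed)

# Toy acceptable numbering (assumption): code m is the index in this list.
PROGRAMS: List[Program] = [
    (lambda i: 0, lambda i: 1),            # code 0: constant zero
    (lambda i: i + 1, lambda i: 1),        # code 1: successor
    (lambda i: i * i, lambda i: i + 1),    # code 2: square, needs i + 1 steps
]


def run_bounded(m: int, i: int, steps: int) -> Optional[int]:
    """Output of program m on input i if it halts within `steps` steps, else None."""
    if m >= len(PROGRAMS):
        return None
    value, cost = PROGRAMS[m]
    return value(i) if cost(i) <= steps else None


def f(sigma: List[Pair], x: int) -> int:
    """Guess psi(x) from the canonical prefix sigma, as in Algorithm 4."""
    for m in range(len(sigma) + 1):
        tau = [(i, run_bounded(m, i, x)) for i in range(len(sigma))]
        if tau == sigma:
            return PROGRAMS[m][0](x)       # first program consistent with sigma
    return 0
```

With σ = [(0, 0), (1, 1), (2, 4)] the call `f(sigma, 10)` finds code 2 and returns 100, whereas `f(sigma, 1)` gives the default 0 because the step budget x = 1 is too small for code 2 to reproduce the prefix. This is the reason why the guess is only guaranteed to be correct for sufficiently large x.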

For a big enough value of ∣σ∣, it is certain that a code for the wanted ψ with prefix σ is between
0 and ∣σ∣. However, convergence is only guaranteed if we have a large enough value for x, because
otherwise we risk having situations in which {m}(i)≤x does not halt. So, for large values of x and
∣σ∣, generally x >> ∣σ∣, it is possible to find a value m such that {m}(i)≤x converges in every entry i
between 0 and ∣σ∣ − 1. This means that there exists an order p such that for x >> p, f(σ,x) converges to
ψ(x). By applying the s-1-1 Theorem, we then have that there is a computable function s such that
f(σ,x) = φs(σ)(x) = ψ(x) for all x greater than p. This means that φs(σ) =⋆ ψ, i.e. s(σ) is a computable
code of a ⋆-variant of ψ. This code is not unique and may vary depending on the value of ∣σ∣: as ∣σ∣
increases, the value for m such that the program m converges for all values of i in x steps may change,
and so does the code s(σ). By constructing a scientist M that receives σ and outputs the value s(σ),
we have a scientist that Bc⋆-identifies ψ. Since this reasoning is valid for all ψ ∈ R, all recursive
functions can be Bc⋆-identified.

Proposition 2.2.8. R ∈ Bc⋆.

We have then reached the full power of identification by scientists. This means that the entire set of
recursive functions R can only be identified by permitting a scientist to change its conjecture an unlimited
number of times and by allowing each conjecture to have finitely many errors.

In the next chapter, we will focus on a subclass of R, the primitive recursive functions, that we will

see is easier to identify and upon which we will develop a scientist.


Chapter 3

The Search Procedure

Now that we know that every recursive function can be identified (at least semantically and allowing
finitely many errors in its identification), we return to the study of empirical laws and ask
how far we need to go to identify the expressions that represent these laws.

3.1 Primitive Recursive Functions

By analyzing the format of the empirical laws known to this date, we observe that it is extremely difficult
to conceive a natural law that cannot be represented by a primitive recursive function, a subclass of
the class of functions R. The examples given of functions that are recursive but
not primitive recursive, like the Ackermann function (see [12]), are extremely complex to define and,
in practice, have never been a subject of study in the natural sciences, only in the theoretical
mathematical field of computability. So, if we want to study the possibility of natural laws being
learned by a computational scientist, we only need to focus our attention on this subclass of functions.

We now present a formal definition for the class of primitive recursive functions.

Definition 3.1.1. Primitive Recursive Functions

The primitive recursive functions are those inductively defined by the following rules:

1. The 0−ary constant function 0 is primitive recursive.

2. The 1-ary successor function S, defined by the expression S(x) = x + 1, is primitive recursive.

3. For any n ∈ N and for i ∈ N such that 1 ≤ i ≤ n, we have that the function Pn,i defined by the

expression Pn,i(x1, . . . , xn) = xi is primitive recursive.

4. Given a k−ary primitive recursive function f and k many m−ary primitive recursive functions

g1, . . . , gk, the function h resulting from the composition of these functions, defined by the ex-

pression h(x1, . . . , xm) = f(g1(x1, . . . , xm), . . . , gk(x1, . . . , xm)), is primitive recursive.

5. For a k-ary primitive recursive function f and a (k+2)-ary primitive recursive function g, the (k+1)-ary function h defined as

\[
h(x_1, \dots, x_k, y) =
\begin{cases}
f(x_1, \dots, x_k), & y = 0 \\
g(x_1, \dots, x_k, y - 1, h(x_1, \dots, x_k, y - 1)), & \text{otherwise}
\end{cases}
\]

is primitive recursive.

In fact, if we add a sixth rule to the ones in Definition 3.1.1 we can obtain the set R of the recursive

functions. That rule is the following:

6. Let f be a (k+1)-ary recursive function. Then the k-ary function g defined by the expression g(x1, . . . , xk) = µy f(x1, . . . , xk, y) = z, where f(x1, . . . , xk, z) = 0 and f(x1, . . . , xk, i) ≠ 0 for all i < z, is also recursive.
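As an illustration of rules 4 to 6, the following Python sketch models composition, primitive recursion and minimization as higher-order functions. The names compose, prim_rec, minimize and proj are hypothetical helpers chosen for this sketch, not part of the original text. Note that only minimize needs an unbounded while loop, and that it does not terminate when no zero exists.

```python
def compose(f, *gs):
    """Rule 4: h(x1..xm) = f(g1(x1..xm), ..., gk(x1..xm))."""
    return lambda *xs: f(*(g(*xs) for g in gs))


def prim_rec(f, g):
    """Rule 5: h(xs, 0) = f(xs) and h(xs, y) = g(xs, y-1, h(xs, y-1))."""
    def h(*args):
        *xs, y = args
        acc = f(*xs)
        for k in range(y):          # bounded loop: exactly y iterations
            acc = g(*xs, k, acc)
        return acc
    return h


def minimize(f):
    """Rule 6: least y with f(xs, y) = 0 (loops forever if there is none)."""
    def g(*xs):
        y = 0
        while f(*xs, y) != 0:       # unbounded search
            y += 1
        return y
    return g


def proj(n, i):
    """Rule 3: the projection P_{n,i} (n is kept only for readability)."""
    return lambda *xs: xs[i - 1]


def succ(x):
    """Rule 2: the successor function."""
    return x + 1


# Addition built from the basic functions with rules 4 and 5.
add = prim_rec(proj(1, 1), compose(succ, proj(3, 3)))

# Integer square root via rule 6: least y such that (y + 1)^2 > x.
isqrt = minimize(lambda x, y: 0 if (y + 1) ** 2 > x else 1)
```

For instance, add(3, 4) unfolds four applications of the successor to 3, while isqrt(10) tries y = 0, 1, 2, 3 until the test function first returns 0.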

There is another way we can identify the class of primitive recursive functions. To do so we first need

to understand how a program P in language X (whatever that language is) is developed. It is done by

encoding a sequence of instructions that can be simple assignments (for example x ∶= 0, x ∶= y, x ∶= y+1),

conditionals (if guard then ... else ...), for-loops (for i = 1, . . . , y do ... where i never resets) or

while cycles (while guard do ...). We call a program a loop-program if it can be built using a sequence of assignments, conditionals and for-loops. With this in mind, we present the following statement:

Theorem 3.1.1. (see [30] and [18]) The primitive recursive functions are exactly those computed by

loop-programs, i.e. the programs that can be written without while cycles.
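To make the notion of loop-program concrete, here is a minimal Python sketch (the function names mult and natminus are chosen for this illustration). Both programs use only assignments of the forms x := 0, x := y, x := y + 1 and for-loops whose number of iterations is fixed when the loop starts; no while cycle is needed.

```python
def mult(x, y):
    """x * y with two nested for-loops and unit increments only."""
    result = 0
    for _ in range(y):            # for i = 1, ..., y do
        for _ in range(x):        # for j = 1, ..., x do
            result = result + 1
    return result


def natminus(x, y):
    """Truncated subtraction: apply the predecessor y times."""
    r = x
    for _ in range(y):
        prev = 0
        cur = 0
        for _ in range(r):        # after the loop, prev = r - 1 (or 0 if r = 0)
            prev = cur
            cur = cur + 1
        r = prev
    return r
```

Each program nests at most two for-loops, so in the hierarchy discussed next both would sit at a low level.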

The loop-programs built to compute a primitive recursive function can use sequenced and/or nested for-loops. In fact, the number of nested for-loops is one way to measure the structural complexity of a loop-program: by recursively defining the classes Ln such that L0 is the class of loop-free straight-line programs and Ln+1 is the class of loop-programs in which every for-loop is of the form for i = 1, . . . , y do P, where P ∈ Lm with m ≤ n, we obtain a hierarchy {Ln ∶ n ∈ N} of loop-programs. In [18] we can see that this hierarchy allows us to understand the power of nested for-loops; in fact, the class L2, i.e. the class of functions computed with at most two nested for-loops, corresponds to the so-called class of elementary functions. These functions can be described as the ones obtained by iteration of the operations of ordinary arithmetic, which means that, although simple in terms of structural complexity (since at most two nested for-loops are needed to compute them), they are a very important subset of the primitive recursive functions. We can define this class as follows:

Definition 3.1.2. (see [12]) Elementary Functions

The set E of elementary functions is the smallest class such that:

1. the functions x + 1, Pn,i (1 ≤ i ≤ n), x .− y ¹, x + y and xy are in E;

2. E is closed under composition;

3. E is closed under the operations of forming bounded sums and bounded products (i.e. if f(x, y) is in E then so are the functions ∑z<y f(x, z) and ∏z<y f(x, z)).

¹ The operator .− corresponds to the subtraction for the natural numbers, i.e. we have that x .− y = max{x − y, 0}.
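The closure operations of rule 3 can be sketched in Python as follows. The names bounded_sum, bounded_product and monus are hypothetical helpers for this illustration; the point is that each operation is a single for-loop bounded by the argument y.

```python
def monus(x, y):
    """Truncated subtraction on the naturals: max{x - y, 0}."""
    return max(x - y, 0)


def bounded_sum(f):
    """Return g with g(x, y) = sum of f(x, z) for z < y."""
    def g(x, y):
        total = 0
        for z in range(y):
            total += f(x, z)
        return total
    return g


def bounded_product(f):
    """Return g with g(x, y) = product of f(x, z) for z < y."""
    def g(x, y):
        result = 1
        for z in range(y):
            result *= f(x, z)
        return result
    return g


# Two familiar functions obtained from the basic ones:
power = bounded_product(lambda x, z: x)          # x^y
factorial = bounded_product(lambda x, z: z + 1)  # y! (the argument x is ignored)
```

With these definitions, power(2, 10) multiplies ten copies of 2 and factorial(0, 5) multiplies 1 · 2 · 3 · 4 · 5.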


3.2 Notation for identification

Now that we saw how to define a primitive recursive function, some notation will be introduced in order

to facilitate the identification of these functions. We will introduce the concept of description that can be

used to denote any recursive function.

Definition 3.2.1. Description

A description of a recursive function is an expression inductively defined by the following rules:

1. The symbol Z() is a 0-ary description that describes the constant 0.

2. The symbol S() is a 1-ary description that describes the successor, i.e., the 1-ary function with the

expression S(x) = x + 1.

3. The symbol P(n,i), for any n and i such that 1 ≤ i ≤ n is an n-ary description that describes the

i-th projection, i.e., the n-ary function defined by the expression Pn,i(x1, . . . , xn) = xi.

4. If G is a k-ary description, with k ≥ 0, that describes the function g and if H1, . . . , Hk are n-ary descriptions that describe the functions h1, . . . , hk respectively, with n ≥ 0, then C(G,[H1, ..., Hk]) is an n-ary description that describes the n-ary function f defined by the expression f(x1, . . . , xn) = g(h1(x1, . . . , xn), . . . , hk(x1, . . . , xn)). We say that f is obtained from g and h1, . . . , hk by composition.

5. If G is an n-ary description with n ≥ 0 that describes the function g and H is an (n + 2)-ary de-

scription that describes the function h then R(G,H) is an (n + 1)-ary description that describes the

(n + 1)-ary function f recursively defined by the expressions f(x1, . . . , xn,0) = g(x1, . . . , xn) and

f(x1, . . . , xn, y + 1) = h(x1, . . . , xn, y, f(x1, . . . , xn, y)). We say that f is obtained from g and h by

primitive recursion.

6. If G is an (n + 1)-ary description that describes the function g then M(G) is an n-ary description

that describes the function f defined by the expression f(x1, . . . , xn) = µy.g(x1, . . . , xn, y). This

function outputs the least value for y such that g(x1, . . . , xn, y) = 0 and for z < y, g(x1, . . . , xn, z) > 0.

We say that f is obtained from g by minimization.

Definition 3.2.2. The size of a description D is given by the number of occurrences of all the symbols Z,

S, P, C, R and M in D.

Each n-ary description describes a unique n-ary recursive function. However, several descrip-

tions can describe the same recursive function; for example if D describes a recursive function then

C(P(1,1),[D]) describes the exact same function.

In order to identify a primitive recursive function we can observe its description: a recursive function

is primitive recursive if it has a description built using the rules 1 to 5 defined in Definition 3.2.1; in

other words, a primitive recursive function is a recursive function that has a description defined by an

expression written without the symbol M.
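One possible way to represent descriptions in code is as nested tuples. In the sketch below, the tuple layout, the names size and is_primitive_description, and the constants ADD and D are assumptions made for illustration. It computes the size from Definition 3.2.2 and checks whether a description avoids the symbol M.

```python
# Assumed encoding of descriptions as nested tuples:
#   ("Z",)  ("S",)  ("P", n, i)  ("C", G, (H1, ..., Hk))  ("R", G, H)  ("M", G)

def size(d):
    """Number of occurrences of the symbols Z, S, P, C, R and M in d."""
    tag = d[0]
    if tag in ("Z", "S", "P"):
        return 1
    if tag == "C":
        return 1 + size(d[1]) + sum(size(h) for h in d[2])
    if tag == "R":
        return 1 + size(d[1]) + size(d[2])
    if tag == "M":
        return 1 + size(d[1])
    raise ValueError(f"unknown symbol {tag!r}")


def is_primitive_description(d):
    """True if d is built with rules 1 to 5 only, i.e. without M."""
    tag = d[0]
    if tag == "M":
        return False
    if tag == "C":
        return is_primitive_description(d[1]) and all(
            is_primitive_description(h) for h in d[2])
    if tag == "R":
        return is_primitive_description(d[1]) and is_primitive_description(d[2])
    return True


# R(P(1,1),C(S(),[P(3,3)])) and the redundant wrapper C(P(1,1),[D]).
ADD = ("R", ("P", 1, 1), ("C", ("S",), (("P", 3, 3),)))
D = ("R", ("Z",), ("P", 2, 2))
WRAPPED = ("C", ("P", 1, 1), (D,))
```

Here WRAPPED describes the same function as D but has two more symbols, which illustrates why several descriptions of different sizes can describe one function.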


We will now analyze a few basic primitive recursive functions, in order to deduce a description (with-

out the symbol M) for each one.

Example 3.2.1. First we will look into a simple function. Let zero ∶ N → N be the function defined by the expression zero(x) = 0. The first step is to write zero recursively. This definition can be achieved through the functions g ∶ N⁰ → N and h ∶ N² → N defined by the respective expressions:

g ≡ 0 (3.1)

h(x, y) = y (3.2)

These functions are used to define zero recursively as follows: zero(0) = g ≡ 0 and zero(x + 1) = h(x, zero(x)) = zero(x). We can easily observe through equation (3.1) that g has as description the expression Z() and, by equation (3.2), h has as description the expression P(2,2). Thus, using rule 5 of Definition 3.2.1, we obtain the expression R(Z(),P(2,2)) as a description of zero.

Example 3.2.2. Let pred ∶ N → N be the predecessor function. This function is defined by the expression pred(x) = x .− 1, where the operator .− corresponds to the operation of subtraction for the natural numbers. To make it easier to write the description of this function, we can define it recursively through the functions g ∶ N⁰ → N and h ∶ N² → N, respectively defined by the expressions

g ≡ 0 (3.3)

h(x, y) = x (3.4)

We already know a description for equation (3.3), Z(). For equation (3.4) it is easily observed that h

is the projection function of the first element of the input pair and thus its description is given by P(2,1).

Consequently, the description for the function pred, which is recursively defined by the expressions

pred(0) = g ≡ 0 and pred(x + 1) = h(x, pred(x)) = x, is given by R(Z(),P(2,1)).

Example 3.2.3. We will show that the addition function add ∶ N² → N defined by the expression add(x, y) = x + y is primitive recursive. First, we need to define the function add recursively, which can be done with the functions g ∶ N → N and h ∶ N³ → N defined by the expressions

g(x) = x (3.5)

h(x, y, z) = z + 1 (3.6)

We can now write a description for both functions in equations (3.5) and (3.6):

• For equation (3.5), it is easily understood that the expression that describes the function g is

P(1,1).

• For equation (3.6), we can also see that h can be defined as the successor of the third element of

the triple given as input, and thus its description is the expression C(S(),[P(3,3)]).


By rule 5 in Definition 3.2.1, we obtain a description for the function add: since it is recursively defined by add(x,0) = g(x) = x and add(x, y + 1) = h(x, y, add(x, y)) = add(x, y) + 1, a description is given by the expression R(P(1,1),C(S(),[P(3,3)])).

Example 3.2.4. We will now analyze the difference function natminus ∶ N² → N defined by the expression natminus(x, y) = x .− y, where .− is the operation of subtraction for the natural numbers. Let us, once again, write the expressions of the functions g ∶ N → N and h ∶ N³ → N that will be used to define natminus recursively:

g(x) = x (3.7)

h(x, y, z) = z .− 1 (3.8)

We are now in a position to write a description for both functions in equations (3.7) and (3.8):

• A description for equation (3.7) is, obviously, P(1,1).

• By observation of the definition of h, it is obvious that this function is obtained by applying the

predecessor to the third element of its argument. Thus, an expression for a description of h is

C(R(Z(),P(2,1)),[P(3,3)]).

It is now easy, by using rule 5 once more, to reach a description for this function, which is given by

the expression R(P(1,1),C(R(Z(),P(2,1)),[P(3,3)])).

Example 3.2.5. Let prod ∶ N2 → N be the product function defined by the expression prod(x, y) = x × y.

The functions g ∶ N → N and h ∶ N3 → N that will define recursively prod are defined by the respective

following expressions:

g(x) = 0 (3.9)

h(x, y, z) = z + x (3.10)

Now we write a description for both functions in equations (3.9) and (3.10):

• For equation (3.9), we observe that the function g is the 1-ary zero function, for which we already deduced a description in Example 3.2.1: R(Z(),P(2,2)).

• For equation (3.10), it is easy to observe that h is defined by the addition of the first and the third elements in its argument. We already have a description for the addition from Example 3.2.3. Thus, we can easily obtain a description for h: C(R(P(1,1),C(S(),[P(3,3)])),[P(3,1),P(3,3)]).

By rule 5 in Definition 3.2.1, and because we can define prod recursively using the expressions prod(x,0) = g(x) = 0 and prod(x, y + 1) = h(x, y, prod(x, y)) = prod(x, y) + x, we deduce a description for prod:

R(R(Z(),P(2,2)),C(R(P(1,1),C(S(),[P(3,3)])),[P(3,1),P(3,3)])).

Example 3.2.6. Finally, we define the distance function dist ∶ N² → N as dist(x, y) = ∣x − y∣. To understand how to write a description for this function, we will write it in a different way: dist(x, y) = (x .− y) + (y .− x). This means that we can define the distance as the addition of two subtractions. To be able to write the description for this function we first need to deduce the description of the function d ∶ N² → N defined by the expression d(x, y) = y .− x, which is the difference function with the arguments switched. This means that a description for d is C(R(P(1,1),C(R(Z(),P(2,1)),[P(3,3)])),[P(2,2),P(2,1)]).

Then, by the rules in Definition 3.2.1, a description for the distance function will be the expression

C(R(P(1,1),C(S(),[P(3,3)])),[R(P(1,1),C(R(Z(),P(2,1)),[P(3,3)])),C(R(P(1,1),C(R(Z(),

P(2,1)),[P(3,3)])),[P(2,2),P(2,1)])]).

We present a table summarizing the previous functions and their respective descriptions (Table 3.1).

Recalling Theorem 3.1.1 (from [30]), and since each of these functions has a description without the symbol M, for every one of them there is a program written only with sequences of assignments, if-else conditionals and sequential and/or nested for loops that computes it.
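The descriptions of Examples 3.2.1 to 3.2.6 can be checked mechanically. The sketch below assumes a nested-tuple encoding of descriptions and a hypothetical evaluator named evaluate; primitive recursion is unfolded with a single bounded for-loop, in line with Theorem 3.1.1.

```python
# Assumed encoding: ("Z",) ("S",) ("P", n, i) ("C", G, (H1, ..., Hk)) ("R", G, H)

def evaluate(d, args):
    """Value of the function described by the M-free description d on args."""
    tag = d[0]
    if tag == "Z":
        return 0
    if tag == "S":
        return args[0] + 1
    if tag == "P":
        return args[d[2] - 1]
    if tag == "C":
        return evaluate(d[1], tuple(evaluate(h, args) for h in d[2]))
    if tag == "R":
        *xs, y = args
        acc = evaluate(d[1], tuple(xs))              # f(xs, 0) = g(xs)
        for k in range(y):                           # f(xs, k+1) = h(xs, k, f(xs, k))
            acc = evaluate(d[2], (*xs, k, acc))
        return acc
    raise ValueError(f"unknown symbol {tag!r}")


def P(n, i):
    return ("P", n, i)


Z, S = ("Z",), ("S",)
ZERO = ("R", Z, P(2, 2))
PRED = ("R", Z, P(2, 1))
ADD = ("R", P(1, 1), ("C", S, (P(3, 3),)))
NATMINUS = ("R", P(1, 1), ("C", PRED, (P(3, 3),)))
PROD = ("R", ZERO, ("C", ADD, (P(3, 1), P(3, 3))))
DIST = ("C", ADD, (NATMINUS, ("C", NATMINUS, (P(2, 2), P(2, 1)))))
```

For example, evaluate(DIST, (3, 7)) first computes 3 .− 7 = 0 and 7 .− 3 = 4 and then adds them.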

3.3 The search algorithm

We now have a good definition of the primitive recursive functions and a proper notation to identify them,

so we can now proceed into implementing the search algorithm. However, before doing so, there is a

result we need to present relative to the enumeration of primitive recursive functions: the fact that said

enumeration is actually possible. The proof of that result is given in the following statement.

Proposition 3.3.1. The set of the primitive recursive functions (PRIM) is recursively enumerable.

Proof. Let us start from the notation used when defining descriptions. We can define a correspondence between the descriptions written with those symbols and the set of natural numbers. Let e be that correspondence. We have the following definition for e (from [1]):

• e(Z()) = ⟨0⟩.

• e(S()) = ⟨1⟩.

• e(P(n,i)) = ⟨2, n, i⟩.

• e(C(G,[H1, . . . ,Hk])) = ⟨3, k, l, e(G), e(H1), . . . , e(Hk)⟩, where l is the arity of the description.

• e(R(G,H)) = ⟨4, l, e(G), e(H)⟩, where l is the arity of the description.

To understand why each description corresponds to one natural number, we note that there exists a bijective function that encodes a tuple of natural numbers into one and only one natural number: the function τ ∶ ∪k>0 N^k → N in [12] such that

\[
\tau(a_1, \dots, a_k) = 2^{a_1} + 2^{a_1+a_2+1} + 2^{a_1+a_2+a_3+2} + \cdots + 2^{a_1+a_2+\cdots+a_k+k-1} - 1.
\]

However, it is not enough to perform the inverse correspondence to obtain an enumeration, since not

every natural number will be obtained from this e, i.e. this correspondence is injective but not bijective.

To resolve this problem we define that for every natural number not in the range of e, the output of the

enumeration is the constant function 0, i.e. the one denoted by Z(). This way, we have a well defined

and exhaustive enumeration for the primitive recursive functions.
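The tuple coding τ used in the proof above is easy to compute in both directions. The following Python sketch uses the hypothetical names tau and tau_inverse; the inverse reads the positions of the 1-bits in the binary expansion of x + 1.

```python
def tau(*a):
    """tau(a1..ak) = 2^a1 + 2^(a1+a2+1) + ... + 2^(a1+...+ak+k-1) - 1."""
    if not a:
        raise ValueError("tau is defined for non-empty tuples only")
    total = 0
    exponent = -1
    for ai in a:
        exponent += ai + 1          # a1, then a1+a2+1, then a1+a2+a3+2, ...
        total += 2 ** exponent
    return total - 1


def tau_inverse(x):
    """Recover (a1..ak) from the positions of the 1-bits of x + 1."""
    n = x + 1
    result = []
    previous = -1
    position = 0
    while n:
        if n & 1:
            result.append(position - previous - 1)
            previous = position
        n >>= 1
        position += 1
    return tuple(result)
```

Under the assumption that ⟨. . .⟩ in the definition of e denotes τ, e(P(2,2)) would be tau(2, 2, 2). Since codes appear as exponents inside other codes, the codes of nested descriptions grow very quickly, so this sketch only manipulates tuples of small numbers.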


Table 3.1: Primitive recursive functions and their corresponding descriptions

Function                   Description
zero(x) = 0                R(Z(),P(2,2))
pred(x) = x .− 1           R(Z(),P(2,1))
add(x, y) = x + y          R(P(1,1),C(S(),[P(3,3)]))
natminus(x, y) = x .− y    R(P(1,1),C(R(Z(),P(2,1)),[P(3,3)]))
prod(x, y) = x × y         R(R(Z(),P(2,2)),C(R(P(1,1),C(S(),[P(3,3)])),[P(3,1),P(3,3)]))
dist(x, y) = ∣x − y∣       C(R(P(1,1),C(S(),[P(3,3)])),[R(P(1,1),C(R(Z(),P(2,1)),[P(3,3)])),C(R(P(1,1),C(R(Z(),P(2,1)),[P(3,3)])),[P(2,2),P(2,1)])])

Proposition 3.3.2. PRIM ∈ Ex.

Proof. To prove that the primitive recursive functions are Ex-identifiable we simply present a scientist

that Ex-identifies this set of functions (Algorithm 5, in [10] and [9]). Let ρ ∶ N → PRIM be an enumer-

ation for the primitive recursive functions, ρi denote the function obtained from ρ(i) and ψ the primitive

recursive function explained by a text in the canonical form with prefix σ.

Algorithm 5 Scientist that Ex-identifies PRIM

function M(σ ∶ SEG) ∶ N
    i ∶= 0; k ∶= 0; n ∶= length of σ
    while k < n do
        if ρi(k) = ψ(k) then
            k ∶= k + 1
        else
            k ∶= 0; i ∶= i + 1
        end if
    end while
    return i
end function

This algorithm proceeds in the following way: it searches for the least i ∈ N such that, for every k < n, the output of ρi applied to k has the same value as the output of ψ applied to k (which is obtainable from σ). Since ρ is an enumeration of the set PRIM, this algorithm will not overlook any primitive recursive function, and since σ is a prefix of a text that explains a primitive recursive function, this algorithm will eventually halt. Moreover, every index smaller than the least index of ψ in ρ is rejected once σ is long enough, so from that point on the output of M no longer changes, as required for Ex-identification.
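The behaviour of Algorithm 5 can be tried out on a toy class of functions. In the Python sketch below, rho is NOT an enumeration of PRIM: it is a stand-in, assumed only for illustration, that enumerates the functions x ↦ a·x + b through the Cantor pairing. The names unpair, rho and scientist are hypothetical.

```python
from math import isqrt


def unpair(i):
    """Inverse of the Cantor pairing: maps i to a pair (a, b)."""
    w = (isqrt(8 * i + 1) - 1) // 2
    b = i - w * (w + 1) // 2
    return w - b, b


def rho(i):
    """Toy enumeration (an assumption of this sketch): rho_i(x) = a*x + b."""
    a, b = unpair(i)
    return lambda x: a * x + b


def scientist(sigma):
    """Least i such that rho_i agrees with sigma[k] = psi(k) for all k < len(sigma).

    As in Algorithm 5, this halts only if sigma is a prefix of the values of
    some function in the enumerated class.
    """
    i, k, n = 0, 0, len(sigma)
    while k < n:
        if rho(i)(k) == sigma[k]:
            k += 1
        else:
            k = 0
            i += 1
    return i
```

Feeding longer and longer prefixes of ψ(x) = 2x + 1 shows the conjecture changing at first and then stabilizing: the prefix [1] yields the index of the constant function 1, while [1, 3] and every longer prefix yield the index of 2x + 1.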

We will therefore focus our attention on the class of primitive recursive functions and on developing a scientist that Ex-identifies that class.

Since we know that the primitive recursive functions can be enumerated, we can base the implementation of the search program on such an enumeration. That program is shown in Algorithm 6, which we now explain.

Let Functions be a function that, given a natural number i, lists the descriptions with size i. Then, for a certain size, we list the descriptions with that size and, for each of these descriptions, we check whether the result of the function defined by the description applied to each element in the inp list is equal to the corresponding element in out (both inp and out are provided as input to the search procedure in Algorithm 6). If it is equal for all these elements, then we have found our function and we terminate the program by returning that description. If one of the comparisons fails, then we proceed to the following description in the list, and if no description passes every comparison, we construct another list whose descriptions have the size of the previous ones incremented by one. We thus have a well-defined search algorithm whose computability depends on the implementation of the listing function Functions.


Algorithm 6 Search algorithm for a primitive recursive function given the input/output values

Input: inp as the list of tuples with the input values; out as the list of integers with the corresponding output values
Output: a description of a primitive recursive function

procedure SEARCH(inp,out)
    i ∶= 1
    for i ∈ N do:
        for f in Functions(i) do
            if f.arity != length(inp[0]) then
                t ∶= False
                continue
            else
                t ∶= True
                j ∶= 1
                for j < length(inp) do
                    t ∶= (f(inp[j]) == out[j])
                    if t is False then
                        break
                    end if
                end for
                if t is True then
                    res ∶= f
                    break
                end if
            end if
        end for
        if t is True then
            break
        end if
    end for
    return res
end procedure

3.4 A first enumeration

An implementation of the function Functions, which given a size lists the descriptions of that size, and of the necessary auxiliary functions is presented in Algorithm 7 (adapted from [13]):

• Functions is the main function. It receives the descriptions' size and outputs a list with all the possible descriptions with that size. When the size is 1, it yields the descriptions Z(), S() and the descriptions for all projections with arity up to 3. If the size is 0 (or negative) it yields nothing, and for sizes greater than 1 it calls two auxiliary functions to construct the descriptions: Compositions and Recursions, which we will describe later.

• Functions With Maxsize will return a list with all the descriptions with every size from 1 to the

input value, together with the difference between the input value and the size of each description.

• Composition Function Lists will receive the length that the outputted list of descriptions should have, the size that all those descriptions combined must have and, optionally, the arity for the descriptions in the list; then it will construct all the combinations of descriptions that verify the given length and size and such that their arity is the same (if an input for arity is given then only the functions with that arity will be outputted).

• Compositions will yield every possible description whose first symbol is C that has the given size. These descriptions will be constructed using the lists obtained with Composition Function Lists as the second argument of the description and such that the arity of the function in the first argument is the same as the length of the list. Moreover, the sum of the sizes of all the descriptions in both arguments must be the given size decremented by one unit. In this case, we do not need to worry about the arity of the main description since this procedure will cover every description of every possible arity for the given size.

• Recursions will yield every possible description whose first symbol is R that has the given size and where the second description in its argument has as arity the arity of the first description in the argument plus two. Moreover, the sum of the sizes of these two descriptions must be the given size minus one. Once again, we don't need to be concerned about the arity of the main description for the same reason: this procedure will cover all descriptions of every possible arity for the given size.

Algorithm 7 Construction of a function list composed by functions with a given description size

function FUNCTIONS(size)
    if size ≤ 0 then pass
    else if size == 1 then
        yield Z(); yield S()
        for i ∶= 1 to 3 do
            for j ∶= 1 to i do yield P(i,j)
            end for
        end for
    else
        for composition in Compositions(size) do yield composition
        end for
        for recursion in Recursions(size) do yield recursion
        end for
    end if
end function

function FUNCTIONS WITH MAXSIZE(size)
    for subsize ∶= 1 to size do
        for func in Functions(subsize) do yield func, size-subsize
        end for
    end for
end function

function COMPOSITION FUNCTION LISTS(length, size, arity = None)
    if length = 0 then
        if size = 0 then yield []
        else pass
        end if
    else
        for function, remaining size in Functions With Maxsize(size) do
            if arity = function.arity or arity == None then
                for sublist in Composition Function Lists(length-1, remaining size, function.arity) do
                    yield [function] + sublist
                end for
            end if
        end for
    end if
end function

function COMPOSITIONS(size)
    for g, after g size in Functions With Maxsize(size−1) do
        if g.arity > 0 then
            for function list in Composition Function Lists(g.arity, after g size) do
                yield C(g, function list)
            end for
        end if
    end for
end function

function RECURSIONS(size)
    for function, size2 in Functions With Maxsize(size-1) do
        for function2 in Functions(size2) do
            if function2.arity == function.arity + 2 then yield R(function, function2)
            end if
        end for
    end for
end function

With this in mind, it is easy to understand the behaviour of Functions and how it yields the descriptions of a given size. However, the peculiarity of the projection function still needs to be explained. Since the arity of the projection symbol does not affect its size, we needed to establish an upper bound for this parameter in order to be certain that the computation of the list of descriptions halts. We decided, for now, to set that bound at 3 because our goal was to explore this algorithm mainly with unary and binary functions, and it is possible to express the most basic functions of those arities using only projections of arity up to 3 (as can be seen in Table 3.1); if needed, this bound can be raised or lowered. This decision aims to keep the length of the list of descriptions manageable (since its construction is combinatorial), thus reducing the time needed to construct that list.

In Figure 3.1 we can see examples of the lists of functions whose descriptions have size from 1 to 4. The construction process for the descriptions, which combines smaller descriptions to produce bigger ones, is visible. We can also see that the size of these lists increases very fast, which is explained by the fact that the descriptions are constructed combinatorially. Note that there are no descriptions with size 2. This happens because the symbols used to construct descriptions by combining other descriptions (namely C and R) always need at least two more symbols, and thus the descriptions we construct using these symbols have size at least 3.

We now have a functional scientist for the primitive recursive functions with arity up to 2 and thus are

within the conditions to experiment with the search algorithm.
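The search procedure and the first enumeration can be sketched together in Python with generators. This is an illustrative sketch, not the thesis implementation: descriptions are assumed to be nested tuples, and the helper names (evaluate, arity, search and the snake_case versions of the functions of Algorithm 7) are chosen here for convenience.

```python
from itertools import count

# Assumed encoding: ("Z",) ("S",) ("P", n, i) ("C", G, (H1, ..., Hk)) ("R", G, H)

def evaluate(d, args):
    tag = d[0]
    if tag == "Z":
        return 0
    if tag == "S":
        return args[0] + 1
    if tag == "P":
        return args[d[2] - 1]
    if tag == "C":
        return evaluate(d[1], tuple(evaluate(h, args) for h in d[2]))
    *xs, y = args                                    # tag == "R"
    acc = evaluate(d[1], tuple(xs))
    for k in range(y):
        acc = evaluate(d[2], (*xs, k, acc))
    return acc

def arity(d):
    tag = d[0]
    if tag == "Z":
        return 0
    if tag == "S":
        return 1
    if tag == "P":
        return d[1]
    if tag == "C":
        return arity(d[2][0])                        # lists are never empty here
    return arity(d[1]) + 1                           # tag == "R"

def functions(size):
    if size == 1:
        yield ("Z",)
        yield ("S",)
        for n in range(1, 4):                        # projections of arity up to 3
            for i in range(1, n + 1):
                yield ("P", n, i)
    elif size > 1:
        yield from compositions(size)
        yield from recursions(size)

def functions_with_maxsize(size):
    for subsize in range(1, size + 1):
        for f in functions(subsize):
            yield f, size - subsize

def composition_function_lists(length, size, ar=None):
    if length == 0:
        if size == 0:
            yield ()
        return
    for f, remaining in functions_with_maxsize(size):
        if ar is None or arity(f) == ar:
            for sub in composition_function_lists(length - 1, remaining, arity(f)):
                yield (f,) + sub

def compositions(size):
    for g, after_g in functions_with_maxsize(size - 1):
        if arity(g) > 0:
            for hs in composition_function_lists(arity(g), after_g):
                yield ("C", g, hs)

def recursions(size):
    for g, size2 in functions_with_maxsize(size - 1):
        for h in functions(size2):
            if arity(h) == arity(g) + 2:
                yield ("R", g, h)

def search(inp, out):
    """First description, by increasing size, that matches all input/output pairs."""
    for size in count(1):
        for f in functions(size):
            if arity(f) != len(inp[0]):
                continue
            if all(evaluate(f, x) == y for x, y in zip(inp, out)):
                return f
```

For example, calling search([(0, 0), (1, 2), (2, 1), (3, 3)], [0, 3, 3, 6]) walks through the descriptions of sizes 1 to 5 and stops at one that computes the addition on the given data. Like Algorithm 6, the sketch does not terminate if no description ever matches.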

3.5 An improved enumeration

Having a functioning enumeration is not all that we want. In fact, we want our search to be as efficient

as possible. Thus, our next step is to develop an enumeration that streamlines the search procedure.


Figure 3.1: List of all the functions whose descriptions have the referred size

It is possible to define an enumeration of the primitive recursive functions without repetition starting from one that is exhaustive (proofs can be found in [20] and [26]); however, constructing this enumeration is “highly inefficient” (see [20]), so we will focus instead on improving the previously implemented enumeration into one that is more efficient.

In our work, we will try to identify a primitive recursive function through input data and the corresponding output data, which makes it possible for us to know a priori the arity of the function we want. This means that we can make some changes in the search algorithm (Algorithm 6) and, more importantly, in the listing of the primitive recursive functions (Algorithm 7) to take into account the arity of the function we want to discover, listing only the functions of that arity and thus performing the search only in that set of functions. This search procedure can be observed in Algorithm 8.

If we compare the Algorithms 6 and 8, we see that the changes are simple to identify: instead of

checking if the arity is the correct one after we have the list of functions (and discarding those that

don’t have it), we introduce this parameter into the arguments of the enumeration function Functions so

that the functions we are going to search through are already those with the correct arity. This implies a

bigger change in the algorithm that performs the enumeration of the functions, which we see in Algorithm

9. Besides this improvement, we can still reduce some redundancies in the enumeration of the primitive

recursive functions. If we prevent the existence of repetitions in the list of functions in the second

argument of a description with main symbol C, we will still have exhaustiveness when it comes to listing

descriptions for every primitive recursive function (within a certain arity, since that restriction is already

being taken into account). That prevention of repetition is made by the function inList used as auxiliary

function in the definition of Composition Function Lists, and defined in Algorithm 10.

Algorithm 8 Search algorithm for a primitive recursive function given the input/output values, taking into account the arity

Input: inp as the list of tuples with the input values; out as the list of integers with the corresponding output values
Output: a description of a primitive recursive function

procedure SEARCH(inp,out)
    i ∶= 1
    a ∶= length(inp[0])
    for i ∈ N do:
        for f in FUNCTIONS(i, a) do
            t ∶= True
            j ∶= 1
            for j < length of inp do
                t ∶= (f(inp[j]) == out[j])
                if t is False then
                    break
                end if
            end for
            if t is True then
                res ∶= f
                break
            end if
        end for
        if t is True then
            break
        end if
    end for
    return res
end procedure

Analyzing Algorithm 10, we can see that inList compares the description we want to add to a list of descriptions with every description already in that list. Each comparison is made primarily through the main symbol of the descriptions we are comparing:

• if they are not the same, then the descriptions are different;

• if they are the same and the main symbol is S or Z, then they are equal;

• if they are the same and the main symbol is P, then we need to check if the arguments n and i are

the same. If they both are, then the descriptions are equal but if one of them is not the same then

they are different;

• if they are the same and the main symbol is R, then we will apply recursively the function same to

the first elements of the two descriptions and then to the second ones as well;

• if they are the same and the main symbol is C, then we will apply the function same to the first

arguments of the description, we will check if the length of the list of functions in the second

argument of both descriptions is the same and we will apply the function same to the elements of

both lists pair by pair.
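The case analysis above translates directly into a recursive comparison. The Python sketch below assumes descriptions encoded as nested tuples and uses the hypothetical names same and in_list for the two functions.

```python
# Assumed encoding: ("Z",) ("S",) ("P", n, i) ("C", G, (H1, ..., Hk)) ("R", G, H)

def same(d1, d2):
    """Structural equality of two descriptions, following the cases above."""
    if d1[0] != d2[0]:                     # different main symbols
        return False
    tag = d1[0]
    if tag in ("Z", "S"):
        return True
    if tag == "P":                         # compare n and i
        return d1[1] == d2[1] and d1[2] == d2[2]
    if tag == "R":                         # compare both arguments recursively
        return same(d1[1], d2[1]) and same(d1[2], d2[2])
    if tag == "C":                         # first arguments, lengths, then pairwise
        return (same(d1[1], d2[1])
                and len(d1[2]) == len(d2[2])
                and all(same(a, b) for a, b in zip(d1[2], d2[2])))
    raise ValueError(f"unknown symbol {tag!r}")


def in_list(d, descriptions):
    """True if a description equal to d already occurs in descriptions."""
    return any(same(d, other) for other in descriptions)
```

With this encoding the comparison coincides with Python's built-in tuple equality; it is spelled out here only to mirror the cases listed above.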

Algorithm 9 Construction of a function list composed by functions with a given description size and arity

function FUNCTIONS(size,ar)
    if size ≤ 0 then pass
    else if size == 1 then
        if ar == 0 then yield Z()
        else if ar == 1 then yield S()
        end if
        for i ∶= 1 to ar do yield P(ar,i)
        end for
    else
        for composition in Compositions(size,ar) do yield composition
        end for
        for recursion in Recursions(size,ar) do yield recursion
        end for
    end if
end function

function FUNCTIONS WITH MAXSIZE(size,ar)
    for subsize ∶= 1 to size do
        for func in Functions(subsize,ar) do yield func, size-subsize
        end for
    end for
end function

function COMPOSITION FUNCTION LISTS(length, size, ar)
    if length = 0 then
        if size = 0 then yield []
        else pass
        end if
    else
        for function, remaining size in Functions With Maxsize(size,ar) do
            for sublist in Composition Function Lists(length-1, remaining size, ar) do
                if not inList(function,sublist) then yield [function] + sublist
                end if
            end for
        end for
    end if
end function

function COMPOSITIONS(size,ar)
    for i ∶= 1 to size−2 do
        for j ∶= 1 to size−2 do
            for function list in Composition Function Lists(i, j, ar) do
                for g in Functions(size−j−1, i) do yield C(g, function list)
                end for
            end for
        end for
    end for
end function

function RECURSIONS(size,ar)
    for function, size2 in Functions With Maxsize(size−1, ar−1) do
        for function2 in Functions(size2, ar+1) do yield R(function, function2)
        end for
    end for
end function

Algorithm 10 Function that indicates if a description is already in a list of descriptions

function INLIST(obj,objlist)
    b ∶= False
    for obja in objlist do
        b = same(obj,obja)
        if b == True then break
        end if
    end for
    return b
end function

function SAME(obj1,obj2)
    if type(obj1) != type(obj2) then return False
    else
        if obj1 is S or Z then return True
        else if obj1 is P then
            if obj1.i != obj2.i or obj1.n != obj2.n then return False
            else return True
            end if
        else if obj1 is C or obj1 is R then
            if same(obj1.g,obj2.g) and same(obj1.h,obj2.h) then return True
            else return False
            end if
        end if
    end if
end function

Returning to Algorithm 9, we will describe the changes we made from Algorithm 7. A general change happens in the arguments of the functions: every function will have as argument the arity of the function we want to find. This will allow us to eliminate any operation regarding the verification of conditions concerning the arity of the functions we are listing. We will now move on to analyzing the changes specific to each function:

• In Functions we observe that, since we know the arity of the function we want, we don’t need

to list every description with size 1. Thus, if the arity is 0 we yield Z() and if it is 1 we yield S().

Besides this, but still for size 1, we will yield every description of every projection with the arity

given, which means that we will no longer need a constant boundary for the arity of the projection

function like we needed in Algorithm 7 (the boundary was 3 as explained before).

• In Functions With Maxsize there was no need to make further changes.

• For Composition Function Lists, the arity argument is now mandatory instead of optional. That means that when we make the recursive call of the function, we know which value for the arity we will provide, and thus we don't need to check whether the arity of the function returned by Functions With Maxsize is the correct one. Furthermore, we also introduced the function inList to prevent the redundancies, as explained previously.

• Compositions is the function that demanded the most changes. Because Functions With Maxsize now needs as input a value for the arity of the function, we can no longer find the function g first, since the arity of g depends on the length of the list of functions h, which we still don't know; this means that we need to construct that list first. For that we need to call the function


Composition Function Lists; however, that function demands an input for its length and its size.

We can deduce upper boundaries for those attributes: the maximum size for that list will be the

size of the main description (which from now on we will only call size) minus the size of the symbol

C which stands at the head of the description (1) and minus the minimum size of the description of

function g (which has minimum size 1). Thus, the maximum size of the list of functions is size − 2.

Regarding the length of the function list, since its maximum size is size−2, its maximum length will

happen when every description is size 1 and so the maximum length will also be size − 2. For any

list of descriptions obtained through the previous procedure, we will combine it with every possible

and adequate description for function g. That will be done by calling function Functions, given as

size input size−j−1, where j is the value of the sum of the sizes of the descriptions in the list (since

size − j − 1 is the size the description for g has to have so that the sum of the sizes corresponds

to the initial size given as input), and as arity input the length of the list. In the end it yields the

description obtained by combining all these processes.

• In Recursions, knowing the arity of the description a priori also simplified the procedure, because we know that the arity of the first description is the arity of the main description decremented by one, while the arity of the second description is that of the main description incremented by one, and so we don't need to check the arities of the descriptions generated in Functions With Maxsize and Functions.

As an example, we have in Figure 3.2 a few lists of descriptions with small size and arity.

Analyzing these lists, and comparing them with the lists in Figure 3.1, there are some things to observe. Firstly, both lists agree on the nonexistence of descriptions with size 2 (the reason for which has already been explored). Additionally, if we aggregate the descriptions by size, so that they are comparable with the lists in Figure 3.1, we see that there are more descriptions for a given size in this list than in the previous one. That is a result of the bound on the arity of the projection description in the first enumeration; since the bound in the second enumeration is the arity of the description, we have a higher bound (or an equal one for arities less than or equal to 3), which results in more descriptions. The fact that there are more descriptions in the second enumeration for each size does not make the subsequent search less efficient, as it may seem at first glance. In fact, since we allow projections with arity bigger than 3, we will, in theory and in some cases, arrive at a description of a function with smaller size than the one we would obtain from performing the search with the first enumeration. Moreover, it is noticeable that there are no repetitions among the descriptions in each list belonging to a composition description. This is a direct result of applying the function inList in the enumeration algorithm presented in Algorithm 9.

We now have a (theoretically) improved enumeration with which to perform the search for a primitive recursive function through the corresponding search algorithm (Algorithm 8).


Figure 3.2: Lists of descriptions with the referred size and arities. Panels: (a) size 1 and arity from 1 to 4; (b) size 2 and arity from 1 to 4; (c) size 3 and arity from 1 to 3; (d) size 3 and arity 4; (e) size 4 and arity from 1 to 3; (f) size 4 and arity 4.

3.6 From description to code

Finding a description that correctly identifies a function relating the given inputs to the corresponding outputs is not the end of our work. Our goal is not only to find such a function but also to be able to predict outputs for other input values. To do so, we need to find the code of a program that computes the function described by the description found.

We have seen before in Theorem 3.1.1 that the set PRIM is exactly the set of functions that have loop-programs (see [30]). This means that for every primitive recursive function it is possible to write a program using only a sequence of assignments, if-clauses and sequential and nested for-loops. Thus, our next step is to obtain such a program, in Python, for the description found. We developed a program that does so. It uses an auxiliary list of variables so that the assignments needed for the generated program to work correctly can be performed. By construction, these variables are always of one of two types: tuples or integers. This will be important to keep in mind later, when we define how assignments and operations on the variables are made. The program begins by writing to a text file the line that defines a function in Python, “def function(x):”, and then, on the next line and with the correct indentation, the assignment of the input of the function to the first element of the list of variables: “a0 = x”. This list of variables is updated throughout the execution of the program so that the correct variable is used every time, and line breaks and indentation are likewise emitted as required. From then on, the program's behaviour depends on which symbol of the description it reads:

• if the symbol is Z(), then it assigns the value 0 to the correct variable.

• if the symbol is S(), then a new variable is assigned the successor of the adequate variable. However, before doing this operation we need to make sure that the variable is an integer and not a tuple; if it is a tuple then it has exactly one element (due to the arities of the operations in question) and the new variable is the successor of the first element of the tuple.

• if the symbol is P(n,i), then the value of a new variable is the i-th element of the appropriate variable. Conversely to the previous case, we need to make sure that the variable in question is a tuple; if it is not, then it is an integer and so, before performing the projection, we need to transform the integer variable into a tuple variable and only then perform the projection.

• if the symbol is C(G,[H 1, ..., H k]), then the program writes the code that computes the output of the functions with descriptions H 1,...,H k, assigns those values to the correct variables, creates a tuple variable composed of the outputs of these k functions and then assigns to the correct variable the value computed by applying the function with description G to that tuple.

• if the symbol is R(G,H), then the program first makes sure that the input variable is a tuple, and then assigns to a new variable the tuple that results from deleting the last element of the input tuple. It then writes the code that computes the function with description G. Finally, it starts a for-loop that is executed as many times as the value of the last element of the input tuple, and inside it writes the code of the function with description H with the correct input: the tuple obtained by deleting the last element of the input tuple, with the number of the current iteration of the for-loop and one further variable appended (this variable starts as the output of the function with description G and is updated with the result of each execution of the loop body).

The last two possibilities are, of course, recursive, in the sense that when writing the lines of code for

the functions that are arguments of the symbols in question it will call the main function. In the end, this

program will write a line of code that allows the correct variable to be returned.
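The semantics that these translation rules have to preserve can be sketched as a direct interpreter. This is a minimal sketch and not the code generator described above: the encoding of descriptions as nested Python tuples (`("Z",)`, `("S",)`, `("P", n, i)`, `("C", G, [H_1, ..., H_k])`, `("R", G, H)`) and the name `evaluate` are assumptions made purely for illustration.

```python
def evaluate(desc, args):
    """Evaluate a primitive recursive description on a tuple of naturals.

    The tuple encoding of descriptions is hypothetical (not from the thesis).
    `args` must be a tuple, even for unary functions, e.g. (5,).
    """
    tag = desc[0]
    if tag == "Z":                       # constant zero
        return 0
    if tag == "S":                       # successor of the single argument
        return args[0] + 1
    if tag == "P":                       # P(n, i): i-th argument, 1-indexed
        _, n, i = desc
        return args[i - 1]
    if tag == "C":                       # C(G, [H_1, ..., H_k])
        _, g, hs = desc
        inner = tuple(evaluate(h, args) for h in hs)
        return evaluate(g, inner)
    if tag == "R":                       # R(G, H): recursion on the last argument
        _, g, h = desc
        rest, last = args[:-1], args[-1]
        acc = evaluate(g, rest)          # value when the last argument is 0
        for iteration in range(last):    # one loop pass per unit of `last`
            acc = evaluate(h, rest + (iteration, acc))
        return acc
    raise ValueError(f"unknown symbol: {tag!r}")


# R(Z(), P(2,1)) computes the predecessor; the second description is addition.
PRED = ("R", ("Z",), ("P", 2, 1))
ADD = ("R", ("P", 1, 1), ("C", ("S",), [("P", 3, 3)]))
```

For instance, `evaluate(PRED, (5,))` walks the loop five times and returns 4, and `evaluate(ADD, (3, 4))` returns 7. The loop in the `R` branch mirrors the for-loop of the generated programs: the accumulator starts as the output of G and is replaced by the output of H on each pass.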

We present some examples of code obtained from descriptions using this program. Note that the descriptions given as examples do not necessarily define one of the primitive recursive functions that we identified previously and summarized in Table 3.1; we present very simple descriptions that allow us to easily understand the operation of the program.

Example 3.6.1. We begin with the simplest of the examples: the code for the description Z(). The code output by the program is given in Figure 3.3.

Figure 3.3: Code for function with description Z()

We can see that the first two lines are the common beginning for the codes of all descriptions. Line

3 performs the attribution related to the symbol Z in the description and line 4 corresponds to the return

of the correct variable, in this case a1.

Example 3.6.2. The next example is also very simple, since it is the code of a program that computes

the function with description S(). That code can be seen in Figure 3.4.

Figure 3.4: Code for function with description S()

Once again, the first two lines of code correspond to the standard beginning of every program written following this method. In line 3 we see the verification of the type of the variable a0: if it is not an integer then it is a (unary) tuple and thus the instruction is to add one to the first element of the variable. If it is an integer, then we see in line 4 the instruction to add one to the variable a0. Lastly, line 5 has the instruction to return the variable a1.

Example 3.6.3. We will now see the code of a program that computes the function with description

P(3,1), which is present in Figure 3.5.

Figure 3.5: Code for function with description P(3,1)


Here, we see that in line 3, the instruction is to see if a0 is not an integer and then, if it isn’t, to

attribute to the new variable a1 the first element of the variable a0, which has the same value as the

function’s input. If a0 is an integer, then we transform it into a tuple and only then select the intended

element.
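The three programs discussed so far might look as follows. This is a sketch consistent with the line-by-line descriptions in Examples 3.6.1 to 3.6.3, and the text actually written by the program may differ. The use of `isinstance` for the type test and the suffixes `_z`, `_s` and `_p31` (added only so that the three functions can coexist in one file; the program always writes `function`) are assumptions.

```python
# Sketch for the description Z()
def function_z(x):
    a0 = x
    a1 = 0
    return a1


# Sketch for the description S()
def function_s(x):
    a0 = x
    if not isinstance(a0, int): a1 = a0[0] + 1   # unary tuple: use its element
    else: a1 = a0 + 1                            # plain integer
    return a1


# Sketch for the description P(3,1)
def function_p31(x):
    a0 = x
    if not isinstance(a0, int): a1 = a0[0]       # tuple: select first element
    else: a1 = (a0,)[0]                          # integer: wrap, then select
    return a1
```

Each function keeps the common two-line beginning, the symbol-specific assignment, and the final return of the last variable created.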

Example 3.6.4. Let's now analyze the resulting code for an example of a function with description

C(P(3,2),[P(1,1),S(),S()]) (Figure 3.6).

Figure 3.6: Code for function with description C(P(3,2),[P(1,1),S(),S()])

We can check that lines 3 to 8 correspond to the instructions to perform the computation for the

functions with description P(1,1), S() and S(), which are the descriptions in the list in the second part

of the argument of the symbol C in the original description. Then in line 9 we join these three variables

in a single tuple so that we can use it as input for computing the function whose description is in the first

part of the argument of the symbol C of the description (P(3,2)), and then performing said computation

in lines 10 and 11. Finally, we have the instruction that returns the value of the correct variable (a5) in

line 12.
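A sketch of a twelve-line program matching this description is given below. It is a reconstruction from the prose only: the variable names follow the a0, a1, ... convention of the text, while the exact form of the type tests is an assumption.

```python
def function(x):
    a0 = x
    if not isinstance(a0, int): a1 = a0[0]       # P(1,1)
    else: a1 = (a0,)[0]
    if not isinstance(a0, int): a2 = a0[0] + 1   # first S()
    else: a2 = a0 + 1
    if not isinstance(a0, int): a3 = a0[0] + 1   # second S()
    else: a3 = a0 + 1
    a4 = (a1, a2, a3)                            # tuple of the three outputs
    if not isinstance(a4, int): a5 = a4[1]       # P(3,2) applied to that tuple
    else: a5 = (a4,)[1]                          # never reached: a4 is a tuple
    return a5
```

On input 4 the three inner functions produce 4, 5 and 5, so the second projection returns 5, the successor of the input.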

Example 3.6.5. In Figure 3.7 we have the code output for the description R(Z(),P(2,1)).

Figure 3.7: Code for function with description R(Z(),P(2,1))

In line 3 we make sure that from that point on the input is a tuple. Next, we have the attribution of a

new variable as a tuple with the elements of the input tuple except the last one. In line 5, we perform the

attribution respective to the function with description Z(), which is the first argument of the description

symbol R. Then we begin a for loop that will be executed as many times as the value of the last element of the

input tuple. Inside the loop, in line 7 we perform the merger of the tuples in question and in lines 8 and

9 we perform the computation of the function whose description is the second part of the argument of

the original description, having the update of the correct variable in line 10. Lastly, we return the value

of the correct variable in line 11.
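The eleven lines described in this example can be sketched as follows. As before, this is a reconstruction from the prose rather than the program's verbatim output, and the concrete variable numbering and type tests are assumptions.

```python
def function(x):
    a0 = x
    if isinstance(a0, int): a0 = (a0,)           # make sure the input is a tuple
    a1 = a0[:-1]                                 # input without its last element
    a2 = 0                                       # Z(), the base case
    for i in range(a0[-1]):                      # repeat "last element" times
        a3 = a1 + (i, a2)                        # merge: rest, iteration, accumulator
        if not isinstance(a3, int): a4 = a3[0]   # P(2,1)
        else: a4 = (a3,)[0]
        a2 = a4                                  # update the accumulator
    return a2
```

Since the step function returns the iteration counter, the value left in `a2` after the loop is the last counter value, so this program computes the predecessor: `function(5)` returns 4 and `function(0)` returns 0.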


Chapter 4

A Restriction to E

We will now explore another way of attacking the problem of searching for expressions that describe natural laws. So far we have taken the primitive recursive functions as the learning environment of our scientist. However, an environment of this type may not be necessary, since we suspect that every natural law can be expressed in a much simpler way: as an elementary function.

4.1 Elementary functions

We have already defined the set E of elementary functions in Definition 3.1.2. In fact, there are functions

that do not need to be expressed in the basis set of E because they themselves can be deduced by

applying the composition, bounded sum and/or bounded product operators to some simpler functions;

these functions are the addition and product operations. They can be constructed from other elementary functions as follows:

• $\mathrm{prod}(x, y) = xy = \sum_{z<y} x$

• $\mathrm{add}(x, y) = x + y = \sum_{z<y} \big(((x \dot{-} z) \dot{-} xz) + 1\big)$

For the product it is easy to understand what happens: we sum $x$ a total of $y$ times. The addition is a little trickier; we want to add 1 to $x$ a total of $y$ times. To do so, we perform a sum bounded by $y$ such that for $z = 0$ we have $((x \dot{-} 0) \dot{-} 0) + 1 = x + 1$ and for $0 < z < y$ we have $((x \dot{-} z) \dot{-} xz) + 1 = 0 + 1 = 1$, since $xz \geq x \dot{-} z$ for any $1 \leq z < y$. This way, for $y \geq 1$, we obtain $x + y$.

We show that the following functions are elementary by proving that they are the composition of

elementary functions:

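• $\exp(x, y) = x^y = \prod_{z<y} x$

• $\mathrm{pred}(x) = x \dot{-} 1 = x \dot{-} ((x + 1) \dot{-} x)$

• $\mathrm{sg}(x) = \begin{cases} 0, & x = 0 \\ 1, & x \neq 0 \end{cases} = x \dot{-} (x \dot{-} 1)$

• $\overline{\mathrm{sg}}(x) = 1 \dot{-} \mathrm{sg}(x)$

• $\mathrm{dist}(x, y) = |x - y| = (x \dot{-} y) + (y \dot{-} x)$

• $\mathrm{fact}(x) = x! = \prod_{z<x} (z + 1)$

• $\min(x, y) = x \dot{-} (x \dot{-} y)$

• $\max(x, y) = (x \dot{-} y) + y$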
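The identities above are easy to check numerically. The following sketch writes each function exactly as in the list, using only natural subtraction, bounded sums and bounded products. The Python names (`monus`, `times`, `cosg`, `minimum`, `maximum`) are chosen here for illustration and do not come from the thesis.

```python
from math import prod


def monus(x, y):
    """Natural (truncated) subtraction: x - y, or 0 when y > x."""
    return max(x - y, 0)


def times(x, y):
    return sum(x for _ in range(y))                 # bounded sum of x


def add(x, y):
    # Equals x + y whenever y >= 1; the sum is empty (hence 0) for y = 0.
    return sum(monus(monus(x, z), x * z) + 1 for z in range(y))


def exp(x, y):
    return prod(x for _ in range(y))                # bounded product of x


def pred(x):
    return monus(x, monus(x + 1, x))


def sg(x):
    return monus(x, monus(x, 1))


def cosg(x):
    return monus(1, sg(x))                          # the complement of sg


def dist(x, y):
    return monus(x, y) + monus(y, x)


def fact(x):
    return prod(z + 1 for z in range(x))


def minimum(x, y):
    return monus(x, monus(x, y))


def maximum(x, y):
    return monus(x, y) + y
```

Comparing these definitions with Python's built-in `*`, `**`, `abs`, `min` and `max` on a small grid of naturals confirms each identity; the empty product gives `exp(x, 0) == 1` and `fact(0) == 1`.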

There is another operator we need to define, the bounded minimalisation.

Definition 4.1.1. Bounded Minimalisation

Let $f$ be an $(n + 1)$-ary function such that $f \in \mathcal{R}$. Let $g$ be a new $(n + 1)$-ary function defined by the expression

$$g(x_1, \dots, x_n, y) = \mu z < y\,(f(x_1, \dots, x_n, z) = 0) = \begin{cases} \text{the least } z < y \text{ s.t. } f(x_1, \dots, x_n, z) = 0, & \text{if such } z \text{ exists} \\ y, & \text{otherwise} \end{cases}$$

Then $g \in \mathcal{R}$. We call the operator $\mu z < y$ a bounded minimalisation.

Lemma 4.1.1. (see [12]) Let $f$ be an $(n + 1)$-ary function such that $f \in \mathcal{E}$. Let $g$ be a new function, also $(n + 1)$-ary, defined by the expression $g(x_1, \dots, x_n, y) = \mu z < y\,(f(x_1, \dots, x_n, z) = 0)$. Then $g \in \mathcal{E}$, i.e. $\mathcal{E}$ is closed under bounded minimalisation.

In fact, we can write the bounded minimalisation as follows:

$$\mu z < y\,(f(x_1, \dots, x_n, z) = 0) = \sum_{v<y} \prod_{u \leq v} \mathrm{sg}(f(x_1, \dots, x_n, u)) = \sum_{v<y} \prod_{u<v+1} \mathrm{sg}(f(x_1, \dots, x_n, u)).$$

With this in mind, we can prove that the quotient function for integers is also elementary:

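$$\begin{aligned}
\mathrm{qt}(x, y) &= \begin{cases} \lfloor y / x \rfloor, & x \neq 0 \\ 0, & x = 0 \end{cases} \\
&= \mu z \leq y\,(x = 0 \text{ or } x(z + 1) > y) \\
&= \mu z \leq y\,(x = 0 \text{ or } \overline{\mathrm{sg}}(x(z + 1) \dot{-} y) = 0) \\
&= \mu z \leq y\,(x \times \overline{\mathrm{sg}}(x(z + 1) \dot{-} y) = 0) \\
&= \sum_{v<y+1} \prod_{u<v+1} \mathrm{sg}\big(x \times \overline{\mathrm{sg}}(x(u + 1) \dot{-} y)\big)
\end{aligned}$$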
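The rewriting of bounded minimalisation as a sum of products, and the resulting expression for the quotient, can be checked with a short sketch. The helper names (`bounded_mu`, `bounded_mu_sum`, `cosg`) are illustrative assumptions; `cosg` stands for the complement $1 \dot{-} \mathrm{sg}(x)$ of the signum function.

```python
from math import prod


def monus(x, y):
    return max(x - y, 0)


def sg(x):
    return 1 if x != 0 else 0


def cosg(x):
    return 1 - sg(x)


def bounded_mu(f, bound):
    """Least z < bound with f(z) == 0, or `bound` if there is none."""
    for z in range(bound):
        if f(z) == 0:
            return z
    return bound


def bounded_mu_sum(f, bound):
    """The same value written as a bounded sum of bounded products.

    The inner product is 1 exactly when f(u) != 0 for every u <= v,
    so the sum counts the values v that lie before the first zero of f.
    """
    return sum(prod(sg(f(u)) for u in range(v + 1)) for v in range(bound))


def qt(x, y):
    """Quotient of y by x (0 when x == 0), via the sum-of-products form."""
    return bounded_mu_sum(lambda u: x * cosg(monus(x * (u + 1), y)), y + 1)
```

For example, `qt(3, 7)` evaluates the inner function to a non-zero value for u = 0 and u = 1 and to zero for u = 2, so the sum is 2, which is the integer quotient of 7 by 3.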

We now wonder: if all these functions are elementary, then which primitive recursive functions are

not? To answer that, we present this statement:

Theorem 4.1.1. (see [12]) If f(x1, . . . , xn) is an elementary function, then there is a number k such that

for all x = (x1, . . . , xn),

$$f(x) \leq 2^{2^{\cdot^{\cdot^{\cdot^{2^{\max(x)}}}}}}$$

where the exponentiation is iterated k times.


Corollary 4.1.1. (see [12]) The function

$$f(x) = 2^{2^{\cdot^{\cdot^{\cdot^{2^{x}}}}}}$$

where the exponentiation is iterated x times is primitive recursive but it is not elementary.

We see that, even though the set of elementary functions is not equal to the set of primitive recursive functions, the former contains the functions/operations that are indeed used to express most of the mathematical equations that explain the natural laws. This means that we can reduce the scope of our search to a set that still has everything we need but is much easier to define: E. So, our next step will be to develop a scientist that has the set of elementary functions as its learning environment.

4.2 Notation for representation

Once again, since we already have the set E defined, we need to define a notation that represents its functions. Thus, we have adapted the notion of description of a primitive recursive function so that it represents only elementary functions, and in a simpler way. But first, we need to establish which functions/operators are going to be at the base of the inference rules for this notation. We already know that, following Definition 3.1.2, we can construct any elementary function from the operations $x + 1$, $P_{n,i}$, $x \dot{-} y$, $x + y$ and $xy$ using the composition, bounded sum and bounded product operators. Furthermore, we have proved that a great number of functions are elementary. This means that we can consider these functions to be explicitly defined in the definition of the notation for the elementary functions. By choosing some key functions, we can speed up our search considerably. By analysing the expressions of the functions written in terms of other elementary functions, we see that there is only one that is really worth defining explicitly, since it is written as the composition of a large number of elementary functions/operators: the quotient function for integers. Thus we present the following notation for the elementary functions:

Definition 4.2.1. Description for elementary functions

A description of an elementary function is an expression that is inductively defined as follows:

1. The symbol EZ() is a 0-ary description that describes the constant 0.

2. The symbol ES() is a unary description that describes the successor, i.e., the 1-ary function with

the expression S(x) = x + 1.

3. The symbol EP(n,i), for any n and i such that 1 ≤ i ≤ n is an n-ary description that describes the

projection, i.e., the n-ary function defined by the expression Pn,i(x1, . . . , xn) = xi.

4. The symbol EA() is a 2-ary description that describes the addition operation, that is the function

add(x, y) = x + y.

5. The symbol EM() is a 2-ary description that describes the subtraction operation for the natural numbers, that is the function $\mathrm{natminus}(x, y) = x \dot{-} y = \max\{x - y, 0\}$.


6. The symbol ET() is a 2-ary description that describes the product operation, that is the function

prod(x, y) = xy.

7. The symbol ED() is a 2-ary description that describes the integer division operation, that is the function $\mathrm{qt}(x, y) = \begin{cases} \lfloor y / x \rfloor, & x \neq 0 \\ 0, & x = 0 \end{cases}$.

8. If G is a k-ary description, with k > 0, that describes the function $g$ and if H_1, ..., H_k are n-ary descriptions that describe the functions $h_1, \dots, h_k$ respectively, with n ≥ 0, then EC(G,[H_1, ..., H_k]) is an n-ary description that describes the n-ary function $f$ defined by the expression $f(x_1, \dots, x_n) = g(h_1(x_1, \dots, x_n), \dots, h_k(x_1, \dots, x_n))$. We say that $f$ is obtained from $g$ and $h_1, \dots, h_k$ by composition.

9. If G is an n-ary description, with n > 0, that describes a function $g$, then EBS(G) is an n-ary description that describes the function $f$ defined by the expression $f(x_1, \dots, x_n) = \sum_{z<x_n} g(x_1, \dots, x_{n-1}, z)$. We say that $f$ is a bounded sum obtained from $g$.

10. If G is an n-ary description, with n > 0, that describes a function $g$, then EBP(G) is an n-ary description that describes the function $f$ defined by the expression $f(x_1, \dots, x_n) = \prod_{z<x_n} g(x_1, \dots, x_{n-1}, z)$. We say that $f$ is a bounded product obtained from $g$.

Just like for the description notation for the primitive recursive functions, the size of a description is

given by the number of symbols (EZ, ES, EP, EA, EM, ET, ED, EC, EBS and EBP) that make up the description.

Also, each description corresponds to only one elementary function but the reverse is not true: for a

single function there are infinitely many descriptions. We present some examples for descriptions of

elementary functions based on the rules written above:

Example 4.2.1. Let's start with the exponential function $\exp(x, y) = x^y$. We have already seen that $\exp(x, y) = \prod_{z<y} x$, which can also be written as $\exp(x, y) = \prod_{z<y} P_{2,1}(x, z)$, and so a description for this function is EBP(EP(2,1)).

Example 4.2.2. The next function is the predecessor function pred(x) = x .− 1, which can also be written

as pred(x) = x .− (x + 1.− x). This means we can compose the subtraction for the natural numbers with

the projection P1,1 and the subtraction of the successor with the projection P1,1. In terms of description,

we can write it as EC(EM(),[EP(1,1),EC(EM(),[ES(),EP(1,1)])]).

Example 4.2.3. Let's now analyze the function $\mathrm{sg}(x)$, which we have already seen can be written as $x \dot{-} (x \dot{-} 1)$. Thus this function can be obtained by applying the subtraction for the natural numbers to $x$ and to its predecessor. In terms of description, we compose the description EM() with the description EP(1,1) and the one for the predecessor function. So, we obtain EC(EM(),[EP(1,1),EC(EM(),[EP(1,1),EC(EM(),[ES(),EP(1,1)])])]).

Example 4.2.4. We will now deduce a description for the function $\overline{\mathrm{sg}}(x) = 1 \dot{-} \mathrm{sg}(x)$. We can furthermore write this function as $\overline{\mathrm{sg}}(x) = ((x + 1) \dot{-} x) \dot{-} \mathrm{sg}(x)$. A description for the constant 1 can be written as EC(EM(),[ES(),EP(1,1)]). We already have a description for the $\mathrm{sg}$ function, and so we can obtain the description for the $\overline{\mathrm{sg}}$ function: EC(EM(),[EC(EM(),[ES(),EP(1,1)]),EC(EM(),[EP(1,1),EC(EM(),[EP(1,1),EC(EM(),[ES(),EP(1,1)])])])]).

Example 4.2.5. Now for the distance function. We know we can write it as $\mathrm{dist}(x, y) = (x \dot{-} y) + (y \dot{-} x)$. This means that dist is the composition of the addition function with two subtractions, one of them with swapped arguments. The resulting description is EC(EA(),[EM(),EC(EM(),[EP(2,2),EP(2,1)])]).

Example 4.2.6. The factorial function $x!$ can be written as $\prod_{z<x} (z + 1)$, which means that this function is the bounded product of the successor function. This translates to the description EBP(ES()).

Example 4.2.7. The next one is the minimum function, given by the expression $\min(x, y) = x \dot{-} (x \dot{-} y)$ as we have seen before. This means that this function is the composition of the subtraction for natural numbers with the projection $P_{2,1}$ and with the subtraction for natural numbers of the arguments. Thus, a description for this function is EC(EM(),[EP(2,1),EM()]).

Example 4.2.8. Our last example is the maximum function, which can be written as $\max(x, y) = (x \dot{-} y) + y$ as we have already seen. This means that this function is the addition of the subtraction $x \dot{-} y$ with the second argument, which results in the description EC(EA(),[EM(),EP(2,2)]).

All these conclusions are summarised in Table 4.1.
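The descriptions in these examples can be checked mechanically with a small interpreter for Definition 4.2.1. This is a minimal sketch under stated assumptions: descriptions are encoded as nested Python tuples such as `("EC", G, [H_1, ..., H_k])`, and the names `el_eval`, `P11`, `ONE` and so on are hypothetical. The maximum is written here as the addition of EM() and the second projection, matching $(x \dot{-} y) + y$.

```python
from math import prod


def el_eval(desc, args):
    """Evaluate an elementary-function description on a tuple of naturals."""
    tag = desc[0]
    if tag == "EZ": return 0
    if tag == "ES": return args[0] + 1
    if tag == "EP": return args[desc[2] - 1]           # EP(n, i), 1-indexed
    if tag == "EA": return args[0] + args[1]
    if tag == "EM": return max(args[0] - args[1], 0)
    if tag == "ET": return args[0] * args[1]
    if tag == "ED": return args[1] // args[0] if args[0] != 0 else 0
    if tag == "EC":
        _, g, hs = desc
        return el_eval(g, tuple(el_eval(h, args) for h in hs))
    if tag in ("EBS", "EBP"):                          # bound = last argument
        rest, bound = args[:-1], args[-1]
        values = [el_eval(desc[1], rest + (z,)) for z in range(bound)]
        return sum(values) if tag == "EBS" else prod(values)
    raise ValueError(f"unknown symbol: {tag!r}")


P11 = ("EP", 1, 1)
ONE = ("EC", ("EM",), [("ES",), P11])                  # (x + 1) monus x
PRED = ("EC", ("EM",), [P11, ONE])
SG = ("EC", ("EM",), [P11, PRED])
COSG = ("EC", ("EM",), [ONE, SG])
EXP = ("EBP", ("EP", 2, 1))
DIST = ("EC", ("EA",), [("EM",), ("EC", ("EM",), [("EP", 2, 2), ("EP", 2, 1)])])
FACT = ("EBP", ("ES",))
MIN = ("EC", ("EM",), [("EP", 2, 1), ("EM",)])
MAX = ("EC", ("EA",), [("EM",), ("EP", 2, 2)])         # (x monus y) + y
```

For instance, `el_eval(EXP, (2, 5))` multiplies the first argument five times and returns 32, and `el_eval(FACT, (4,))` returns 24.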

4.3 The search algorithm

Our next step, following the reasoning of Chapter 3, is to define a search algorithm for this set of func-

tions. However, just like in Section 3.3, we will first need to address the enumerability of E .

Proposition 4.3.1. The set of the elementary functions (E ) is recursively enumerable.

Proof. This proof will follow the structure of the one of Proposition 3.3.1. We will construct a correspon-

dence q between the symbols in Definition 4.2.1 and the natural numbers in the following way:

• q(EZ()) = ⟨0⟩

• q(ES()) = ⟨1⟩

• q(EP(n,i)) = ⟨2, n, i⟩

• q(EA()) = ⟨3⟩

• q(EM()) = ⟨4⟩

• q(ET()) = ⟨5⟩

• q(ED()) = ⟨6⟩

• q(EC(G,[H 1,...,H k])) = ⟨7, k, l, q(G), q(H 1), . . . , q(H k)⟩, where l is the arity of the description.


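Table 4.1: Elementary functions and their corresponding descriptions

| Function | Description |
|---|---|
| $\exp(x, y) = x^y = \prod_{z<y} x$ | EBP(EP(2,1)) |
| $\mathrm{pred}(x) = x \dot{-} 1 = x \dot{-} ((x + 1) \dot{-} x)$ | EC(EM(),[EP(1,1),EC(EM(),[ES(),EP(1,1)])]) |
| $\mathrm{sg}(x) = \begin{cases} 0, & x = 0 \\ 1, & x \neq 0 \end{cases} = x \dot{-} (x \dot{-} 1)$ | EC(EM(),[EP(1,1),EC(EM(),[EP(1,1),EC(EM(),[ES(),EP(1,1)])])]) |
| $\overline{\mathrm{sg}}(x) = 1 \dot{-} \mathrm{sg}(x) = ((x + 1) \dot{-} x) \dot{-} \mathrm{sg}(x)$ | EC(EM(),[EC(EM(),[ES(),EP(1,1)]),EC(EM(),[EP(1,1),EC(EM(),[EP(1,1),EC(EM(),[ES(),EP(1,1)])])])]) |
| $\mathrm{dist}(x, y) = \lvert x - y \rvert = (x \dot{-} y) + (y \dot{-} x)$ | EC(EA(),[EM(),EC(EM(),[EP(2,2),EP(2,1)])]) |
| $\mathrm{fact}(x) = x! = \prod_{z<x} (z + 1)$ | EBP(ES()) |
| $\min(x, y) = x \dot{-} (x \dot{-} y)$ | EC(EM(),[EP(2,1),EM()]) |
| $\max(x, y) = (x \dot{-} y) + y$ | EC(EA(),[EM(),EP(2,2)]) |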


• q(EBS(G)) = ⟨8, l, q(G)⟩, where l is the arity of the description.

• q(EBP(G)) = ⟨9, l, q(G)⟩, where l is the arity of the description.

By using the encoding function τ from Proposition 3.3.1 (see [12]), we can conclude that each tuple is encoded into a different natural number. If we consider the inverse correspondence of q, from the naturals to the descriptions of the elementary functions, and assign the description EZ() to every number that is not in the range of q, we obtain an enumeration of E, and so E is recursively enumerable.

Proposition 4.3.2. E ∈ Ex.

Proof. To perform this proof we only need to present a scientist that Ex-identifies this set of functions (Algorithm 11, adapted from Algorithm 5). Let π : N → E be an enumeration of the elementary functions, let πi denote the function obtained from π(i), and let ψ be the elementary function explained by a text in the canonical form with prefix σ.

Algorithm 11 Scientist that Ex-identifies E

function M(σ : SEG) : N
    i := 0; k := 0; n := length of σ
    while k < n do
        if πi(k) = ψ(k) then k := k + 1
        else
            k := 0; i := i + 1;
        end if
    end while
    return i
end function

This algorithm searches for the least i ∈ N such that the output of πi applied to every k < n has the

same value obtainable from σ for ψ(k). Since π is an enumeration for the set E , then this algorithm

will be exhaustive in the set of elementary functions and since σ is a prefix of a text that explains an

elementary function, this algorithm will eventually halt.
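The identification-by-enumeration loop of Algorithm 11 can be sketched in a few lines. In this sketch the prefix σ is represented simply as the list of values ψ(0), ..., ψ(n − 1), and the enumeration π is passed in as a callable; both representations, as well as the toy enumeration used below, are assumptions made for illustration.

```python
def scientist(sigma, enumeration):
    """Return the least i such that enumeration(i)(k) == sigma[k] for all k < len(sigma).

    `sigma` plays the role of the prefix: sigma[k] is the observed value psi(k).
    `enumeration(i)` must return the i-th candidate function.
    """
    i, k, n = 0, 0, len(sigma)
    while k < n:
        if enumeration(i)(k) == sigma[k]:
            k += 1            # candidate i agrees so far: check the next point
        else:
            k, i = 0, i + 1   # disagreement: restart with the next candidate
    return i


# A toy enumeration with four candidates (illustrative only).
CANDIDATES = [lambda k: 0, lambda k: k + 1, lambda k: 2 * k, lambda k: k * k]


def toy_enumeration(i):
    return CANDIDATES[i]
```

With this toy enumeration, `scientist([0, 1, 4], toy_enumeration)` rejects the first three candidates and returns 3, the index of the squaring function. As in Algorithm 11, the loop only halts if some candidate in the enumeration agrees with the whole prefix.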

The implementation of the search algorithm will be based on Algorithm 8. We will assume the existence of a function ElFunctions (which will be defined in Section 4.4) that, given a size and an arity for the functions, outputs a list of every possible description with those values for size and arity, using the rules in Definition 4.2.1. This results in Algorithm 12, which follows the same reasoning as the search algorithm in Algorithm 8: for an arity known a priori we list all the descriptions with size 1 and check whether there is a function in that list that, when applied to every value given in the inp list, returns the corresponding values in the out list. If there is no such function with a description of size 1, then we proceed to check the list of descriptions with size 2, then the one with descriptions with size 3, and so on until we find a description that correctly explains the values given in the input lists inp and out.

We now have an algorithm that is dependent on the implementation of function ElFunctions to be

functional, and so our next step is to implement that function.
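The size-by-size search just described can be sketched independently of how the descriptions are enumerated or evaluated. In the sketch below, `candidates(size, arity)` stands in for ElFunctions and `evaluate(f, x)` for running a description on an input tuple; both are passed as parameters, and the optional `max_size` cut-off and the toy family used in the usage example are assumptions, not part of Algorithm 12.

```python
def search(inp, out, candidates, evaluate, max_size=None):
    """Return the first description, by increasing size, that maps inp[j] to out[j].

    inp: list of input tuples; out: list of expected outputs.
    candidates(size, arity): iterable of descriptions (stand-in for ElFunctions).
    evaluate(f, x): value of description f on the input tuple x.
    """
    arity = len(inp[0])
    size = 1
    while max_size is None or size <= max_size:
        for f in candidates(size, arity):
            if all(evaluate(f, x) == y for x, y in zip(inp, out)):
                return f
        size += 1
    return None               # only reachable when max_size is given


# Toy stand-ins: the only "description" of size s adds s - 1 to the first input.
def toy_candidates(size, arity):
    yield ("plus", size - 1)


def toy_evaluate(f, x):
    return x[0] + f[1]
```

Here `search([(0,), (2,)], [3, 5], toy_candidates, toy_evaluate)` tries sizes 1, 2 and 3 without success and returns `("plus", 3)` at size 4. Without `max_size` the loop, like Algorithm 12, runs until a consistent description is found.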


Algorithm 12 Search algorithm for an elementary function given the input/output values, taking into account the arity

Input: inp as the list of tuples with the input values; out as the list of integers with the corresponding output values

Output: a description of an elementary function

procedure SEARCH(inp, out)
    i := 1
    a := length(inp[0])
    for i ∈ N do:
        for f in ElFunctions(i, a) do
            t := True
            j := 1
            for j < length of inp do
                if f(inp[j]) ≠ out[j] then
                    t := False
                    break
                end if
            end for
            if t is True then
                res := f
                break
            end if
        end for
        if t is True then
            break
        end if
    end for
    return res
end procedure

4.4 Enumeration

In Algorithm 13 we see an implementation for ElFunctions and for the necessary auxiliary functions,

which was mainly based on Algorithm 9 with the necessary changes made:

• Besides yielding the descriptions for the zero constant, the successor and the projections with

given arity, for arity 2 the algorithm also yields the descriptions of the addition, the subtraction for

natural numbers, the product and the integer division operations.

• It no longer needs to have a constructor function regarding the recursion operator.

• Functions ElFunctions With Maxsize and ElCompositions are identical and work the same way

as Functions With Maxsize and Compositions in Algorithm 9, respectively.

• ElComposition Function Lists works identically to Composition Function Lists in Algorithm 9. However, in this case we do not apply the restriction that forbids duplicated descriptions in the yielded lists, since here it is more useful to allow such repetitions (for example, to express a description for the square function $f(x) = x^2$, by allowing duplicate descriptions in the previously mentioned lists we can have a description for this function such as EC(ET(),[EP(1,1),EP(1,1)]); if we didn't allow repetitions to occur, a description for f would be


Algorithm 13 Construction of a function list composed of elementary functions with a given description size and arity

function ELFUNCTIONS(size, ar)
    if size ≤ 0 then pass
    else if size == 1 then
        if ar == 0 then yield EZ();
        else if ar == 1 then yield ES()
        else if ar == 2 then yield EA(); yield EM(); yield ET(); yield ED();
        end if
        for i := 1 to ar do yield EP(ar, i)
        end for
    else
        for composition in ElCompositions(size, ar) do yield composition
        end for
        for boundedsum in ElBoundedSums(size, ar) do yield boundedsum
        end for
        for boundedprod in ElBoundedProds(size, ar) do yield boundedprod
        end for
    end if
end function

function ELFUNCTIONS WITH MAXSIZE(size, ar)
    for subsize := 1 to size do
        for function in ElFunctions(subsize, ar) do yield function, size − subsize
        end for
    end for
end function

function ELCOMPOSITION FUNCTION LISTS(length, size, ar)
    if length = 0 then
        yield []
    else
        for function, remaining size in ElFunctions With Maxsize(size, ar) do
            for sublist in ElComposition Function Lists(length − 1, remaining size, ar) do
                yield [function] + sublist
            end for
        end for
    end if
end function

function ELCOMPOSITIONS(size, ar)
    for i := 1 to size − 2 do
        for j := 1 to size − 2 do
            for function list in ElComposition Function Lists(i, j, ar) do
                for g in ElFunctions(size − j − 1, i) do yield EC(g, function list)
                end for
            end for
        end for
    end for
end function

function ELBOUNDEDSUMS(size, ar)
    if ar == 0 then pass
    else
        for f in ElFunctions(size − 1, ar) do yield EBS(f)
        end for
    end if
end function

function ELBOUNDEDPRODS(size, ar)
    if ar == 0 then pass
    else
        for f in ElFunctions(size − 1, ar) do yield EBP(f)
        end for
    end if
end function

a lot longer).

• ElBoundedSums and ElBoundedProds are two new functions that work in a similar way: they will first

check if the arity is not null and, if it isn’t, for any function f with description F in the list of functions

with the same arity but with size one unit smaller ElBoundedSums will yield the description EBS(F)

and ElBoundedProds will yield the description EBP(F).
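The enumeration can be sketched with Python generators. This is a simplified variant written for illustration and not a line-by-line transcription of Algorithm 13: it splits the size budget exactly between g and the list of functions, so every yielded description has exactly the requested size. The tuple encoding of descriptions and the names `el_functions`, `el_function_lists` and `el_compositions` are assumptions.

```python
def el_functions(size, ar):
    """Yield every description with exactly this size and arity."""
    if size <= 0:
        return
    if size == 1:
        if ar == 0:
            yield ("EZ",)
        elif ar == 1:
            yield ("ES",)
        elif ar == 2:
            for symbol in ("EA", "EM", "ET", "ED"):
                yield (symbol,)
        for i in range(1, ar + 1):
            yield ("EP", ar, i)
        return
    yield from el_compositions(size, ar)
    if ar > 0:  # bounded sum and product need at least one argument
        for symbol in ("EBS", "EBP"):
            for g in el_functions(size - 1, ar):
                yield (symbol, g)


def el_function_lists(length, total, ar):
    """Yield tuples of `length` descriptions of arity `ar` whose sizes sum to `total`."""
    if length == 0:
        if total == 0:
            yield ()
        return
    # The first entry uses `first` symbols; every remaining entry needs at least one.
    for first in range(1, total - (length - 1) + 1):
        for head in el_functions(first, ar):
            for tail in el_function_lists(length - 1, total - first, ar):
                yield (head,) + tail  # duplicates inside the list are allowed


def el_compositions(size, ar):
    """Yield EC(g, [h_1, ..., h_k]) with 1 + size(g) + sum of size(h_i) == size."""
    for k in range(1, size - 1):          # list length: 1 .. size - 2
        for j in range(k, size - 1):      # total size of the list: k .. size - 2
            for hs in el_function_lists(k, j, ar):
                for g in el_functions(size - j - 1, k):
                    yield ("EC", g, hs)
```

With this sketch, size 1 and arity 2 gives six descriptions (the four binary operations and two projections), size 2 and arity 1 gives the four descriptions built with EBS and EBP, and the description of the square function with a repeated projection appears among those of size 4 and arity 1.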


In Figure 4.1 we have as an example the descriptions with size and arity between 1 and 3. By comparison with the lists in Figures 3.1 and 3.2 we see that there are now more descriptions with smaller size. This happens because we have more symbols defined a priori, which gives rise to a greater number of possible combinations for small size values; in fact, due to the symbols EBS() and EBP() we are now able to have descriptions with size 2, which were nonexistent in the previous two enumerations. This, along with the fact that there are now more operations identified with symbols defined a priori, makes us expect that with this learning paradigm we will be able to reach a conjecture much faster than with the ones presented in Chapter 3.

4.5 From description to code

Since our definition of description for elementary functions (Definition 4.2.1) is different from the one presented in Definition 3.2.1, the reasoning used in Section 3.6 to obtain the code of a program that computes the function defined by a description cannot be applied directly to these descriptions, and so we have to make the necessary changes. Since the set E is contained in the set PRIM (see [12]; already discussed in Chapter 3) and we know that for every primitive recursive function there exists a program written only with a sequence of assignments, if-clauses and sequential and nested for-loops (Theorem 3.1.1 and [30]), it is also possible to obtain a program with these characteristics for the elementary functions (once again, these programs will be written in Python). We will thus develop a program that transforms a description for elementary functions, as defined in Definition 4.2.1, into a Python program. In its structure, this program will be very similar to the one described in


(a) Size 1 and arity from 1 to 3 (b) Size 2 and arity from 1 to 3

(c) Size 3 and arity 1 and 3 (d) Size 3 and arity 2

Figure 4.1: Lists of descriptions for elementary functions with the referred size and arities.

Section 3.6: it uses an auxiliary list of variables in order to make possible the attributions needed for the

program to function correctly. These variables will once again always be tuples or integers.

Much like the program in Section 3.6, this one will write in a text file, beginning by writing in the first

two lines “def function(x):” and “a0 = x”, with the correct indentation. Throughout the program the

auxiliary list of variables will be updated, to allow the use of the correct variable every time, just like the

line breaks and the indentation. We now arrive at the differences between the two programs: the

execution depending on the symbol it reads. The rules will be the following:

• if the symbol is EZ(), then it will attribute to the correct variable the value 0.


• if the symbol is ES(), then it will be attributed to a new variable the value of the successor of the

adequate variable after making sure that the variable is an integer and not a tuple; if it is not an

integer then it is a tuple of only one element (due to the arities of the operations in question) and

then we say that the new variable is the successor of the first element of the tuple.

• if the symbol is EP(n,i), then the value of a new variable will be the i−th element of the appropriate

variable after making sure that that variable is a tuple; if it is not, then it is an integer and so, before

performing the projection, we need to transform the integer variable into a tuple variable and only

then perform the projection.

• if the symbol is EA(), then the value of the new variable will be the addition of the two elements of

the pair that compose the previous variable. In this case we don’t have to worry about the type of

the variable since the description will have arity 2 and so the variable in question will necessarily be

a tuple (a pair, to be precise).

• if the symbol is ET(), then the value of the new variable will be the product of the two elements of

the pair that compose the previous variable. Once again and for the same reasons as for EA() we

don’t have to worry about the type of the variable.

• if the symbol is EM(), then the value of the new variable will be the subtraction of the two elements

of the pair that compose the previous variable. Once again, there is no need to worry about the

type of the variable.

• if the symbol is ED(), then the value of the new variable will be the quotient of the second element

of the pair over the first, if the first is not 0; if it is, then it will attribute to the new variable the value

0 (this is done in order for it to be a total function).

• if the symbol is EC(G,[H_1, ..., H_k]) then the program will write the correct code for computing

the outputs of the functions described by H_1, ..., H_k, assign those values to the correct variables, create a

tuple variable composed of the outputs of those k functions and then assign to

the correct variable the value computed by applying function G to that tuple of values.

• if the symbol is EBS(G) the program will first make sure that the input variable is a tuple (if it is not, it

will turn it into one), assign to a new variable a tuple with the values of the input variable except

the last one, and then give a new variable the value 0, which will be updated throughout the

for-loop written afterwards (meaning that when the last element of the input tuple is 0 and

the for-loop does not execute, the returned value will be 0). This for-loop will be executed the

same number of times as the value of the last element of the input variable and it will execute the

following commands: in the i-th execution, a new tuple variable is created that has the

same values as the input variable but with the last one replaced by i; then it executes the

code corresponding to description G and updates the variable declared before the for-loop began

(which started with the value 0) by adding to its current value the result of applying

the function defined by G; in the end, it returns the value of this updated variable.


• if the symbol is EBP(G) the program will act very similarly to when the symbol is EBS(G): the only

changes are that the variable that has initially value 0 in the previous case is initialized with the

value 1 (which will make the result 1 when the for-loop is not performed, i.e. when the last element

of the input tuple is 0) and that this value is updated by multiplying it by the value obtained

from the code corresponding to description G instead of adding that value to it.

In the end, this program will write a line of code that allows the correct variable to be returned.
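The rules for the symbols that are new in this chapter can be summarised by the functions they compute. The following is a minimal sketch of that semantics, not the program that writes the code: the helper names ea, et, em, ed, es, ebs and ebp are hypothetical, and we assume that the loop index of the bounded sum and of the bounded product runs from 0 up to the bound minus one.

```python
# Hypothetical helper names; this sketches what the generated code computes,
# it is not the generator described in the text.

def ea(pair):
    """EA(): sum of the two elements of the pair."""
    return pair[0] + pair[1]

def et(pair):
    """ET(): product of the two elements of the pair."""
    return pair[0] * pair[1]

def em(pair):
    """EM(): first minus second, or 0 when the first is not greater."""
    return pair[0] - pair[1] if pair[0] > pair[1] else 0

def ed(pair):
    """ED(): integer quotient of the second by the first, or 0 if the first is 0."""
    return 0 if pair[0] == 0 else pair[1] // pair[0]

def es(x):
    """ES(): successor; a one-element tuple is unpacked first."""
    return (x[0] if isinstance(x, tuple) else x) + 1

def ebs(g):
    """EBS(G): sum of g over the last argument (assumed to start at i = 0)."""
    def bounded_sum(x):
        if not isinstance(x, tuple):
            x = (x,)
        prefix, bound = x[:-1], x[-1]
        total = 0  # value returned when the loop does not execute
        for i in range(bound):
            total += g(prefix + (i,))
        return total
    return bounded_sum

def ebp(g):
    """EBP(G): same loop as EBS(G), starting from 1 and multiplying."""
    def bounded_product(x):
        if not isinstance(x, tuple):
            x = (x,)
        prefix, bound = x[:-1], x[-1]
        total = 1  # value returned when the loop does not execute
        for i in range(bound):
            total *= g(prefix + (i,))
        return total
    return bounded_product

print(ebs(es)((3,)), ebp(es)((3,)))
```

Representing EBS(G) and EBP(G) as higher-order functions mirrors the fact that both symbols take another description as argument.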

It is easily seen that, for the symbols EZ(), ES(), EP(n,i) and

EC(G,[H_1,...,H_k]), this program works in the same way as the program in Section 3.6 does for Z(), S(), P(n,i)

and C(G,[H_1,...,H_k]), respectively. We will now demonstrate how this program works

for the other symbols. Just like in Section 3.6, these descriptions are simple ones; their only purpose is to

show clearly how the translation from description to code works.

Example 4.5.1. The first example will be the one with the description EA(). We see the respective code

in Figure 4.2.

Figure 4.2: Code for function with description EA()

In lines 1 and 2 we see the standard beginning of every program obtained this way. In line 3, the

addition is performed and in line 4 the command that returns the appropriate variable is written.

Example 4.5.2. Next, in Figure 4.3 we see what happens when the description is EM().

Figure 4.3: Code for function with description EM()

After the two common lines of code, line 3 checks whether the first element of the input pair is greater

than the second one and, if it is, performs the subtraction. Line 4 contains the

instruction for what to do if the guard in the previous line fails: it assigns to the new variable the

value 0. In line 5 we have the return of the appropriate variable.
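Figure 4.3 is not reproduced in this text. A reconstruction consistent with the line-by-line description above might look as follows; only the first two lines are quoted from the text, while the variable name a1 and the one-line form of the if and else clauses are assumptions.

```python
def function(x):  # reconstruction from the prose, not the original figure
    a0 = x
    if a0[0] > a0[1]: a1 = a0[0] - a0[1]
    else: a1 = 0
    return a1

print(function((5, 2)))
print(function((2, 5)))
```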

Example 4.5.3. The next example is visible in Figure 4.4 and is related to description ET().

Figure 4.4: Code for function with description ET()

It is a simple one: in line 3 we see the product operation being applied to the two elements of the

input pair and in line 4 the return of that result.


Figure 4.5: Code for function with description ED()

Example 4.5.4. We will now proceed to analyze in Figure 4.5 what happens for description ED().

Line 3 checks whether the first element of the input pair is 0 and states what

to do if it is: assign the value 0 to the new variable. Line 4 covers the case in which the check

fails: we simply perform the integer division (represented by //) of the last element of the pair by the

first. In line 5 we have the return of this value.

Example 4.5.5. Our next example will be the one in Figure 4.6 regarding description EBS(ES()).

Figure 4.6: Code for function with description EBS(ES())

In line 3 we see the code that makes sure that the input variable is a tuple. Next, in line 4, a new tuple

is created with the same values as the previous one except the last one. A new variable a2

with value 0 is created in line 5 and then in line 6 we define the for-loop. This loop will be executed the

same number of times as the last element of the input variable and will perform the following instructions:

in line 7 a variable is created, composed of the elements of the input tuple with the last one replaced

by i (which is the number of the iteration of the loop). That variable will be the input for the code in lines

8 and 9 corresponding to the function with description ES(), and then the variable initialized as 0 before the

for-loop began is updated in line 10 by adding the result of the previous lines of code to the

value it already had. Once again, at the end we see the return instruction for the correct variable.
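Figure 4.6 is not reproduced in this text. The sketch below is a reconstruction that follows the description above, laid out so that lines 8 and 9 hold the code for ES() and line 10 the update of a2. Apart from a0 and a2, the variable names are assumptions, as is the choice of letting i start at 0.

```python
def function(x):  # reconstruction from the prose, not the original figure
    a0 = x
    if not isinstance(a0, tuple): a0 = (a0,)
    a1 = a0[:-1]
    a2 = 0
    for i in range(a0[-1]):
        a3 = a1 + (i,)
        if isinstance(a3, int): a4 = a3 + 1
        else: a4 = a3[0] + 1
        a2 = a2 + a4
    return a2

print(function((3,)))
```

Under these assumptions the program adds the successors of 0, 1, ..., up to the input minus one, and returns 0 when the input is 0 because the loop body never runs.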

Example 4.5.6. Lastly we have what happens for description EBP(ES()) in Figure 4.7.

Figure 4.7: Code for function with description EBP(ES())

By comparing this example with Example 4.5.5 we see that what happens is extremely similar, except

in line 5, where the variable is initialized as 1, and in line 10, where the variable is updated by taking

the product of the values of the variables instead of their sum.


Chapter 5

Results

We will now put our algorithms to the test. To do so we used a computer running

Windows 10 with an Intel(R) Core(TM) i5-4210U CPU @ 1.70 GHz 2.40 GHz processor and Python

version 3.7.0 installed. Appendix B explains in which files the algorithms are implemented and

where they can be found.

In the second (Section 5.2) and third (Section 5.3) algorithms something peculiar happens: we

noticed that sometimes our algorithms got stuck on some descriptions, i.e. they took a long time to

obtain the result of applying the current description to some of the input values provided. To understand what

was happening, we added a line of code that printed the current description next to its size. However,

this operation slowed the computation of the algorithms considerably and thus we only enabled it for

provisional results; every computational time presented was obtained with this functionality disabled.

Our methodology will be the following: we run the algorithm for inp and out values that explain the

functions whose descriptions we already know (Table 3.1). Then we will explore the algorithms beyond

these functions to see their limitations. Furthermore, we start by providing only one element in each of

the inp and out lists and, if the scientist returns a description that does not match the one we want, we

will gradually add information to both lists until we find the intended description.
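This methodology can be illustrated with a toy example. The sketch below is not one of the search algorithms of this work: a short, hand-picked list of candidate binary functions (an assumption made only for illustration) stands in for the enumeration of descriptions, and the names toy_scientist and learn are hypothetical.

```python
# Toy illustration only: a hypothetical, hand-picked enumeration of candidates
# replaces the enumeration of descriptions used by the real algorithms.
CANDIDATES = [
    ("second argument", lambda x, y: y),
    ("second argument plus two", lambda x, y: y + 2),
    ("addition", lambda x, y: x + y),
]

def toy_scientist(inp, out):
    """Return the first candidate that explains every (input, output) pair."""
    for name, f in CANDIDATES:
        if all(f(*args) == value for args, value in zip(inp, out)):
            return name
    return None

def learn(samples, target, intended):
    """Add one data point at a time until the intended candidate is returned."""
    inp, out = [], []
    for args in samples:
        inp.append(args)
        out.append(target(*args))
        guess = toy_scientist(inp, out)
        print(inp, out, "->", guess)
        if guess == intended:
            return len(inp)  # number of data points that were needed
    return None

needed = learn([(2, 3), (1, 5)], lambda x, y: x + y, "addition")
```

With a single data point the toy scientist settles on an earlier candidate that happens to explain it; the second data point rules that candidate out.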

Before seeing the scientists in action we need to present one definition that will help us

understand and explain our results: the notion of locking sequence.

Definition 5.0.1. (see [32] and [11]) Locking Sequence

Let ψ ∈ R, M a scientist and σ ∈ SEG. We say that σ is a locking sequence for M on ψ if (a)

content(σ) ⊂ ψ, (b) φ_{M(σ)} = ψ and (c) for all τ ∈ SEG, if content(τ) ⊂ ψ then M(σ τ) = M(σ).

This concept is important because from the moment a scientist finds a locking sequence σ, it

converges immediately on all texts for ψ having σ as prefix, and thus if a short locking sequence is found,

then the scientist will converge very fast and will do so for many different inputs.
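Condition (c) quantifies over all finite sequences, so it cannot be verified by testing. The sketch below only performs a bounded check over extensions drawn from a finite pool of inputs, which can refute that a sequence is locking but never prove it. The toy scientist, the pool and the name looks_locking are all hypothetical, and condition (b) is not checked.

```python
from itertools import product

def toy_scientist(data):
    """First hypothesis, in a fixed hypothetical order, consistent with the data."""
    hypotheses = [
        ("y + 2", lambda x, y: y + 2),
        ("x + y", lambda x, y: x + y),
    ]
    for name, f in hypotheses:
        if all(f(*args) == value for args, value in data):
            return name
    return None

def looks_locking(scientist, psi, sigma, pool, max_len=2):
    """Bounded check of condition (c): extending sigma with up to max_len
    data points of psi taken from pool must not change the conjecture."""
    base = scientist(sigma)
    for n in range(1, max_len + 1):
        for tau in product(pool, repeat=n):
            extension = sigma + [(args, psi(*args)) for args in tau]
            if scientist(extension) != base:
                return False
    return True

def add(x, y):
    return x + y

pool = [(0, 0), (1, 5), (4, 1), (9, 7)]
print(looks_locking(toy_scientist, add, [((2, 3), 5)], pool))
print(looks_locking(toy_scientist, add, [((2, 3), 5), ((1, 5), 6)], pool))
```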

Let us now proceed to the presentation of results.


5.1 First algorithm

In this section we will analyze the implementation of the search procedure in Algorithm 6 that uses the

enumeration described in Algorithm 7.

Experiment 5.1.1. Our first function is id ∶ N → N defined by the expression id(x) = x. The goal is that

our scientist returns the description P(1,1). In Figure 5.1 we can see that by providing as inp list the list

of tuples [(1, )] and as out list the list of integers [1], the scientist finds the description P(1,1), which we

know corresponds to the identity function, in 0.003996 seconds. This means we only needed a prefix

with length one to find a locking sequence for this scientist on this function. We tested this result for

input values (3, ), (4, ) and (5, ) and, as expected, the results were 3, 4 and 5.

(a) Lists of inp and out (b) Description found and the time it took in the bottom

(c) Code obtained

Figure 5.1: Results for the identity function searched by the first scientist

Experiment 5.1.2. The next experiment relied on observing the behavior of the algorithm regarding the

function s ∶ N → N defined as s(x) = x + 1. By providing as inp list [(1, )] and as out list [2] the algorithm

returned the description S() (which obviously describes the function at hand) in 0.000996 seconds. To

test the outputted description/code we provided as input (2, ), (3, ) and (4, ) which resulted in outputs 3,

4 and 5, as anticipated.

Experiment 5.1.3. The function pred ∶ N → N defined by the expression pred(x) = x ∸ 1 is the next

one. We provide the algorithm the lists [(1, )] (inp list) and [0] (out list) and observe the outcome: the

procedure terminates in approximately 0.004997 seconds and returns a description for the predecessor

function, R(Z(),P(2,1)). We provided as input the values (0, ), (2, ) and (7, ) to perform the test which

resulted in the outputs 0, 1 and 6, as we thought it would.

Experiment 5.1.4. Now we try to determine whether the algorithm can identify the unary zero function. By the

previous experiment, we know that we should not provide the same inp and out lists because that way

the scientist will not return the description of the zero function but the description of the predecessor

function. Thus, by providing as inp list [(2, )] and as out list [0] the procedure returned the description

R(Z(),P(2,2)) which, as Table 3.1 tells us, is a description of the function zero ∶ N → N defined as

zero(x) = 0. This computation terminated in approximately 0.003999 seconds. We experimented with

values (7, ) and (14, ), which resulted in the output values 0 and 0, as we hoped.

Experiment 5.1.5. Let us now see what happens with the function sx ∶ N2 → N defined as sx(x, y) = x + 1.

Providing as inp list the list [(2,4)] and as out list the list [3], we can see (Figure 5.2) that the scientist


finds an appropriate description associated with this function: C(S(),P(2,1)). This computation took

approximately 0.001956 seconds. By testing with other input values ((3,5) and (7,2), which outputted

4 and 8, respectively) we have stronger reasons to believe that the description is adequate.


(c) Code obtained

Figure 5.2: Results for the successor function after the projection of the first argument searched by the first scientist

Experiment 5.1.6. We proceeded to analyze the results for the function add ∶ N2 → N defined by the

expression add(x, y) = x+y. By providing the algorithm with [(2,3)] as inp list and [5] as out list, we see

that the algorithm terminates in 0.004942 seconds and returns the description C(S(),C(S(),P(2,2)))

which corresponds to the function defined as the successor of the successor of the second argument,

which was not the outcome we intended (as shown by the fact that the test made with input (9,7) resulted

in output 9 instead of 16). Thus there was a need to provide more information, which was done

by extending the inp and out lists with other data points. This means that, for the first time, the infor-

mation we provided to the scientist at the first attempt was not a locking sequence for the scientist on

the function at hand. Our next attempt at this function was made with the lists [(2,3), (1,5)] and [5,6]

as inp list and out list, respectively. This computation already terminates in the expected description

R(P(1,1),C(S(),P(3,3))) (see Table 3.1), taking 0.012939 seconds to do so (Figure 5.3). By testing

the input pair (5,8), which resulted in output 13, we obtain more evidence that the description is an ade-

quate one. Furthermore, we wanted to see the behaviour of the scientist with an inp list with more pairs

and/or with elements of greater value. Thus, we first gave the scientist inp list [(1,4), (2,1), (3,2), (0,6)]

and out list [5,3,5,6], followed by providing as inp list the list [(13,24), (35,41), (133,256), (420,513)]

and as out list the list [37,76,389,933]. The description returned in both cases was the same as in the

last attempt; the first needed 0.015591 seconds to terminate while the second took 0.067953 seconds

to do so.
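The descriptions returned by the scientist can also be checked mechanically. The sketch below is a small evaluator for descriptions built from Z(), S(), P(n,i), C and R. The tuple encoding and the function names are hypothetical, and the recursion convention (recursion on the last argument, with the step function receiving the fixed arguments, the loop counter and the previous value) is an assumption inferred from the descriptions quoted in this section, not taken from the implementation referred to in Appendix B.

```python
# Hypothetical encoding of descriptions as nested tuples.
def Z(): return ("Z",)
def S(): return ("S",)
def P(n, i): return ("P", n, i)
def C(g, hs): return ("C", g, hs)
def R(g, h): return ("R", g, h)

def evaluate(d, args):
    """Evaluate description d on the tuple args (assumed convention, see above)."""
    tag = d[0]
    if tag == "Z":
        return 0
    if tag == "S":
        return args[0] + 1
    if tag == "P":
        return args[d[2] - 1]  # i-th element, counting from 1
    if tag == "C":
        inner = tuple(evaluate(h, args) for h in d[2])
        return evaluate(d[1], inner)
    if tag == "R":
        *fixed, bound = args
        acc = evaluate(d[1], tuple(fixed))            # base case
        for i in range(bound):
            acc = evaluate(d[2], (*fixed, i, acc))    # step case
        return acc
    raise ValueError("unknown symbol: " + str(tag))

pred = R(Z(), P(2, 1))
add = R(P(1, 1), C(S(), [P(3, 3)]))
first_guess = C(S(), [C(S(), [P(2, 2)])])

print(evaluate(pred, (7,)))
print(evaluate(add, (5, 8)))
print(evaluate(first_guess, (9, 7)))
```

Under the stated convention, the three calls reproduce the test values reported in Experiments 5.1.3 and 5.1.6.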

Experiment 5.1.7. Next we experiment with the function sub ∶ N2 → N defined as sub(x, y) = x ∸ y.

By firstly providing as inp and out lists the lists [(5,2)] and [3], respectively, the procedure returns the

description C(S(),[P(2,2)]) in 0.001001 seconds, which corresponds to the function that calculates

the successor of the second argument. In fact, this function also explains correctly the information provided

to the algorithm but if we test for other values it is not what we intended (for example for input

(7,2) it outputs 3 instead of 5) and so, like we did in the previous example, we will need to increase

the data given to the algorithm. Now we want to expand our inp and out lists so that the scientist

understands that this description is not the one we want. This means that by adding to our inp list



(c) Code obtained

Figure 5.3: Results for the second attempt for the addition function searched by the first scientist

the tuple (2,0) and to our out list the element 2, we observe that the scientist returns the description

R(P(1,1),C(S(),[C(S(),[P(3,2)])])) (Figure 5.4), which still is not an adequate description for this

function, as shown by the tests performed: for input tuple (6,4) instead of returning the result 2 it outputted

the value 5. We then proceeded to a third attempt to find a desired description by providing the sci-

entist the list [(5,2), (2,0), (4,1)] as inp list and the list [3,2,3] as out list. This time the result we got

was a good one, with the scientist returning the description R(P(1,1),C(R(Z(),P(2,1)),[P(3,3)]))

in 0.432695 seconds, which is the description deduced in Example 3.2.4. After this we tried again with

only two points in order to see if it was possible to find a locking sequence with length two, opposed to

the previous one which had length three. To do so, this time our inp and out lists were [(5,2), (4,1)]

and [3,3] respectively. We then observe that the scientist found the intended description in 0.495662

seconds, as we can see in Figure 5.5. Furthermore, we performed another attempt with more input pairs

and with greater elements, so we provided inp list [(34,12), (25,40), (151,72), (627,728)] and

out list [22,0,79,0]. This resulted in the scientist returning the same description as in the previous

attempt, which took it 4.57001 seconds to find.

(a) Lists of inp and out (b) Description found and the time it took in the bottom

Figure 5.4: Results for the second attempt for the subtraction function searched by the first scientist

Experiment 5.1.8. We then tried to see if the scientist was able to find the description of the prod-

uct function prod ∶ N2 → N defined as prod(x, y) = x × y. Our first attempt was to see if it was

possible with the inp list [(2,3)] and the out list [6]. After 0.014799 seconds, the scientist returned


(a) Lists of inp and out (b) Description found and the time it took in the bottom

(c) Code obtained

Figure 5.5: Results for the repetition of the second attempt for the subtraction function searched by the first scientist

the description R(S(),C(S(),[P(3,3)])), which, besides not being the intended description (see Table 3.1),

does not output the correct value when tested for input (6,4), since it outputs 11 instead of

24, leading us to conclude that this description is not adequate to describe this function. We then enlarged

the lists provided to the scientist and gave as inp list [(2,3), (5,2)] and as out list [6,10].

The scientist took 8495.14 seconds (approximately 2 hours and 20 minutes) to find the description

R(R(Z(),P(2,2)),C(R(P(1,1),C(S(),[P(3,3)])),[P(3,1),P(3,3)])), which is an appropriate one

to describe the product function since it is equal to the one deduced in Example 3.2.5. This is corrobo-

rated by performing the test to the input values (5,6) and (7,3) (which returned 30 and 21 respectively).

Remark: The computational time for this experiment should not be a cause for concern; the algorithm used in Section

5.2 will present computational times much more adequate for this search (see Experiment 5.2.3) and

the one in Section 5.3 already has a description for the product function defined a priori.

Due to the results of the last experiment, it is obvious that this algorithm is not very efficient, since

it took almost two and a half hours to find the description of a function as simple as the product. Thus,

since the description for the distance function is much bigger than the one for the product we will not

proceed to experiment with that function. We will now perform some experiments for functions for which

we do not know descriptions a priori.

Experiment 5.1.9. An experiment was made to see if the scientist would find a description for values

obtained through the function f ∶ N→ N defined as f(x) = 2x. We first provided [(0, )] and [0] as inp and

out lists, respectively; it returned the description for the identity function P(1,1) which is obviously not an

intended result. We then added one element to each list in order to provide as inp the list [(0, ), (1, )] and

as out the list [0,2]; it returned the description R(Z(),C(S(),[C(S(),[P(2,1)])])). By testing this result

with other values, we see that this is still not a description for the function in question, since it outputs

7 for a given value of (6, ) and 9 for input (8, ) (in fact, this description describes the function that outputs

0 if the input is 0 and the successor of the input otherwise). We then augmented the inp and out lists one

more time to [(0, ), (1, ), (2, )] and [0,2,4]; this time, it returned R(Z(),C(S(),[C(S(),[P(2,2)])])),


(a) Lists of inp and out (b) Description found and the time it took in the bottom

(c) Code obtained

Figure 5.6: Results for the second attempt for the product function searched by the first scientist

which we tested for other values. For input 12 it returned 24 and for input 25 returned 50, which gives

us confidence that this is an adequate description for the function f in question. It took the scientist

0.658629 seconds to do so. We then saw what happened if we inserted only one element in each list,

but a bigger one. For example, for the pair of values for inp and out [(3, )] and [6] it returned the de-

scription C(S(),[C(S(),[S()])]) which is simply the successor applied three times and it is not what

we intended. However, for inp and out [(12, )] and [24], the description returned was the one we were

looking for: R(Z(),C(S(),[C(S(),[P(2,2)])])); this execution took 0.730134 seconds to halt, which

means that, at a small cost in computational time, we can find a shorter locking sequence for

this experiment.
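The two recursive descriptions in this experiment differ only in the projection used in the step case, P(2,1) versus P(2,2). Hand-translating both into loops makes the difference visible. The translation below is a sketch that assumes the step function of R receives the loop counter as first argument and the previous value as second argument; the function names are hypothetical.

```python
def with_p21(x):
    """Assumed reading of R(Z(),C(S(),[C(S(),[P(2,1)])])): step uses the counter."""
    acc = 0
    for i in range(x):
        acc = i + 2      # successor of successor of the loop counter
    return acc

def with_p22(x):
    """Assumed reading of R(Z(),C(S(),[C(S(),[P(2,2)])])): step uses the previous value."""
    acc = 0
    for i in range(x):
        acc = acc + 2    # successor of successor of the previous value
    return acc

for value in (0, 1, 2, 6, 8):
    print(value, with_p21(value), with_p22(value))
```

Both functions agree on the inputs 0 and 1, which is why the data points (0, ) and (1, ) alone cannot separate them, while the input 2 already does.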

Experiment 5.1.10. Our next experiment is the one regarding the function f ∶ N2 → N which is defined

as f(x, y) = (x + y) ∸ 1. We began by inserting [(1,2)] as inp and [2] as out. As expected, it

returned the description for the projection of the second element of the pair, P(2,2). We then pro-

ceeded to enlarge both the inp and out lists to [(1,2), (2,1)] and [2,2], respectively, which resulted in

the scientist returning the description R(R(Z(),P(2,1)),C(S(),[P(3,3)])). We tested this result for

other input values: for (12,4) it returned 15 and for (35,21) the output was 55, which are the correct

outputs for applying the function in question to said input values. So we conclude that, to the best

of our knowledge, this last description is one that explains the function f(x, y) = (x + y) ∸ 1 adequately

and we observe that the scientist took 0.995186 seconds to find it. Furthermore, we once again tried

to find this description using only one element in each list: we succeeded for inp list [(19,8)] and out

list [26], for a computation time of 0.825140 seconds. This time, finding a smaller locking sequence

actually resulted in a gain when it comes to the computational time of the procedure. We also wanted

to observe the result of providing bigger lists and/or lists with greater values to the scientist, and

so we first provided the inp list [(2,0), (5,4), (2,2), (1,3)] and out list [1,8,3,3] and then we provided

[(25,31), (48,57), (237,192), (540,371)] as inp list and [55,104,428,910] as out list. This resulted in the

output of the same description in both cases as in the last two attempts, in computations that took

0.611785 and 2.78675 seconds to terminate, respectively.

We present a summary of the experiments performed in Table A.1.

5.2 Second algorithm

Now we present the results obtained by executing the search algorithm in Algorithm 8, which used the

enumeration procedure in Algorithm 9. We will not repeat the first experiments presented in Section 5.1,

since those results will be extremely similar to the ones presented there; we want to see the differences

between these algorithms in the cases where the first one had more difficulty returning an adequate

description. In this algorithm, we already have descriptions with projections with arity greater than

3, which increases the possibility of the scientist finding appropriate descriptions for the concerned

functions that are different from the ones in Table 3.1.

Experiment 5.2.1. Our first experiment concerns the addition function. We provided as inp list

[(2,3)] and as out list [5], which resulted in the scientist returning C(C(S(),[S()]),[P(2,2)]); although

this is not the same one as the first description obtained in Experiment 5.1.6, it performs the same oper-

ation: the successor of the successor of the second argument of the input pair. Once again, by enlarging

both inp and out lists to [(2,3), (1,5)] and [5,6], respectively, we find, in only 0.002997 seconds, an adequate

description (that is equal to the one found in Experiment 5.1.6): R(P(1,1),C(S(),[P(3,3)])).

Experiment 5.2.2. For the subtraction function for the natural numbers, we began by providing the

same inp and out lists as in Experiment 5.1.7: [(5,2)] and [3], respectively. It returned the same result,

C(S(),[P(2,2)]). We then increased the lists the same way, [(5,2), (2,0)] for inp and [3,2] for out.

This time, the returned description was R(P(1,1),R(P(2,1),P(4,3))). Since we do not know which

function this description corresponds to, we tested it to understand its behaviour. We saw that for pairs (x, y)


such that x > y the obtained result explained correctly the subtraction function: for input (8,3) returned 5

and for input (15,7) returned 8. However, when testing for (4,7), for example, it outputted 2 instead of 0,

leading us to conclude that this description is not an adequate one. To prevent that from happening, we

chose specific values to add to the inp and out lists, for example (4,7) to inp and 0 to out. This resulted

in the return of the description R(P(1,1),C(R(Z(),P(2,1)),[P(3,3)])), which we know is adequate to

explain the function in hand (see Table 3.1). This computation took 0.041919 seconds to terminate. The

same way, just by giving as inp [(5,2), (4,7)] and out [3,0] the scientist was still able to find the correct

description in approximately the same computational time. Also, we wanted to see what happened

when the inp list provided was composed with more input pairs and with bigger value inputs. Thus, we

first gave the scientist the list [(2,1), (3,6), (4,0), (5,2)] as inp list and the list [1,0,4,3] as out list; the scientist

returned the same description as the one in the last attempt after 0.031918 seconds of searching. Then,

we provided the scientist with inp list [(20,10), (15,7), (34,57), (60,61)] and with out list [10,8,0,0], which

resulted in the scientist outputting the same description as in the last two attempts, in a computation that

took 0.578816 seconds.
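The behaviour of the intermediate description R(P(1,1),R(P(2,1),P(4,3))) can be understood by translating it by hand into a loop. The sketch below assumes that recursion is performed on the last argument and that the step function receives the fixed arguments, the loop counter and the previous value, in this order; the function names are hypothetical.

```python
def candidate(x, y):
    """Hand translation of R(P(1,1),R(P(2,1),P(4,3))) under the assumed convention."""
    acc = x  # base case P(1,1): the value for y = 0 is x
    for _ in range(y):
        # Inner R(P(2,1),P(4,3)) applied to (x, counter, acc):
        # it yields x when acc is 0 and acc - 1 otherwise.
        acc = x if acc == 0 else acc - 1
    return acc

def sub(x, y):
    """Subtraction on the natural numbers, truncated at 0."""
    return max(x - y, 0)

for pair in [(5, 2), (2, 0), (8, 3), (15, 7), (4, 7)]:
    print(pair, candidate(*pair), sub(*pair))
```

Under this reading the candidate counts down from x and restarts at x once it has reached 0, so it agrees with the subtraction only while the count has not yet passed 0. This is why a data point with x smaller than y, such as (4,7), is enough to discard it.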

Experiment 5.2.3. Now regarding the product function. We also began with inp list [(2,3)] and out list

[6], like in Experiment 5.1.8. This returned the description R(P(1,1),C(R(Z(),P(2,1)),[P(3,3)])),

which we already saw in Experiment 5.1.8 was not an adequate description for this function. We then

proceeded to enlarge the inp and out lists in the same way: to [(2,3), (5,2)] and [6,10]. With this

information, the scientist returned the description R(R(Z(),P(2,2)),R(P(2,1),C(S(),[P(4,4)]))) in

only 1.12452 seconds (Figure 5.7), which contrasts sharply with the time needed to perform the computation

in Experiment 5.1.8. By testing this result with other values, we have no reason to suspect that this is

not an adequate description for the product function: (2,9) returned 18, (6,4) returned 24 and (85,96)

outputted 8160. Furthermore, we wanted to see what would happen if a larger inp list or an inp list with

bigger values was given to the scientist. Thus, we provided [(5,0), (2,3), (4,3), (6,3)] as inp list and out

list [0,6,12,18], which resulted in the scientist returning the same presumably adequate description as in the

previous attempt in a computation that took 3.42313 seconds. Then we proceeded to give the scientist

[(7,12), (30,14), (126,73), (256,421)] as inp list and [84,420,9198,107776] as out list. With this data, the

scientist did not go beyond verifying R(S(),R(P(2,1),R(P(3,1),C(S(),[P(5,5)])))).

Experiment 5.2.4. The next function is the double function f ∶ N → N defined by f(x) = 2x. This time

we will start with inp and out lists different from those in Experiment 5.1.9. We then start with inp list

[(2, )] and out list [4]. This resulted in the output of the description C(S(),[S()]), which is obviously

not a description for the double function since it describes the successor of the successor of the input.

We then attempted with [(2, ), (3, )] as inp list and [4,6] as out list, which returned the description

R(Z(),C(C(S(),[S()]),[P(2,2)])) in 0.030945 seconds. This is a description that we know from

Experiment 5.1.9 describes the function in hand, to the best of our knowledge. We also tried to see if we

could find this description with only one element in each list and so we gave to the scientist the inp list

[(5, )] and the out list [10]. This also returned the correct description, but this time in 0.040974 seconds,

which is slightly slower than the computation that returned the same description with

two elements in each of the inp and out lists.


(a) Lists of inp and out (b) Description found and the time it took in the bottom

(c) Code obtained

Figure 5.7: Results for the second attempt for the product function searched by the second scientist

Experiment 5.2.5. Let us now see what happens regarding the function f ∶ N2 → N defined by the expression

f(x, y) = (x + y) ∸ 1. First we provided as inp [(1,2)] and as out [2]; this returned the description

P(2,2), which is obviously not a result we wanted to obtain. We then enlarged inp to [(1,2), (2,1)] and

out to [2,2], to which the scientist outputted the description C(S(),[R(P(1,1),R(P(2,1),P(4,3)))]).

By testing this description, we saw that it is still not an adequate one for describing this function

since for input (4,10) it returned the value 5 instead of 13. We thus enlarged again our input lists to

[(1,2), (2,1), (2,4)] (inp) and to [2,2,5] (out). This computation took 0.071959 seconds and returned the

description R(R(Z(),P(2,1)),C(S(),[P(3,3)])) (Figure 5.8), which we believe is an adequate description

for this function from Experiment 5.1.10 and from testing it for other values (for input (8,7) it outputted

14 and for (4,9) it returned 12, as it was expected). We tried to see if it was possible to achieve a correct

description using only one element in each list, which we did with inp list [(13,6)] and out list [18] in


0.141918 seconds, roughly twice the time of the search with inp [(1,2), (2,1), (2,4)] and out [2,2,5].
Moreover, we also wanted to see whether providing longer lists or lists with pairs containing larger
elements made any difference. To do so, we began by providing the scientist with the lists
[(2,6), (3,0), (4,2), (1,5)] and [7,2,5,5] as inp and out lists, respectively. This resulted in the scientist
returning the same presumably appropriate description as in the previous attempt, in a computation that
took 0.215976 seconds. Next, we gave the scientist inp list [(16,24), (73,51), (127,245), (318,182)] and
out list [39,123,371,499], which resulted in the scientist returning the same description as in the last attempt,
in 2.17263 seconds.
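
The three descriptions returned in this experiment can be checked mechanically. The following is a minimal sketch, assuming the usual reading of the constructors: Z() is the constant zero, S() the successor, P(n,i) the projection onto the i-th of n arguments, C(f,[g1,...]) composition, and R(g,h) recursion on the last argument, with h receiving the parameters, the counter and the previous value. These conventions and the Python helpers are assumptions made for illustration, not the implementation used in this work; they were chosen because they reproduce the outputs reported above.

```python
# Illustrative evaluator for primitive recursive descriptions.
# Assumed conventions (not taken from the thesis code):
#   R(g, h)(x1..xn, 0)     = g(x1..xn)
#   R(g, h)(x1..xn, y + 1) = h(x1..xn, y, R(g, h)(x1..xn, y))

def Z():
    return lambda *args: 0            # constant zero, any arity

def S():
    return lambda x: x + 1            # successor

def P(n, i):
    return lambda *args: args[i - 1]  # i-th of n arguments (1-indexed)

def C(f, gs):
    return lambda *args: f(*[g(*args) for g in gs])

def R(g, h):
    def f(*args):
        *params, y = args
        acc = g(*params)              # base case
        for k in range(y):            # one for-loop per symbol R
            acc = h(*params, k, acc)
        return acc
    return f

first = P(2, 2)
second = C(S(), [R(P(1, 1), R(P(2, 1), P(4, 3)))])
third = R(R(Z(), P(2, 1)), C(S(), [P(3, 3)]))

print(first(1, 2))                  # 2
print(second(1, 2), second(2, 1))   # 2 2
print(second(4, 10))                # 5 (the expected value was 13)
print(third(8, 7), third(4, 9))     # 14 12
```

Under these assumptions the second description agrees with the two training pairs and fails on (4,10) exactly as reported, while the third one reproduces the reported test values.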


Figure 5.8: Results for the third attempt for the function $f(x, y) = (x + y) \dot{-} 1$ searched by the second scientist. Panel (c): code obtained.

Experiment 5.2.6. Our next experiment concerned the function $f : \mathbb{N}^2 \to \mathbb{N}$ defined by the
expression $f(x, y) = (x + y)x$. First, we provided inp list [(3,1)] and out list [12], which resulted in the
scientist returning the description R(S(),R(P(2,2),R(P(3,1),C(S(),[P(5,5)])))) in a computation that

took 1.26955 seconds. We tested this result for input value (5,3); it should have returned 40 but instead

it returned 757, which means that this description is not an adequate one. We then tried to see what

happened with inp list [(3,1), (2,4)] and out list [12,12]; in this case the scientist took a lot of time in the

verification of the description C(S(),[R(S(),R(P(2,2),R(P(3,3),C(C(S(),[S()]),[P(5,5)]))))]).

This happens because the computation of the function described by this description needs to per-

form several nested for-loops whose number of iterations is very large, even for small input values

such as the ones given. To try to prevent this from happening we changed the inp list to contain
even smaller values and out to the list of their corresponding outputs: our inp becomes
[(0,1), (1,0), (1,2), (2,1), (0,2), (2,0), (2,2)] and our out [0,1,3,6,0,4,8]. This way, we try to minimize
as much as we can the number of times those for-loops are executed, in order to prevent the scientist
from getting stuck on descriptions like these. However, these inp and out lists still do not let the
search proceed past the description R(P(1,1),R(P(2,1),R(P(3,1),R(P(4,1),R(P(5,1),C(S(),[P(7,7)]))))));

this is because, for the first three input pairs, the function described by this description
computes outputs that match the ones in the out list, but when we reach


the pair (2,1), the presence of at least 5 nested for-loops (one for each symbol R) once
again greatly increases the computation time. Another step we took to try to prevent this situation was
to sort the elements of our lists in a different way, trying to force the scientist to perform first the
comparisons that we suspect execute these loops the fewest times. Thus, we provided
as inp list [(0,1), (0,2), (1,0), (1,1), (1,2), (2,0), (2,1), (2,2)] and as out list
[0,0,1,2,3,4,6,8]. With this input, after more than 24 hours of computation the scientist still had not
found a description of a function that explains the relation between the elements of inp and those of
out, although it was not stuck in the verification of any single description. At this point, we decided to
terminate the computation.
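
To make the cost of these nested for-loops concrete, the sketch below counts loop iterations while evaluating the first description returned in this experiment. It is only an illustration: the evaluator conventions (recursion on the last argument, with the step function receiving the parameters, the counter and the previous value) and the helper names `run` and `steps` are assumptions, not part of the implementation used in this work.

```python
# Count how many recursion-loop iterations an evaluation performs.
# Evaluator conventions are assumed for illustration only.

steps = 0  # global counter of loop iterations

def S():
    return lambda x: x + 1

def P(n, i):
    return lambda *args: args[i - 1]

def C(f, gs):
    return lambda *args: f(*[g(*args) for g in gs])

def R(g, h):
    def f(*args):
        global steps
        *params, y = args
        acc = g(*params)
        for k in range(y):
            steps += 1                # one iteration of this R's for-loop
            acc = h(*params, k, acc)
        return acc
    return f

def run(desc, *args):
    """Return (value, number of loop iterations) for desc(*args)."""
    global steps
    steps = 0
    value = desc(*args)
    return value, steps

d = R(S(), R(P(2, 2), R(P(3, 1), C(S(), [P(5, 5)]))))

for pair in [(3, 1), (5, 3)]:
    value, cost = run(d, *pair)
    print(pair, "->", value, "after", cost, "loop iterations")
```

With these assumptions the description returns 12 for (3,1) and 757 for (5,3), matching the values reported above, and the iteration count grows far faster than the inputs because the bound of each inner loop is the value computed by the enclosing one.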

Experiment 5.2.7. We will now see what happens when we provide the algorithm with information that
describes the distance function, defined by $\mathrm{dist}(x, y) = |x - y|$. We began with inp list [(4,3)]
and out list [1]. This led the scientist to return the description R(P(1,1),R(P(2,1),P(4,3))) in

0.003997 seconds. However, by testing this result we conclude that it is not a correct one since

for input (7,9) it returns 6 instead of 2, although it returns a correct output to input (15,9) (it re-

turned 6). We then changed our inp list to [(4,3), (2,6)] and our out list to [1,4]; it returned the

description C(R(Z(),P(2,1)),[R(S(),P(3,2))]), which when tested with input (23,14) returned the

wrong result of 12 instead of 9. This led us to conclude that this description was also not the cor-

rect one. Our next attempt was made with inp [(4,3), (2,6), (1,3)] and out [1,4,2], which returned

C(S(),[R(S(),R(R(P(1,1),P(3,2)),R(P(3,3),P(5,4))))]). By testing input (10,2), which should

have returned 8, the outcome was 6, and so we conclude that this description is not adequate. We
then tried with inp list [(4,3), (2,6), (1,3), (3,7)] and out list [1,4,2,4]. It output the same description
as in the last attempt, C(S(),[R(S(),R(R(P(1,1),P(3,2)),R(P(3,3),P(5,4))))]). We then appended
the last test input to the inp list and its corresponding correct output to the out list, providing the
scientist with the lists [(4,3), (2,6), (1,3), (3,7), (10,2)] and [1,4,2,4,8]; it did not get past the
verification of the description R(S(),R(P(2,1),R(P(3,1),C(C(S(),[S()]),[P(5,5)])))). At this point, we

made a drastic change in the inp and out lists provided: we adopted a similar reasoning to that of
Experiment 5.2.6 and provided as inp the list [(0,1), (0,2), (1,0), (1,1), (1,2), (2,0), (2,1), (2,2)] and as
out the list [1,2,1,0,1,2,1,0]; with these lists as input, the description returned by the scientist was
C(R(S(),P(3,1)),[R(P(1,1),R(P(2,2),P(4,3))),P(2,1)]). When this result was tested with input
(2,6) the returned value was 5 which, once again, is wrong, since the output should have been
4. That means that this description is still not the correct one. We then added this test
pair to the inp list, resulting in an inp list of [(0,1), (0,2), (1,0), (1,1), (1,2), (2,0), (2,1), (2,2), (2,6)],
with respective out [1,2,1,0,1,2,1,0,4]. With this information, the scientist returned the description

R(P(1,1),C(R(S(),R(P(2,2),P(4,3))),[P(3,2),P(3,1)])), a description that when tested with
input (4,3) returned 3 instead of 1; once again, the scientist did not return an appropriate
description. Once again, we proceeded to add the previous test pair to the inp list, providing the scientist with
[(0,1), (0,2), (1,0), (1,1), (1,2), (2,0), (2,1), (2,2), (2,6), (3,4)] as inp and [1,2,1,0,1,2,1,0,4,1] as out.
The outcome was R(P(1,1),C(R(S(),R(P(2,2),P(4,3))),[P(3,2),P(3,1)])). We tested it for inputs
(6,2) and (4,3), whose outputs were, respectively, 5 and 1, thus being correct for the second input pair but


not for the first, where the result should have been 4. We then added the pair (6,2) to
the inp list and its correct output to the out list, which resulted in a computation of over 24 hours
that did not return a conjecture, although it kept on testing and verifying different descriptions. At this
point, we terminated the procedure, without any result to present.

Experiment 5.2.8. Our last experiment regarding this algorithm was based on the exponential function
defined as $\exp(x, y) = x^y$. We started by giving the scientist the lists [(2,3)] as inp and [8] as out. The
scientist quickly found the description R(P(1,1),C(C(S(),[S()]),[P(3,3)])) that explains the relation
between these values. However, by testing it for other inputs, like (3,3) and (1,6) we see that although it

outputs the correct result for the first one (9), the output of the second one, that should be 1, is 16 and so

we conclude that this description is not the correct one. We then proceeded to enlarge our information

lists: inp to [(1,3), (2,3)] and out to [1,8]. The search procedure did not advance past the
description R(S(),R(P(2,1),R(P(3,1),C(R(Z(),P(2,2)),[P(5,4)])))). We then used the same technique
as in Experiment 5.2.6 and changed our inp to the list [(0,1), (0,2), (1,0), (1,1), (1,2), (2,0), (2,1), (2,2)]
and our out to [0,0,1,1,1,1,2,4]. As a result, the scientist returned the description
C(R(Z(),R(S(),P(3,1))),[R(P(1,1),R(P(2,1),R(P(3,3),C(S(),[P(5,5)]))))]), which took
13770.8 seconds to compute (approximately 3 hours and 50 minutes). However, we can see that this
description is not the correct one by testing it with input (4,1): it should have output 4 but it returned 9.
We then included this input pair in our inp list and its corresponding output in the out list; after 24
hours of computation, the scientist had not returned any conjecture, while not being stuck in the verification
of any description, and thus we aborted the search procedure.

We present a summary of all the experiments in Table A.2.

5.3 Third algorithm

Lastly, we see the results of the search procedure in Algorithm 12, which used the enumeration
described in Algorithm 13. Since we are now using the elementary functions as the scientist’s learning
environment, with a distinct definition of description (and consequently an entirely different way
of listing the functions), we repeat the experiments with the most basic functions
(except for those at the basis of the construction of descriptions in Definition 4.2.1: the addition,
the product, the subtraction and the division, since we know those are described by EA(), ET(), EM()
and ED(), respectively).

Experiment 5.3.1. We began the study of this algorithm’s behaviour with the predecessor function
defined by the expression $\mathrm{pred}(x) = x \dot{-} 1$. We started by giving as inp the list [(0, )] and as out the
list [0]. As expected, it returned the description EP(1,1), corresponding to the identity function. We proceeded
with [(0, ), (1, )] as inp and [0,0] as out; this resulted in the scientist returning the description
EBS(EP(1,1)), which describes the sum of all the natural numbers smaller than the one given as input. This
result fails to describe the function we want: when tested for inputs (4, ) and (13, ) it returns 6 and
78, instead of 3 and 12. We then tried with lists [(0, ), (1, ), (2, )] and [0,0,1], which resulted in the


same description being output. For the fourth attempt, we provided the list [(0, ), (1, ), (2, ), (3, )] as our inp
and [0,0,1,2] as our out. This computation, which can be seen in Figure 5.9, took 0.015645 seconds
and resulted in the description EBS(EBS(EBP(EP(1,1)))), which always returned the right output
when tested for other input values. We are therefore confident that the description found is an adequate one for
the predecessor function. We also tried to understand whether there was any penalty in providing more and/or
bigger values; to do so we first gave the scientist inp list [(1, ), (3, ), (5, ), (15, )] and out list [0,2,4,14],
followed by an attempt performed with [(1, ), (3, ), (5, ), (15, ), (260, )] as inp list and [0,2,4,14,259] as
out list. The result in both cases was the same presumably adequate description for this function, which was
found in 0.011996 seconds in the first case and in 2.74342 seconds in the second case.
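
The behaviour of the bounded sum and bounded product in these descriptions can be reproduced with a short evaluator. This is a sketch under stated assumptions: EBS(f) and EBP(f) are taken to range over all values strictly smaller than the last argument (with the empty product equal to 1), a reading inferred from the outputs reported in this experiment and not taken from the implementation used in this work. It relies on `math.prod`, available in Python 3.8 and later.

```python
# Illustrative evaluator for a few elementary descriptions.
# Assumed semantics (inferred from the reported outputs):
#   EBS(f)(x1..xn, y) = sum     of f(x1..xn, i) for i < y
#   EBP(f)(x1..xn, y) = product of f(x1..xn, i) for i < y  (1 when y = 0)

from math import prod

def ES():
    return lambda x: x + 1

def EP(n, i):
    return lambda *args: args[i - 1]

def EC(f, gs):
    return lambda *args: f(*[g(*args) for g in gs])

def EBS(f):
    return lambda *args: sum(f(*args[:-1], i) for i in range(args[-1]))

def EBP(f):
    return lambda *args: prod(f(*args[:-1], i) for i in range(args[-1]))

second_attempt = EBS(EP(1, 1))
fourth_attempt = EBS(EBS(EBP(EP(1, 1))))

print(second_attempt(4), second_attempt(13))   # 6 78
print(fourth_attempt(4), fourth_attempt(13))   # 3 12
print(fourth_attempt(0), fourth_attempt(1))    # 0 0
```

Read this way, EBP(EP(1,1)) is 1 at 0 and 0 elsewhere, its bounded sum is 0 at 0 and 1 elsewhere, and a second bounded sum counts the positive numbers below the input, which is the predecessor. The three nested bounded operators also suggest why a single large value such as 260 is enough to increase the verification time noticeably.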


Figure 5.9: Results of the fourth attempt for the function $\mathrm{pred}(x) = x \dot{-} 1$ searched by the third scientist. Panel (c): code obtained.

Experiment 5.3.2. Our next experiment was made regarding the function with expression zero(x) = 0.

We provided the inp list [(3, )] and the out list [0], which resulted in the description EBP(EP(1,1)). This

description is related to the function that performs the product of all the numbers smaller than the input

given; however, for input (0, ) the bounded product is not executed any time and, by definition, it outputs

1, which means that this description is not an adequate one for this function. With this in mind, we

added (0, ) to our inp list and 0 to the out list, resulting in [(0, ), (3, )] and [0,0]. For these inputs the
scientist returned the description EC(EBP(EP(1,1)),[ES()]) in 0.031248 seconds, which returned
the value 0 in several tests with different inputs. This makes sense, since the description
describes the function that computes the product of every natural number smaller than the successor of
the input value; the value 0 is always among these numbers, so the output of this function is
always 0, and so we conclude that this description is an adequate one, to the best of our knowledge.

Experiment 5.3.3. Now, let us see what happens with the function with expression $f(x) = 2x$. Our first
attempt was made with inp [(1, )] and out [2]; that resulted in the description for the successor, ES(), which
is obviously not the correct one. We then tried with inp [(1, ), (2, )] and out [2,4], to which the scientist


returned the description EC(ES(),[EBS(ES())]). By testing this result with input (5, ) we observe that it
returns the wrong value: 16 instead of 10, and so we conclude that the description is not adequate. The
next attempt was made with the list [(1, ), (2, ), (3, )] as inp and [2,4,6] as out. The returned description,
in a computation that took 0.015624 seconds, was EC(EA(),[EP(1,1),EP(1,1)]), which describes the
function that adds the input value to itself; this description is clearly an adequate one, and so we
do not need to make another attempt.

Experiment 5.3.4. We go on to the function defined by the expression $f(x, y) = x + y \dot{-} 1$. We begin
with inp list [(2,3)] and out list [4]; the result of this attempt is the description of the function
that returns the successor of the second element of the input pair: EC(ES(),[EP(2,2)]). This is
not the intended description, and so we need to execute the search again. We thus extend our inp
list to [(2,3), (1,4)] and the out list to [4,4]. The description returned is EBS(EBP(EBP(EP(2,1)))),
which we tested for other values. For input pair (9,1) it returned 1 instead of 9, and so we
conclude that this description is not the correct one. For the next attempt, we provided the scientist with
inp list [(2,3), (1,4), (3,1)] and out list [4,4,3], which returned EC(EBS(EBS(EBP(EP(1,1)))),[EA()]) in
0.093745 seconds. By testing this with other values we came to believe that this description is an adequate
one, since, for example, for input (14,3) it outputs 16 and for input (23,14) it returns 36.

Experiment 5.3.5. The next experiment was made regarding the function defined by the expression

f(x, y) = (x + y)x. Our first attempt was made by providing the scientist the lists inp [(1,3)] and out

[4], which returned the description of the addition function EA(). We then advanced to another attempt

with inp list [(1,3), (2,5)] and out list [4,14]; the returned description was EC(ET(),[EA(),EP(2,1)]) in

a time of 0.031219 seconds (Figure 5.10), which can easily be confirmed to be an adequate descrip-

tion for this function, since it describes the product of the sum of the two elements of the pair with the

first one, which is exactly the behaviour of this function. This conclusion was corroborated by
testing the input values (14,3) and (7,3), which output 238 and 70, respectively. Furthermore,
we performed another two attempts, with more and greater input values, in order to compare results:
with [(3,1), (4,1), (0,6), (2,4)] as inp list and [12,20,0,12] as out list the scientist returned the
same description as in the last attempt in 0.007997 seconds, and with inp list
[(2,4), (10,2), (3,15), (20,98), (50,120)] and out list [12,120,54,2360,8500] it again returned
the same description, in 0.016994 seconds.


Figure 5.10: Results for the second attempt for the function $f(x, y) = (x + y)x$ searched by the third scientist. Panel (c): code obtained.


Experiment 5.3.6. We will now look at the scientist’s behaviour regarding the square function defined
as $\mathrm{sq}(x) = x^2$. At the beginning we defined inp list as [(2, )] and out list as [4], which resulted in the
output of the description EC(ES(),[ES()]), which describes the successor of the successor function. This
is clearly not the description we want to find, and so we proceeded to a second attempt. To do so, we
enlarged inp to [(2, ), (3, )] and out to [4,9]. For these input lists, the scientist returned in 0.046871
seconds the description EC(ET(),[EP(1,1),EP(1,1)]), which describes the product of the input value
with itself; this is the definition of the square function, and so we can conclude that this
description is a correct one.

Experiment 5.3.7. Next, we tried the exponential function defined as $\exp(x, y) = x^y$. For the
first attempt, inp list was [(2,3)] and out list [8]; this resulted in the scientist returning the description
EBP(EP(2,1)) in 0.005502 seconds. The test with input pair (3,2) output 9 and the one with input
(5,3) returned 125, which are the correct results for these inputs. Furthermore, this is the way we
defined the exponential function in Chapter 4, and so we concluded that this is an adequate description
for this function. Moreover, another attempt was performed with more input values, some of them
large. With inp list [(3,5), (9,3), (15,2), (20,6)] and out list [243,729,225,64000000], the result
was the same adequate description as before in 0.006994 seconds.

Experiment 5.3.8. Advancing to the factorial function, we began by defining the inp list as [(2, )] and
the out list as [2]; obviously this led the scientist to return EP(1,1), which is not the description
we want to find. Then, we made an attempt with the lists [(2, ), (3, )] and [2,6]. The scientist output the
description EBP(ES()) in 0.015621 seconds, which we tested for inputs (6, ) and (8, ); the results of
these tests were 720 and 40320, which are correct. Besides, this description corresponds to the product of the
successors of every number smaller than the input value, which is the definition of the factorial function,
and so we can conclude that this description is an adequate one for the function at hand.

Experiment 5.3.9. Regarding the binary max function, which given a pair of elements outputs the larger
of the two, we started by providing the scientist with inp list [(3,4)] and out list [4]. It returned
the description for the projection of the second element of the pair, EP(2,2), which is obviously not the
intended outcome. We then appended the pair (5,2) to the inp list and its corresponding
value 5 to the out list. This resulted in the scientist returning the description EC(EA(),[EM(), EP(2,2)]) in a
computation that took 0.015625 seconds. This is an expression that describes the addition of the
subtraction of the two elements of the input pair with the second one. If we go to Chapter 4, we see that this
is one of the ways to define the max function, and so we deduce that this description is a correct one for
this function, which is corroborated by the following tests: for input (45,20) it output 45 and for (4,10)
the result was 10. We performed one more attempt with an inp list with more and greater values. Thus
we provided the scientist with inp list [(15,20), (136,59), (420,767), (520,10)] and out list [20,136,767,520],
which resulted in the scientist returning the same appropriate description as before in 0.021094 seconds.

Experiment 5.3.10. For the binary min function, which given a pair of elements outputs the smaller
of the two, we began with inp list [(3,4)] and out list [3]; as expected, it returned the description for the
first element of the pair, EP(2,1). With [(3,4), (5,2)] as inp list and [3,2] as out list, we obtained, in


0.007996 seconds, the description EC(EM(),[EP(2,1),EM()]) which describes the subtraction between

the first element of the pair and the subtraction of both elements of the input pair. If we go to the

expression for the min function in Chapter 4, we see that this is exactly the expression used to define

this function, and so, with the help of the tests with input pairs (60,2) and (6,14), which returned 2 and

6 respectively, we conclude that the scientist found an appropriate description for this function.
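
The two descriptions found in Experiments 5.3.9 and 5.3.10 rest on identities for truncated subtraction that are easy to check directly. The following is a small sketch; the function names `monus`, `max_desc` and `min_desc` are hypothetical, and EM() is assumed to compute the truncated subtraction of its first argument by its second.

```python
def monus(x, y):
    """Truncated subtraction on the naturals: x - y if x >= y, else 0."""
    return x - y if x >= y else 0

def max_desc(x, y):
    # Mirrors EC(EA(),[EM(), EP(2,2)]): (x monus y) + y
    return monus(x, y) + y

def min_desc(x, y):
    # Mirrors EC(EM(),[EP(2,1),EM()]): x monus (x monus y)
    return monus(x, monus(x, y))

print(max_desc(45, 20), max_desc(4, 10))   # 45 10
print(min_desc(60, 2), min_desc(6, 14))    # 2 6
```

If x ≥ y the first expression gives (x − y) + y = x and the second gives x − (x − y) = y; otherwise the truncated difference is 0 and they give y and x, respectively, so both agree with max and min on every pair of naturals.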

Experiment 5.3.11. We proceed to the sg function as defined in Chapter 4. We start with inp list [(0, )]
and out list [0], which obviously led the scientist to output the description for the unary projection function
EP(1,1), as it also did when we provided inp list [(0, ), (1, )] and out list [0,1]. However, when the
scientist receives the lists [(0, ), (1, ), (2, )] and [0,1,1], it returns EBS(EBP(EP(1,1))) in virtually no time
(0.0 seconds). After testing it for other values, we believe that it is a valid description
for this function, since it output 1 for every provided input different from 0. We then tried to see if there
was any computational penalty in providing lists with more elements to the scientist: in this case, only
when we added big numbers (for example, 500) to the inp list could we see differences in the computational
time, even if small: it increased from 0.0 seconds to only 0.235158 seconds. We then saw what
happened for even bigger numbers, like 5000; it took 62.4937 seconds to terminate.

Experiment 5.3.12. The next function that we experimented with is the $\overline{\mathrm{sg}}$ function defined by the expression
$\overline{\mathrm{sg}}(x) = 1 \dot{-} \mathrm{sg}(x)$. With inp list [(0, )] and out list [1], the scientist, as expected, returned the description for
the successor ES(), which does not describe this function. When the inp and out lists were extended
to [(0, ), (1, )] and [1,0], respectively, the returned description was EBP(EP(1,1)), which when tested for
other inputs always returned 0, except for input (0, ), for which it returned 1, and so the scientist returned,
to the best of our knowledge, an adequate description for this function. This computation took 0.00997
seconds.

Experiment 5.3.13. We will now test the scientist for the distance function, which we were not able to

identify through the previous scientists (one of which we did not even try to do so). Our first attempt

was performed using inp list [(3,2)] and out list [1], which returned the description EM(); obviously this

expression does not describe the distance function since it is defined as the one for the subtraction

function. We then increased both lists to [(3,2), (1,6)] and [1,5]. This time, the expression given by the

scientist was EBS(EBS(EBP(ET()))), which we then tested for a few input pairs: for (2,1) it outputed 0

instead of 1, for (2,5) the result was 4 instead of 3 and with input (85,23) it returned 22 when it should

have returned 62. Next, we appended one of these pairs, (2,5), to the inp list and its respective output
3 to the out list. This time, the search procedure did not get past the verification of the expression
EC(EBS(EBP(EBS(ES()))),[EBP(EA())]). The same happened for inp list [(3,2), (1,6), (2,1)] and out
list [1,5,1]. We suspected that changing the order of the pairs in the inp list could lead to a different
outcome, since the verification follows the order of the pairs in the inp list, and so we tried
again with the same pairs but this time with inp list [(3,2), (2,1), (1,6)] and out list [1,1,5];
it made no difference to the outcome. We then used the strategy that by now is familiar:
our inp list became [(0,1), (0,2), (1,0), (1,1), (1,2), (2,0), (2,1), (2,2)] and the out list the respective
correct output values, [1,2,1,0,1,2,1,0]. This time, the scientist needed 3.51511 seconds to return the


description EC(EA(),[EM(),EC(EM(),[EP(2,2),EP(2,1)])]), on which we performed some
tests: for input (85,23) we now obtain the correct output of 62, for the pair (12,7) the result was 5, as it should be,
and for (43,15) it returned the correct value of 28. In fact, by analyzing the expression
itself, we observe that it describes the addition of the subtraction of the second element of the
pair from the first and the subtraction of the first element from the second, which
is exactly the expression we used to define this function back in Section 3.2, in Example 3.2.6, and thus
we have every reason to conclude that this description is a correct one for this function.

Experiment 5.3.14. Our last experiment regarding the behaviour of the scientist with simple arithmetic
functions was performed with the function that computes the natural division of a number by 2, i.e. the
function defined by the expression $f(x) = \lfloor x/2 \rfloor$. We began with inp list [(1, )] and out list [0]. The result
here was the expression that describes the predecessor function, as expected. We then appended
values to the lists that would contradict this result, having now [(1, ), (3, )] as inp list and [0,1] as out list.
The scientist then returned the expression EBS(EBS(EP(1,1))), which when tested for input values (4, )
and (5, ) wrongly returned 4 and 10, respectively, instead of the correct result of 2 for both cases. This
made us provide the lists [(1, ), (3, ), (4, )] and [0,1,2], which made the scientist return the expression
EC(ED(),[ES(), EBS(ES())]) in a computation that took 0.009613 seconds to complete. The tests
performed were the following: for input (6, ) the result was 3, for (8, ) the scientist returned
4, for (15, ) the output was 7 and for (253, ) the scientist returned 126; all these results
are correct and so we have a strong suspicion of having found an adequate description for this function.
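
The suspicion can be supported by a short calculation. It is a sketch under two assumptions inferred from the reported outputs rather than taken from Definition 4.2.1: that the bounded sum ranges over the values strictly smaller than the input, and that ED() applied to the pair (a, b) returns the integer quotient of b by a.

```latex
\begin{align*}
\mathrm{EBS}(\mathrm{ES}())(x) &= \sum_{i<x} (i+1) = \frac{x(x+1)}{2},\\
\mathrm{EC}(\mathrm{ED}(),[\mathrm{ES}(),\mathrm{EBS}(\mathrm{ES}())])(x)
  &= \left\lfloor \frac{x(x+1)/2}{x+1} \right\rfloor
   = \left\lfloor \frac{x}{2} \right\rfloor .
\end{align*}
```

For instance, for input (253, ) the sum is 32131 and the integer quotient of 32131 by 254 is 126, in agreement with the test above.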

Since this was the scientist with the most promising results, we then proceeded to try it on what we
set out to do: see whether it can find expressions that describe natural laws.

Experiment 5.3.15. Let us suppose we are trying to find the relation between measurements
of the electrical resistance, the current intensity and the potential difference in a section of an
electric circuit. We know that the resistance is 3 Ω and the following pairs of values were measured
for the current and the voltage: (1,3), (3,9), (4,12) and (6,18). Suppose we want to write the voltage as
a function of the current (since the resistance is constant, for now we will not worry about giving its value to
the scientist); then we provide the scientist with the lists [(1, ), (3, ), (4, ), (6, )] as the inp list and [3,9,12,18]
as the out list. The expression returned was EC(EC(EA(),[EA(),EP(2,1)]),[EP(1,1),EP(1,1)]) in a
computation that took 0.078901 seconds. On the other hand, if we try to write the current as a function of the
voltage, the inp list becomes [(3, ), (9, ), (12, ), (18, )] and the out list [1,3,4,6]; with these
lists, the scientist did not go beyond verifying the expression EC(EBS(EBP(EP(1,1))),[EBP(ES())]).
To try to overcome this problem, we added the resistance value to the pairs in the input list, in order
to facilitate the search. With this in mind, the inp list was [(3,3), (9,3), (12,3), (18,3)] and the out list
was [1,3,4,6]; this resulted in a fast computation of 0.004992 seconds that returned the description
EC(ED(),[EP(2,2), EP(2,1)]), as can be seen in Figure 5.11. So, to the best of our knowledge,
since there is no more information, the scientist was able to find an expression for the relation between
the current, the voltage and the resistance between two points in an electric circuit. In fact, the second
expression the scientist found is the one that describes Ohm’s Law in the way we are used to seeing it:


$$I(V, R) = \frac{V}{R}.$$

Figure 5.11: Results regarding Ohm’s Law searched by the third scientist. Panels: (a) lists of inp and out; (b) description found and the time it took at the bottom; (c) code obtained.

Experiment 5.3.16. We then tried to see if the scientist could find out the relation between the distance

of a planet to the sun and the period of its orbit (i.e. see if it can find the expression that describes

Kepler’s Law). To do so, we took the values present in [25]: our inp list is [(1, ), (4, ), (9, )] and our

out list is [1,8,27]. This resulted in the scientist not going beyond the verification of the expression

EC(EC(EBP(ES()),[EBP(ES())]),[EBS(ES())]); without being able to provide more data (for smaller
input numbers, preferably), there is nothing more we can do in this case.

Experiment 5.3.17. Our next experiment concerned the law of gravitation that relates the masses
of two bodies, the distance and the gravitational force acting between them, which is
expressed by the formula $F = G \frac{m_1 m_2}{d^2}$. Let us suppose that we are obtaining data from bodies with
masses of $2 \times 10^5$ kg and $9 \times 10^5$ kg, respectively. Since we know a priori that the value of the
gravitational constant $G$ is $6.674 \times 10^{-11}\ \mathrm{m^3\,kg^{-1}\,s^{-2}}$, for these masses the product $G m_1 m_2$ has the
approximate value of 12, which would give pairs $(d, F)$ such as (1,12) and (2,3), for example. Thus, we
provide the scientist with inp list [(1, ), (2, )] and out list [12,3], which resulted in the return of the expression
EC(EC(EBS(EA()),[EBS(ES()),ES()]),[EC(ED(),[EP(1,1),ES()])]) in 473.198 seconds. We know

that, in this case, if this description is a correct one for this function, then for input (3, ) the result should
be 1, since the integer division of 12 by $3^2 = 9$ is 1. The test performed with input (3, ) showed that
this description is not adequate, since the respective output was 3. Next, we tried to understand whether, with
this new pair of $(d, F)$ values, (3,1), the scientist would return a more accurate conjecture. In fact, this
time the scientist output EC(EBP(EBP(EA())),[ES(),EC(ED(),[EP(1,1),EC(ES(),[ES()])])]) in a
computation that took 1827.86 seconds. However, when tested for input (4, ) the output returned was 1
when it should have been 0 (once again from the integer division, this time of 12 by 16). We then
added these values to our lists, which resulted in the scientist not going beyond the verification of the
description EC(EC(EBS(ED()),[EP(1,1), EBP(EBP(ES()))]),[EC(ET(),[ES(), ES()])]). After this
result, we decided not to perform any more attempts at discovering an expression that describes this
function.
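
The numbers used in this experiment follow from a short calculation, sketched below. The variable names are illustrative, and rounding the product to the integer 12 before dividing is the simplification described in the text.

```python
G = 6.674e-11        # gravitational constant in m^3 kg^-1 s^-2, as quoted above
m1, m2 = 2e5, 9e5    # masses in kg

product = G * m1 * m2
print(f"{product:.4f}")          # 12.0132

K = round(product)               # integer constant used to build the data: 12
# Integer (natural) division of K by d^2 for the distances d = 1, ..., 4
data = [(d, K // d**2) for d in range(1, 5)]
print(data)                      # [(1, 12), (2, 3), (3, 1), (4, 0)]
```

These are exactly the pairs (d, F) used as training and test data in this experiment.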

We present a summary of all the experiments in Table A.3.


5.4 Analysis

From the results obtained, we conclude that the time efficiency of the algorithms improves
with the upgrades we performed, whether the changes made to the listing of the primitive recursive
functions or the change of paradigm from searching among the primitive recursive functions to restricting
the search to the set of elementary functions. Furthermore, these modifications allowed each
scientist to find more functions, and more complex ones, than its predecessor. However, this
came with some setbacks, since the improvements that allowed the scientists to reach more
complex descriptions also sometimes prevented them from proceeding with the search. This is due
to the existence of expressions that describe hard-to-compute functions, which leave the search
algorithm stuck computing the application of those functions to some input values, so that
the scientist cannot compare the actual result with the expected one. This happens, for example, because the
functions concerned are computed with nested for-loops that have to be executed a tremendous number of times.
The first algorithm did not suffer from this problem because the descriptions we
could reach with it were not of this hard-to-compute nature.

Looking at the big picture of the experiments performed, we observe that the computational
times are generally small and that few points are needed to find a description, i.e. we
have small locking sequences for the majority of our experiments. We also understood that the size of
these locking sequences depends not only on the function concerned but also on the values given as input:
if the function that explains the relation between the values in the inp and out lists can be described by
an expression present in the early stages of the listing of descriptions, then a very small locking sequence
is obtained; if the function can only be described by an expression that appears later
in the listing, a small locking sequence can only be achieved if the values
provided are not also explained by a description that appears earlier in the enumeration, and if
these values are not big enough to cause the scientist to get stuck in the verification of some hard-to-compute
description. These values were sometimes not that big: for example, in the second algorithm
an inp list composed of the pairs (1,3) and (2,3) was enough for the procedure not to go
beyond the verification of the description R(S(),R(P(2,1),R(P(3,1),C(R(Z(),P(2,2)),[P(5,4)])))).

Another pivotal point for understanding the efficiency of the scientists is that the order of the elements in the inp list also influences the outcome of the search: if the tuples that are big enough to cause the scientist to get stuck in the verification of a description are preceded by tuples that are easy to compute and whose results differ from the respective ones in the out list, then the scientist rejects that description early and moves on with the search. Forcing such input tuples to appear first in the inp list can only be done by trial and error, since predicting the outcome of a situation like this is close to impossible: it depends heavily on the function we are trying to find.
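The effect of tuple order can be illustrated with a minimal sketch of a verification step that stops at the first mismatch. This is not the code of the thesis repository: the names `explains` and `costly_candidate` are hypothetical, and a candidate description is modelled simply as a Python callable.

```python
def explains(candidate, inp, out):
    """Return True only if `candidate` reproduces every expected output."""
    for args, expected in zip(inp, out):
        # The first mismatch rejects the candidate, so the remaining
        # (possibly expensive) tuples are never evaluated.
        if candidate(*args) != expected:
            return False
    return True


calls = []  # records which tuples were actually evaluated


def costly_candidate(x, y):
    # Hypothetical stand-in for a description that would be slow on big inputs.
    calls.append((x, y))
    return x + y


# The cheap tuple (1, 1) comes first and already disagrees with the expected
# output 1, so the big tuple (10**6, 10**6) is skipped.
result = explains(costly_candidate, [(1, 1), (10**6, 10**6)], [1, 10**12])
```

If the two tuples were given in the opposite order, the candidate would be evaluated on the large tuple before being rejected, which is the situation in which a scientist can get stuck.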

On the other hand, the length of the lists provided is not as important to the efficiency of the scientists as one could assume at first, at least compared to the factors discussed above. Bigger inp and out lists did in fact slow down the search, but mostly because the values provided were big enough to slow down the verification of some descriptions. When the outputs for the provided values are easily computed by the descriptions listed, or when the initial tuples fail the verification right away, sparing the computation and verification of the following values, the increase in computation time, although it exists, is residual.

The efficiency of each scientist therefore depends mainly on three factors: the position of the adequate description in the enumeration, the magnitude of the values provided, and the order in which these values are given. The size of the lists (i.e. the number of points given) proved to be less relevant: as long as the values are small enough to allow the computation to proceed, the time needed to find the description is not going to be significantly bigger than for smaller but adequate lists.

Regarding the code of the generated programs, we observed that they were sometimes constructed with several nested for-loops. Recall the statement made at the beginning of Chapter 3 that the elementary functions are those that have a program needing at most two nested for-loops in its sequence of instructions. However, for some of the elementary functions found, the resulting code had more than two nested for-loops, especially when the search was performed with the scientists implemented upon the primitive recursive functions. This does not mean that the statement is wrong; it just means that the first description found that explained these functions was one that generated a program of this sort. It is possible that, if we kept searching, we would find a description located later in the enumeration that also explains the function in question and whose constructed program has at most two nested for-loops.
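To make the notion of loop nesting concrete, here is a hand-written sketch of a program with exactly two nested for-loops, computing the product using only successor steps. It is an illustration of nesting depth only; it is not output of the thesis's code generator, and the name `prod_loops` is hypothetical.

```python
def prod_loops(x, y):
    """Compute x * y using only increments inside two nested for-loops."""
    total = 0
    for _ in range(y):      # outer loop: repeat y times
        for _ in range(x):  # inner loop: add x, one successor step at a time
            total += 1
    return total
```

The body of the inner loop runs x * y times, which already hints at how quickly deeper nestings become expensive on large inputs.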


Chapter 6

Conclusions

6.1 Achievements

The major achievement of this work was the computational development of scientists able to identify simple primitive recursive functions and/or elementary functions in a short amount of time. These scientists not only identified these functions but did so with very little information provided, resulting in the discovery of very small locking sequences. For example, compared with the results obtained in a similar work for finite automata [29], the difference is striking, since there it was necessary to provide a great number of points (sometimes more than 30) for relatively simple automata to be identified.¹ A possible explanation for this phenomenon is that, contrary to what happens in the automata case, the primitive recursive functions have an underlying structure that, even for a small set of input tuples, causes the respective outputs of different primitive recursive functions to be very distinct from one another, which makes it easier to find the correct function. To better understand the behaviour of the scientists, see the analysis of the experiments’ results in Section 5.4.

Regarding the relation between these results and the search for empirical laws, there are some assumptions we need to make before we can draw any conclusions. First, we assume that the observations measured are natural numbers, which is not true in general. However, this allows us to focus on the identification of these laws through their form, i.e. their algebraic expressions, translated into descriptions of recursive functions of the type N → N. This comes at a cost, since we are ignoring the existence of experimental errors, especially in the verification of a hypothesis. We also assume that the expressions that explain these laws have a certain underlying homogeneity, i.e. they are mainly explained, for example, by continuous, non-piecewise functions. This is why we did not test the scientists with piecewise functions: since such a function needs more than one mathematical expression to define it, its identification would be much more complex and lengthy, because it would be explained by a much longer description.² It is this belief about the nature of the empirical laws that allows us to assume that we are able to explain every natural law not only with functions in the set of the primitive recursive functions but, even more narrowly, with functions in the class of elementary functions alone. If this assumption is true, then the developed scientists can be considered “embryos” of scientists that are actually able to identify relations in natural phenomena on their own, facilitating the scientific discovery process, which often stalls because those relations are not found.

¹ Note that this was a much simpler work, without the dimension and complexity of a dissertation. The full source code of that project can be found at https://github.com/gamatos/gold.

² To be precise, we did experiment with the functions $sg$ and $\overline{sg}$. However, they can also be easily expressed by a single, unified algebraic expression.

6.2 Future Work

For our “embryo” scientists to evolve into ones that can actually discover expressions that explain the empirical laws, several improvements can be made, such as the following:

• Improve the interface of the scientist, making it richer and more user-friendly.

• Run these experiments on a computer with greater processing capacity. This would not only reduce the computational times but also let the scientists go further in their search: a better processor would make each scientist less likely to get stuck in the verification of the functions, since it would compute the so-called hard-to-compute for-loops much more efficiently. This way, we could even provide more points to the scientist without fear of the search being blocked in the verification of a description, thus increasing the chances of finding a description that explains the relation between the input and output values.

• Further reduce the redundancies in the enumeration of the descriptions. For example, the descriptions C(P(2,1),[P(2,1),D]) and C(P(2,1),[P(2,1),F]), where D and F are binary descriptions, describe the same relation between input tuples and output values, so they do not both need to be considered and tested. Furthermore, in this case neither of them needs to be tested at all, since both are redundant versions of a much simpler description, P(2,1). If descriptions like these two were left out of the enumeration, the search would become much more efficient, especially when such descriptions imply a great number of executions of for-loops.

• Add more functions to the basis of the rules that inductively construct the descriptions, as was done with the natural quotient function, denoted by the symbol ED(), in the definition of descriptions for elementary functions. A great improvement in this matter would be the addition of constants to this basis: we could then write natural numbers with descriptions of size one, instead of writing those constants as nested compositions of the successor operation applied to the zero constant, thus obtaining smaller expressions for functions with natural numbers in their expressions.

• Change the verification step to take into account the existence of experimental errors. This can be done by changing the notion of convergence to the one in Definition 2.2.8.
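The redundancy in the third item above follows from a general rule: a composition whose outer function is a projection, C(P(n,i),[g_1,...,g_n]), computes the same function as g_i, because every function involved is total. A minimal sketch of such a simplification pass is given below. The tuple encoding of descriptions (`("P", n, i)`, `("C", f, [g1, ...])`, `("R", g, h)`, `("S",)`, `("Z",)`) and the name `simplify` are hypothetical and are not the classes used in the thesis repository.

```python
def simplify(desc):
    """Rewrite every C(P(n,i),[g1,...,gn]) into g_i, bottom-up."""
    tag = desc[0]
    if tag == "C":
        outer = simplify(desc[1])
        inner = [simplify(g) for g in desc[2]]
        if outer[0] == "P":
            # ("P", n, i) selects the i-th inner description (1-indexed).
            return inner[outer[2] - 1]
        return ("C", outer, inner)
    if tag == "R":
        return ("R", simplify(desc[1]), simplify(desc[2]))
    return desc  # Z, S and P are already minimal


# D stands for an arbitrary binary description; here the one found for addition.
D = ("R", ("P", 1, 1), ("C", ("S",), [("P", 3, 3)]))
redundant = ("C", ("P", 2, 1), [("P", 2, 1), D])
simplified = simplify(redundant)
```

An enumeration could skip any description that `simplify` changes, since its simplified form is smaller and therefore appears earlier in the listing. Whether this filter pays off in practice would have to be measured.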


Bibliography

[1] J. Avigad. Notes on Recursive Functions. Unpublished. Revised and expanded by Zach, R.

[2] J. L. Bell and M. Machover. A Course in Mathematical Logic. Elsevier, 1977.

[3] E. Bilsland, L. Van Vliet, K. Williams, J. Feltham, M. P. Carrasco, W. L. Fotoran, E. F. Cubillos,

G. Wunderlich, M. Grøtli, F. Hollfelder, et al. Plasmodium dihydrofolate reductase is a second

enzyme target for the antimalarial action of triclosan. Scientific reports, 8(1):1038, 2018.

[4] L. Blum and M. Blum. Toward a mathematical theory of inductive inference. Information and Control,

28:125–155, 1975.

[5] J. Case. Infinitary self-reference in learning theory. Journal of Experimental & Theoretical Artificial

Intelligence, 6(1):3–16, 1994.

[6] J. Case. Algorithmic scientific inference. International Journal of Unconventional Computing, 8(3),

2012.

[7] J. Case and C. Smith. Anomaly hierarchies of mechanized inductive inference. In R. J. Lipton,

W. A. Burkhard, W. J. Savitch, E. P. Friedman, and A. V. Aho, editors, Proceedings of the 10th

Annual ACM Symposium on Theory of Computing, May 1-3, 1978, pages 314–319. ACM, San

Diego, California, USA, 1978.

[8] J. Case and C. Smith. Comparison of identification criteria for machine inductive inference. Theoretical Computer Science, 25(2):193–220, 1983.

[9] J. F. Costa. Unity of science as seen through the universal computer. IJUC, 13(1):59–81, 2017.

[10] J. F. Costa. On Discovering Scientific Laws. IJUC, 14(3–4):285–318, 2019.

[11] J. F. Costa and P. Gouveia. Computabilidade, Inferência Indutiva, Complexidade. Draft of a book to be submitted.

[12] N. Cutland. Computability: An introduction to recursive function theory. Cambridge university press,

1980.

[13] W. Ewert. (https://.stackexchange.com/users/1343/winston-ewert). Enumerating

the primitive recursive functions. Software Engineering Stack Exchange. URL:

https://softwareengineering.stackexchange.com/a/310061 (version: 2016-02-13).


[14] M. Gladstone. A reduction of the recursion scheme. The Journal of Symbolic Logic, 32(4):505–508,

1968.

[15] M. Gladstone. Simplifications of the recursion scheme. The Journal of Symbolic Logic, 36(4):653–

665, 1971.

[16] E. M. Gold. Language identification in the limit. Information and control, 10(5):447–474, 1967.

[17] K. Gurney. An Introduction to Neural Networks. Taylor & Francis, Inc., Bristol, PA, USA, 1997.

[18] W. G. Handley and S. S. Wainer. Complexity of primitive recursion. In U. Berger and H. Schwichtenberg, editors, Computational Logic, pages 273–300, Berlin, Heidelberg, 1999. Springer Berlin Heidelberg.

[19] S. Jain, D. N. Osherson, J. S. Royer, and A. Sharma. Systems That Learn. An Introduction to

Learning Theory. The MIT Press, second edition, 1999.

[20] S. Kahrs. The primitive recursive functions are recursively enumerable. University of Kent at Canterbury, Department of Computer Science X, 200, 01 2008.

[21] K. T. Kelly. The Logic of Reliable Inquiry. OUP USA, 1996.

[22] J. G. Kemeny. A Philosopher Looks at Science. Van Nostrand, 1959.

[23] R. D. King, J. Rowland, S. G. Oliver, M. Young, W. Aubrey, E. Byrne, M. Liakata, M. Markham, P. Pir,

L. N. Soldatova, et al. The automation of science. Science, 324(5923):85–89, 2009.

[24] H. Kitano. Artificial intelligence to win the nobel prize and beyond: Creating the engine for scientific

discovery. AI magazine, 37(1):39–49, 2016.

[25] P. Langley, H. A. Simon, G. L. Bradshaw, and J. M. Zytkow. Scientific Discovery: Computational

Explorations of the Creative Process. MIT Press, Cambridge, MA, USA, 1987.

[26] S. Liu. An enumeration of the primitive recursive functions without repetition. Tohoku Math. J. (2),

12(3):400–402, 1960.

[27] M. Lobão. Identifying Empirical Laws. Master’s thesis, Instituto Superior Técnico, 2016.

[28] E. Martin and D. N. Osherson. Elements of Scientific Inquiry. MIT Press, 1998.

[29] G. Matos. An Exhaustive Algorithm for Minimum State Automaton Identification. Graduation Project, Instituto Superior Técnico, 2019.

[30] A. R. Meyer and D. M. Ritchie. The complexity of loop programs. In Proceedings of the 1967 22nd

national conference, pages 465–469. ACM, 1967.

[31] P. G. Odifreddi. Classical Recursion Theory: Volume II, volume 143 of Studies in Logic and The

Foundations of Mathematics. Elsevier Science B.V., 1999.


[32] D. N. Osherson, M. Stob, and S. Weinstein. Systems That Learn: An Introduction to Learning

Theory for Cognitive and Computer Scientists. The MIT Press, 2nd edition, 1986.

[33] R. Reis. Autómatos finitos: manipulação, geração e contagem. PhD thesis, Faculdade de Ciências da Universidade do Porto, 2007.

[34] H. E. Rose. Subrecursion: Functions and Hierarchies. Clarendon Press, Oxford, 1984.

[35] A. Sernadas, M. C. S. Sernadas, and J. Ramos. Computability and Complexity: A Mathematical

Primer. College Publications, 2018.

[36] M. C. S. Sernadas. Introdução à Teoria da Computação. Editorial Presença, 1993.

[37] A. Sparkes, W. Aubrey, E. Byrne, A. Clare, M. Khan, M. Liakata, M. Magdalena, J. Rowland,

L. Soldatova, K. Whelan, M. Young, and R. King. Toward robot scientists for autonomous scientific

discovery. volume 2, 01 2010.

[38] M. P. Szudzik. The computable universe hypothesis. In A Computable Universe: Understanding

and Exploring Nature as Computation, pages 479–523. World Scientific, 2013.


Appendix A

Functions tested and summarized results

A.1 List of functions tested with the scientists

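• $\mathrm{id}(x) = x$

• $s(x) = x + 1$

• $\mathrm{pred}(x) = x \mathbin{\dot{-}} 1$

• $\mathrm{zero}(x) = 0$

• $s_x(x, y) = x + 1$

• $\mathrm{add}(x, y) = x + y$

• $\mathrm{sub}(x, y) = x \mathbin{\dot{-}} y$

• $\mathrm{prod}(x, y) = x \times y$

• $f(x) = 2x$

• $f(x, y) = (x + y) \mathbin{\dot{-}} 1$

• $f(x, y) = (x + y)\,x$

• $\mathrm{dist}(x, y) = \lvert x - y \rvert$

• $\exp(x, y) = x^{y}$

• $\mathrm{sq}(x) = x^{2}$

• $\mathrm{fact}(x) = x!$

• $\max(x, y) = (x \mathbin{\dot{-}} y) + y$

• $\min(x, y) = x \mathbin{\dot{-}} (x \mathbin{\dot{-}} y)$

• $sg(x) = x \mathbin{\dot{-}} (x \mathbin{\dot{-}} 1)$

• $\overline{sg}(x) = 1 \mathbin{\dot{-}} sg(x)$

• $\mathrm{half}(x) = \lfloor x/2 \rfloor$

• Ohm’s Law: $I = \dfrac{V}{R}$

• Kepler’s Law: $k = \dfrac{D^{3}}{P^{2}}$

• Gravitational Law: $F = \dfrac{G\, m_1 m_2}{d^{2}}$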
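Several functions in the list above are defined through truncated subtraction. The short check below, a sketch with hypothetical helper names (`monus`, `max_`, `min_`, `sg_bar`), confirms on a small range that these expressions agree with the usual maximum, minimum and sign functions over the natural numbers.

```python
def monus(x, y):
    """Truncated subtraction on natural numbers: x - y if x > y, else 0."""
    return x - y if x > y else 0


def max_(x, y):
    return monus(x, y) + y


def min_(x, y):
    return monus(x, monus(x, y))


def sg(x):
    return monus(x, monus(x, 1))


def sg_bar(x):
    return monus(1, sg(x))


# Compare against Python's built-ins on all pairs below 20.
agrees = all(
    max_(x, y) == max(x, y) and min_(x, y) == min(x, y)
    for x in range(20)
    for y in range(20)
)
```

This is also why sg and its complement are not really piecewise from the point of view of the search: each is captured by a single expression built from truncated subtraction.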

A.2 Summarized results

We present three tables with the results of the tests performed, one table for each scientist.


| Function | inp List | out List | Description | Adequate Description | Time (sec) |
|---|---|---|---|---|---|
| $\mathrm{id}(x) = x$ | [(1,)] | [1] | P(1,1) | Yes | 0.003996 |
| $s(x) = x + 1$ | [(1,)] | [2] | S() | Yes | 0.000996 |
| $\mathrm{pred}(x) = x \mathbin{\dot{-}} 1$ | [(1,)] | [0] | R(Z(),P(2,1)) | Yes | 0.004997 |
| $\mathrm{zero}(x) = 0$ | [(2,)] | [0] | R(Z(),P(2,2)) | Yes | 0.003999 |
| $s_x(x,y) = x + 1$ | [(2,4)] | [3] | C(S(),[P(2,1)]) | Yes | 0.001956 |
| $\mathrm{add}(x,y) = x + y$ | [(2,3)] | [5] | C(S(),[C(S(),[P(2,2)])]) | No | 0.004942 |
| $\mathrm{add}(x,y) = x + y$ | [(2,3),(1,5)] | [5,6] | R(P(1,1),C(S(),[P(3,3)])) | Yes | 0.012939 |
| $\mathrm{add}(x,y) = x + y$ | [(1,4),(2,1),(3,2),(0,6)] | [5,3,5,6] | R(P(1,1),C(S(),[P(3,3)])) | Yes | 0.015591 |
| $\mathrm{add}(x,y) = x + y$ | [(13,24),(35,41),(133,256),(420,513)] | [37,76,389,933] | R(P(1,1),C(S(),[P(3,3)])) | Yes | 0.067953 |
| $\mathrm{sub}(x,y) = x \mathbin{\dot{-}} y$ | [(5,2)] | [3] | C(S(),[P(2,2)]) | No | 0.001001 |
| $\mathrm{sub}(x,y) = x \mathbin{\dot{-}} y$ | [(5,2),(2,0)] | [3,2] | R(P(1,1),C(S(),[C(S(),[P(3,2)])])) | No | 0.441712 |
| $\mathrm{sub}(x,y) = x \mathbin{\dot{-}} y$ | [(5,2),(2,0),(4,1)] | [3,2,3] | R(P(1,1),C(R(Z(),P(2,1)),[P(3,3)])) | Yes | 0.432695 |
| $\mathrm{sub}(x,y) = x \mathbin{\dot{-}} y$ | [(5,2),(4,1)] | [3,3] | R(P(1,1),C(R(Z(),P(2,1)),[P(3,3)])) | Yes | 0.495662 |
| $\mathrm{sub}(x,y) = x \mathbin{\dot{-}} y$ | [(34,12),(25,40),(151,72),(627,728)] | [22,0,79,0] | R(P(1,1),C(R(Z(),P(2,1)),[P(3,3)])) | Yes | 4.57001 |
| $\mathrm{prod}(x,y) = x \times y$ | [(2,3)] | [6] | R(S(),C(S(),[P(3,3)])) | No | 0.014799 |
| $\mathrm{prod}(x,y) = x \times y$ | [(2,3),(5,2)] | [6,10] | R(R(Z(),P(2,2)),C(R(P(1,1),C(S(),[P(3,3)])),[P(3,1),P(3,3)])) | Yes | 8495.14 |
| $f(x) = 2x$ | [(0,)] | [0] | P(1,1) | No | 0.007997 |
| $f(x) = 2x$ | [(0,),(1,)] | [0,2] | R(Z(),C(S(),[C(S(),[P(2,1)])])) | No | 0.612423 |
| $f(x) = 2x$ | [(0,),(1,),(2,)] | [0,2,4] | R(Z(),C(S(),[C(S(),[P(2,2)])])) | Yes | 0.658629 |
| $f(x) = 2x$ | [(3,)] | [6] | C(S(),[C(S(),[S()])]) | No | 0.014991 |
| $f(x) = 2x$ | [(6,)] | [12] | R(Z(),C(S(),[C(S(),[P(2,2)])])) | Yes | 0.730134 |
| $f(x,y) = (x + y) \mathbin{\dot{-}} 1$ | [(1,2)] | [2] | P(2,2) | No | 0.001000 |
| $f(x,y) = (x + y) \mathbin{\dot{-}} 1$ | [(1,2),(2,1)] | [2,2] | R(R(Z(),P(2,1)),C(S(),[P(3,3)])) | Yes | 0.595186 |
| $f(x,y) = (x + y) \mathbin{\dot{-}} 1$ | [(19,8)] | [26] | R(R(Z(),P(2,1)),C(S(),[P(3,3)])) | Yes | 0.825140 |
| $f(x,y) = (x + y) \mathbin{\dot{-}} 1$ | [(2,0),(5,4),(2,2),(1,3)] | [1,8,3,3] | R(R(Z(),P(2,1)),C(S(),[P(3,3)])) | Yes | 0.61178 |
| $f(x,y) = (x + y) \mathbin{\dot{-}} 1$ | [(25,31),(48,57),(237,192),(540,371)] | [55,104,428,910] | R(R(Z(),P(2,1)),C(S(),[P(3,3)])) | Yes | 2.78675 |

Table A.1: Summary of the experiments made by the scientist related to the first algorithm


| Function | inp List | out List | Description Found | Adequate | Time (sec) |
|---|---|---|---|---|---|
| $\mathrm{add}(x,y) = x + y$ | [(2,3)] | [5] | C(C(S(),[S()]),[P(2,2)]) | No | 0.003998 |
| $\mathrm{add}(x,y) = x + y$ | [(2,3),(1,5)] | [5,6] | R(P(1,1),C(S(),[P(3,3)])) | Yes | 0.002997 |
| $\mathrm{sub}(x,y) = x \mathbin{\dot{-}} y$ | [(5,2)] | [3] | C(S(),[P(2,2)]) | No | 0.005182 |
| $\mathrm{sub}(x,y) = x \mathbin{\dot{-}} y$ | [(5,2),(2,0)] | [3,2] | R(P(1,1),R(P(2,1),P(4,3))) | No | 0.031130 |
| $\mathrm{sub}(x,y) = x \mathbin{\dot{-}} y$ | [(5,2),(2,0),(4,7)] | [3,2,0] | R(P(1,1),C(R(Z(),P(2,1)),[P(3,3)])) | Yes | 0.041919 |
| $\mathrm{sub}(x,y) = x \mathbin{\dot{-}} y$ | [(5,2),(4,7)] | [3,0] | R(P(1,1),C(R(Z(),P(2,1)),[P(3,3)])) | Yes | 0.042063 |
| $\mathrm{sub}(x,y) = x \mathbin{\dot{-}} y$ | [(2,1),(3,6),(4,0),(5,2)] | [1,0,4,3] | R(P(1,1),C(R(Z(),P(2,1)),[P(3,3)])) | Yes | 0.031919 |
| $\mathrm{sub}(x,y) = x \mathbin{\dot{-}} y$ | [(20,10),(15,7),(34,57),(60,61)] | [10,8,0,0] | R(P(1,1),C(R(Z(),P(2,1)),[P(3,3)])) | Yes | 0.578816 |
| $\mathrm{prod}(x,y) = x \times y$ | [(2,3)] | [6] | R(P(1,1),C(R(Z(),P(2,1)),[P(3,3)])) | No | 0.005001 |
| $\mathrm{prod}(x,y) = x \times y$ | [(2,3),(5,2)] | [6,10] | R(R(Z(),P(2,2)),R(P(2,1),C(S(),[P(4,4)]))) | Yes | 1.12452 |
| $\mathrm{prod}(x,y) = x \times y$ | [(5,0),(2,3),(4,3),(6,3)] | [0,6,12,18] | R(R(Z(),P(2,2)),R(P(2,1),C(S(),[P(4,4)]))) | Yes | 3.42313 |
| $\mathrm{prod}(x,y) = x \times y$ | [(7,12),(30,14),(126,73),(256,421)] | [84,420,9198,107776] | stuck in R(S(),R(P(2,1),R(P(3,1),C(S(),[P(5,5)])))) | - | - |
| $f(x) = 2x$ | [(2,)] | [4] | C(S(),[S()]) | No | 0.011992 |
| $f(x) = 2x$ | [(2,),(3,)] | [4,6] | R(Z(),C(S(),[C(S(),[P(2,2)])])) | Yes | 0.030945 |
| $f(x) = 2x$ | [(5,)] | [10] | R(Z(),C(S(),[C(S(),[P(2,2)])])) | Yes | 0.040974 |
| $f(x,y) = (x + y) \mathbin{\dot{-}} 1$ | [(1,2)] | [2] | P(2,2) | No | 0.003001 |
| $f(x,y) = (x + y) \mathbin{\dot{-}} 1$ | [(1,2),(2,1)] | [2,2] | C(S(),[R(P(1,1),R(P(2,1),P(4,3)))]) | No | 0.024993 |
| $f(x,y) = (x + y) \mathbin{\dot{-}} 1$ | [(1,2),(2,1),(2,4)] | [2,2,5] | R(R(Z(),P(2,1)),C(S(),[P(3,3)])) | Yes | 0.071959 |
| $f(x,y) = (x + y) \mathbin{\dot{-}} 1$ | [(13,6)] | [18] | R(R(Z(),P(2,1)),C(S(),[P(3,3)])) | Yes | 0.141918 |
| $f(x,y) = (x + y) \mathbin{\dot{-}} 1$ | [(2,6),(3,0),(4,2),(1,5)] | [7,2,5,5] | R(R(Z(),P(2,1)),C(S(),[P(3,3)])) | Yes | 0.215976 |
| $f(x,y) = (x + y) \mathbin{\dot{-}} 1$ | [(16,24),(73,51),(127,245),(318,182)] | [39,123,371,499] | R(R(Z(),P(2,1)),C(S(),[P(3,3)])) | Yes | 2.17263 |
| $f(x,y) = (x + y)\,x$ | [(3,1)] | [12] | R(S(),R(P(2,2),R(P(3,1),C(S(),[P(5,5)])))) | No | 1.26955 |
| $f(x,y) = (x + y)\,x$ | [(3,1),(2,4)] | [12,12] | stuck in C(S(),[R(S(),R(P(2,2),R(P(3,3),C(C(S(),[S()]),[P(5,5)]))))]) | - | - |
| $f(x,y) = (x + y)\,x$ | [(0,1),(1,0),(1,2),(2,1),(0,2),(2,0),(2,2)] | [0,1,3,6,0,4,8] | stuck in R(P(1,1),R(P(2,1),R(P(3,1),R(P(4,1),R(P(5,1),C(S(),[P(7,7)])))))) | - | - |
| $f(x,y) = (x + y)\,x$ | [(0,1),(0,2),(1,0),(1,1),(1,2),(2,0),(2,1),(2,2)] | [0,0,1,2,3,4,6,8] | - | - | ∼24 hours |
| $\mathrm{dist}(x,y) = \lvert x - y \rvert$ | [(4,3)] | [1] | R(P(1,1),R(P(2,1),P(4,3))) | No | 0.003997 |
| $\mathrm{dist}(x,y) = \lvert x - y \rvert$ | [(4,3),(2,6)] | [1,4] | C(R(Z(),P(2,1)),[R(S(),P(3,2))]) | No | 0.021986 |
| $\mathrm{dist}(x,y) = \lvert x - y \rvert$ | [(4,3),(2,6),(1,3)] | [1,4,2] | C(S(),[R(S(),R(R(P(1,1),P(3,2)),R(P(3,3),P(5,4))))]) | No | 17.2491 |
| $\mathrm{dist}(x,y) = \lvert x - y \rvert$ | [(4,3),(2,6),(1,3),(3,7)] | [1,4,2,4] | C(S(),[R(S(),R(R(P(1,1),P(3,2)),R(P(3,3),P(5,4))))]) | No | 17.8328 |
| $\mathrm{dist}(x,y) = \lvert x - y \rvert$ | [(4,3),(2,6),(1,3),(3,7),(10,2)] | [1,4,2,4,8] | stuck in R(S(),R(P(2,1),R(P(3,1),C(C(S(),[S()]),[P(5,5)])))) | - | - |
| $\mathrm{dist}(x,y) = \lvert x - y \rvert$ | [(0,1),(0,2),(1,0),(1,1),(1,2),(2,0),(2,1),(2,2)] | [1,2,1,0,1,2,1,0] | C(R(S(),P(3,1)),[R(P(1,1),R(P(2,2),P(4,3))),P(2,1)]) | No | 2.81838 |
| $\mathrm{dist}(x,y) = \lvert x - y \rvert$ | [(0,1),(0,2),(1,0),(1,1),(1,2),(2,0),(2,1),(2,2),(2,6)] | [1,2,1,0,1,2,1,0,4] | R(P(1,1),C(R(S(),R(P(2,2),P(4,3))),[P(3,2),P(3,1)])) | No | 7.88448 |
| $\mathrm{dist}(x,y) = \lvert x - y \rvert$ | [(0,1),(0,2),(1,0),(1,1),(1,2),(2,0),(2,1),(2,2),(2,6),(3,4)] | [1,2,1,0,1,2,1,0,4,1] | R(P(1,1),C(R(S(),R(P(2,2),P(4,3))),[P(3,2),P(3,1)])) | No | 30.5075 |
| $\mathrm{dist}(x,y) = \lvert x - y \rvert$ | [(0,1),(0,2),(1,0),(1,1),(1,2),(2,0),(2,1),(2,2),(2,6),(3,4),(6,2)] | [1,2,1,0,1,2,1,0,4,1,4] | - | - | ∼24 hours |
| $\exp(x,y) = x^{y}$ | [(2,3)] | [8] | R(P(1,1),C(C(S(),[S()]),[P(3,3)])) | No | 7.88448 |
| $\exp(x,y) = x^{y}$ | [(1,3),(2,3)] | [1,8] | stuck in R(S(),R(P(2,1),R(P(3,1),C(R(Z(),P(2,2)),[P(5,4)])))) | - | - |
| $\exp(x,y) = x^{y}$ | [(0,1),(0,2),(1,0),(1,1),(1,2),(2,0),(2,1),(2,2)] | [0,0,1,1,1,1,2,4] | C(R(Z(),R(S(),P(3,1))),[R(P(1,1),R(P(2,1),R(P(3,3),C(S(),[P(5,5)]))))]) | No | 13770.8 |
| $\exp(x,y) = x^{y}$ | [(0,1),(0,2),(1,0),(1,1),(1,2),(2,0),(2,1),(2,2),(4,1)] | [0,0,1,1,1,1,2,4,4] | - | - | ∼24 hours |

Table A.2: Summary of the experiments made by the scientist related to the second algorithm


| Function | inp List | out List | Description Found | Adequate | Time (sec) |
|---|---|---|---|---|---|
| $\mathrm{pred}(x) = x \mathbin{\dot{-}} 1$ | [(0,)] | [0] | EP(1,1) | No | 0.031247 |
| $\mathrm{pred}(x) = x \mathbin{\dot{-}} 1$ | [(0,),(1,)] | [0,0] | EBS(EP(1,1)) | No | 0.015645 |
| $\mathrm{pred}(x) = x \mathbin{\dot{-}} 1$ | [(0,),(1,),(2,)] | [0,0,1] | EBS(EP(1,1)) | No | 0.015626 |
| $\mathrm{pred}(x) = x \mathbin{\dot{-}} 1$ | [(0,),(1,),(2,),(3,)] | [0,0,1,2] | EBS(EBS(EBP(EP(1,1)))) | Yes | 0.015631 |
| $\mathrm{pred}(x) = x \mathbin{\dot{-}} 1$ | [(1,),(3,),(5,),(15,)] | [0,2,4,14] | EBS(EBS(EBP(EP(1,1)))) | Yes | 0.011996 |
| $\mathrm{pred}(x) = x \mathbin{\dot{-}} 1$ | [(1,),(3,),(5,),(15,),(260,)] | [0,2,4,14,259] | EBS(EBS(EBP(EP(1,1)))) | Yes | 2.74342 |
| $\mathrm{zero}(x) = 0$ | [(3,)] | [0] | EBP(EP(1,1)) | No | 0.062500 |
| $\mathrm{zero}(x) = 0$ | [(0,),(3,)] | [0,0] | EC(EBP(EP(1,1)),[ES()]) | Yes | 0.031248 |
| $f(x) = 2x$ | [(1,)] | [2] | ES() | No | 0.0 |
| $f(x) = 2x$ | [(1,),(2,)] | [2,4] | EC(ES(),[EBS(ES())]) | No | 0.0 |
| $f(x) = 2x$ | [(1,),(2,),(3,)] | [2,4,6] | EC(EA(),[EP(1,1),EP(1,1)]) | Yes | 0.015624 |
| $f(x,y) = (x + y) \mathbin{\dot{-}} 1$ | [(2,3)] | [4] | EC(ES(),[EP(2,2)]) | No | 0.0 |
| $f(x,y) = (x + y) \mathbin{\dot{-}} 1$ | [(2,3),(1,4)] | [4,4] | EBS(EBP(EBP(EP(2,1)))) | No | 0.0 |
| $f(x,y) = (x + y) \mathbin{\dot{-}} 1$ | [(2,3),(1,4),(3,1)] | [4,4,3] | EC(EBS(EBS(EBP(EP(1,1)))),[EA()]) | Yes | 0.093745 |
| $f(x,y) = (x + y)\,x$ | [(1,3)] | [4] | EA() | No | 0.0 |
| $f(x,y) = (x + y)\,x$ | [(1,3),(2,5)] | [4,14] | EC(ET(),[EA(),EP(2,2)]) | Yes | 0.031219 |
| $f(x,y) = (x + y)\,x$ | [(3,1),(4,1),(0,6),(2,4)] | [12,20,0,12] | EC(ET(),[EA(),EP(2,2)]) | Yes | 0.007997 |
| $f(x,y) = (x + y)\,x$ | [(2,4),(10,2),(3,15),(20,98),(50,120)] | [12,120,54,2360,8500] | EC(ET(),[EA(),EP(2,2)]) | Yes | 0.016994 |
| $\mathrm{sq}(x) = x^{2}$ | [(2,)] | [4] | EC(ES(),[ES()]) | No | 0.031249 |
| $\mathrm{sq}(x) = x^{2}$ | [(2,),(3,)] | [4,9] | EC(ET(),[EP(1,1),EP(1,1)]) | Yes | 0.046871 |
| $\exp(x,y) = x^{y}$ | [(2,3)] | [8] | EBP(EP(2,1)) | Yes | 0.005502 |
| $\exp(x,y) = x^{y}$ | [(3,5),(9,3),(15,2),(20,6)] | [243,729,225,64000000] | EBP(EP(2,1)) | Yes | 0.006994 |
| $\mathrm{fact}(x) = x!$ | [(2,)] | [2] | EP(1,1) | No | 0.0 |
| $\mathrm{fact}(x) = x!$ | [(2,),(3,)] | [2,6] | EBP(ES()) | Yes | 0.015625 |
| $\max(x,y)$ | [(3,4)] | [4] | EP(2,2) | No | 0.0 |
| $\max(x,y)$ | [(3,4),(5,2)] | [4,5] | EC(EA(),[EM(),EP(2,2)]) | Yes | 0.015625 |
| $\max(x,y)$ | [(15,20),(136,59),(420,767),(520,10)] | [20,136,767,520] | EC(EA(),[EM(),EP(2,2)]) | Yes | 0.021094 |
| $\min(x,y)$ | [(3,4)] | [3] | EP(2,1) | No | 0.004002 |
| $\min(x,y)$ | [(3,4),(5,2)] | [3,2] | EC(EM(),[EP(2,1),EM()]) | Yes | 0.007996 |
| $sg(x)$ | [(0,)] | [0] | EP(1,1) | No | 0.030000 |
| $sg(x)$ | [(0,),(2,)] | [0,1] | EBS(EP(1,1)) | No | 0.0 |
| $sg(x)$ | [(0,),(2,),(3,)] | [0,1,1] | EBS(EBP(EP(1,1))) | Yes | 0.0 |
| $sg(x)$ | [(0,),(2,),(3,),(5,),(500,)] | [0,1,1,1,1] | EBS(EBP(EP(1,1))) | Yes | 0.235158 |
| $sg(x)$ | [(0,),(2,),(3,),(5,),(5000,)] | [0,1,1,1,1] | EBS(EBP(EP(1,1))) | Yes | 62.4937 |
| $\overline{sg}(x)$ | [(0,)] | [1] | ES() | No | 0.000999 |
| $\overline{sg}(x)$ | [(0,),(1,)] | [1,0] | EBP(EP(1,1)) | Yes | 0.000997 |
| $\mathrm{dist}(x,y) = \lvert x - y \rvert$ | [(3,2)] | [1] | EM() | No | 0.003982 |
| $\mathrm{dist}(x,y) = \lvert x - y \rvert$ | [(3,2),(1,6)] | [1,5] | EBS(EBS(EBP(ET()))) | No | 0.015627 |
| $\mathrm{dist}(x,y) = \lvert x - y \rvert$ | [(3,2),(1,6),(2,5)] | [1,5,3] | stuck in EC(EBS(EBP(EBS(ES()))),[EBP(EA())]) | No | - |
| $\mathrm{dist}(x,y) = \lvert x - y \rvert$ | [(3,2),(1,6),(2,1)] | [1,5,1] | stuck | No | - |
| $\mathrm{dist}(x,y) = \lvert x - y \rvert$ | [(3,2),(2,1),(1,6)] | [1,1,5] | stuck | No | - |
| $\mathrm{dist}(x,y) = \lvert x - y \rvert$ | [(0,1),(0,2),(1,0),(1,1),(1,2),(2,0),(2,1),(2,2)] | [1,2,1,0,1,2,1,0] | EC(EA(),[EM(),EC(EM(),[EP(2,2),EP(2,1)])]) | Yes | 3.51511 |
| $\mathrm{half}(x) = \lfloor x/2 \rfloor$ | [(1,)] | [0] | EBS(EP(1,1)) | No | 0.005000 |
| $\mathrm{half}(x) = \lfloor x/2 \rfloor$ | [(1,),(3,)] | [0,1] | EBS(EBS(EP(1,1))) | No | 0.005836 |
| $\mathrm{half}(x) = \lfloor x/2 \rfloor$ | [(1,),(2,),(3,)] | [0,1,1] | EC(ED(),[ES(),EBS(ES())]) | Yes | 0.009613 |
| Ohm’s Law | [(1,),(3,),(4,),(6,)] | [3,9,12,18] | EC(EC(EA(),[EA(),EP(2,1)]),[EP(1,1),EP(1,1)]) | Yes | 0.078901 |
| Ohm’s Law | [(3,),(9,),(12,),(18,)] | [1,3,4,6] | stuck in EC(EBS(EBP(EP(1,1))),[EBP(ES())]) | - | - |
| Ohm’s Law | [(3,3),(9,3),(12,3),(18,3)] | [1,3,4,6] | stuck in EC(ED(),[EP(2,2),EP(2,1)]) | Yes | 0.012039 |
| Kepler’s Law | [(1,),(4,),(9,)] | [1,8,27] | stuck in EC(EC(EBP(ES()),[EBP(ES())]),[EBS(ES())]) | - | - |
| Gravitational Law | [(1,),(2,)] | [12,3] | EC(EC(EBS(EA()),[EBS(ES()),ES()]),[EC(ED(),[EP(1,1),ES()])]) | No | 473.198 |
| Gravitational Law | [(1,),(2,),(3,)] | [12,3,1] | EC(EBP(EBP(EA())),[ES(),EC(ED(),[EP(1,1),EC(ES(),[ES()])])]) | No | 1827.86 |
| Gravitational Law | [(1,),(2,),(3,),(4,)] | [12,3,1,0] | stuck in EC(EC(EBS(ED()),[EP(1,1),EBP(EBP(ES()))]),[EC(ET(),[ES(),ES()])]) | - | - |

Table A.3: Summary of the experiments made by the scientist related to the third algorithm


Appendix B

Implementation of the Algorithms

We now explain the implementation of the algorithms related to the developed scientists and the Python files in which that implementation is made, available at https://github.com/brunomcpatricio/AutomatedScientists.

In folder PRF, the symbols Z, S, P, C and R (i.e. the symbols used in Definition 3.2.1 to define the inductive construction of a description for the primitive recursive functions) are implemented as Python classes in file classesprf.py. This file is used to implement the enumeration procedure described in Algorithm 7, realized in file myenum1.py, which is in turn used by the search procedure of Algorithm 6, implemented in file search1.py. The latter also implements the procedure that writes the code of a Python program computing the function described by a given description and returns it in a .txt file. The file classesprf.py is also needed for the second enumeration procedure for primitive recursive functions (Algorithm 9), implemented in file myenum2.py, which is in turn needed for the search procedure of Algorithm 8, implemented in file search2.py. This file also implements the procedure that writes the code of a Python program from a given description and returns it in a .txt file.
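As an illustration of how descriptions built from Z, S, P, C and R can be evaluated, the sketch below interprets them directly. It is not the code of classesprf.py: the tuple encoding and the name `evaluate` are hypothetical, and the recursion convention, R(g,h)(x, 0) = g(x) and R(g,h)(x, y+1) = h(x, y, R(g,h)(x, y)) with the recursion on the last argument, is an assumption inferred from the descriptions reported in Table A.1.

```python
def evaluate(desc, args):
    """Evaluate a description, given as nested tuples, on a list of naturals."""
    tag = desc[0]
    if tag == "Z":                      # ("Z",): constant zero
        return 0
    if tag == "S":                      # ("S",): successor
        return args[0] + 1
    if tag == "P":                      # ("P", n, i): i-th of n arguments
        return args[desc[2] - 1]
    if tag == "C":                      # ("C", f, [g1, ..., gm]): composition
        return evaluate(desc[1], [evaluate(g, args) for g in desc[2]])
    if tag == "R":                      # ("R", g, h): primitive recursion
        *params, n = args               # recursion runs on the last argument
        acc = evaluate(desc[1], params)
        for k in range(n):              # each R becomes one for-loop
            acc = evaluate(desc[2], params + [k, acc])
        return acc
    raise ValueError(f"unknown symbol: {tag!r}")


# Descriptions taken from Table A.1, written in the hypothetical tuple encoding.
PRED = ("R", ("Z",), ("P", 2, 1))
ZERO = ("R", ("Z",), ("P", 2, 2))
ADD = ("R", ("P", 1, 1), ("C", ("S",), [("P", 3, 3)]))
SUB = ("R", ("P", 1, 1), ("C", PRED, [("P", 3, 3)]))
PROD = ("R", ZERO, ("C", ADD, [("P", 3, 1), ("P", 3, 3)]))
```

Under this convention, nested R symbols turn into nested for-loops, which is consistent with the cost of the product description in Table A.1: every step of the outer recursion re-runs the loop for addition.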

Folder El is reserved for the files needed to implement the scientist for the set of elementary functions. The symbols that make up the set of inductive rules used in Definition 4.2.1 to construct a description for the elementary functions are implemented as classes in file classesel.py. The enumeration of the set of functions E described in Algorithm 13 is implemented in file myenumel.py, while the search procedure of Algorithm 12 is implemented in searchel.py, which also implements the procedure that receives a description, writes the code of a Python program computing the function it describes, and returns it in a .txt file.

Lastly, there is also a README.txt file in order for the user to know how to work with the software.
