Automated Search of Functions and Synthesis of Code
Bruno Miguel Carrajola Patrício
Thesis to obtain the Master of Science Degree in
Mathematics and Applications
Supervisor: Prof. José Félix Gomes da Costa
Examination Committee
Chairperson: Prof. Maria Cristina De Sales Viana Serôdio Sernadas
Supervisor: Prof. José Félix Gomes da Costa
Member of the Committee: Prof. Maria Paula Antunes Abrantes Gouveia
December 2019
Acknowledgments
First and foremost, I must start by thanking my family, especially my parents. Without their
support, patience and sacrifice not one step of my academic path could have happened, let alone this
dissertation.
I would also like to thank my supervisor, Prof. José Félix da Costa, whose patience and constant
availability to help made this dissertation possible.
And to all my friends. Whether you have directly accompanied me along this academic journey and
shared with me blood, sweat and tears throughout the last five years (or even just a small part of it)
or you have not and had to deal with everything that I went through, most of the time without even
understanding a word of what I was saying, there are no words that can express how much I thank
you.
Resumo
The process of scientific discovery can be explained as a cycle that begins with the observation
of the facts that surround us, models those observations into theories, makes predictions from those theories
and then confronts those predictions with further observations, reinforcing or refuting the theories.
Most of the time, the weak link in this chain of events is the inductive step from
concrete facts to general theories, because it is not always easy for scientists to find
these correlations. In this work we propose a solution to that problem: what if it were possible
to automate this step and let computers perform it? This can be achieved if the
empirical laws that scientists try to find are not only computable but also structurally
simple. We explore this claim by relating these laws to the set of primitive recursive
functions (and later to one of its subsets, the elementary functions), which allows us to present
relatively simple automated scientists that would be an initial answer to this problem and a
starting point for a more complete and serious solution to the automation of the inductive
inference process.
Keywords: Scientific Discovery, Empirical Laws, Automated Scientists, Primitive Recursive
Functions, Elementary Functions, Code Generation.
Abstract
The process of scientific discovery can be explained as a cycle that starts with observing facts that
surround us, models those observations into theories, makes predictions from those theories and then
confronts them with other observations, reinforcing or disproving those theories. Most of the time, the
weak link of this chain of events is the inductive step from concrete observed facts to general theories
because it is not always easy for scientists to find these correlations. In this work, we propose a solution
to that problem: what if we can automate that step and allow computers to do it? This can be achieved
if the empirical laws that scientists try to find are not only computable, but also structurally simple. We
explore this statement by relating these laws with the set of the primitive recursive functions (and later on
with a subset of it, the elementary functions), allowing us to present relatively simple automatic scientists
that would be an early response to this problem and a starting point into a more serious and complete
solution for the automation of the inductive inference process.
Keywords: Scientific Discovery, Empirical Laws, Automated Scientists, Primitive Recursive
Functions, Elementary Functions, Code Generation.
Contents
Acknowledgments
Resumo
Abstract
List of Tables
List of Figures
List of Algorithms
Notation
1 Introduction
2 Learning Theory
2.1 Computability
2.2 Scientific methods
3 The Search Procedure
3.1 Primitive Recursive Functions
3.2 Notation for identification
3.3 The search algorithm
3.4 A first enumeration
3.5 An improved enumeration
3.6 From description to code
4 A Restriction to E
4.1 Elementary functions
4.2 Notation for representation
4.3 The search algorithm
4.4 Enumeration
4.5 From description to code
5 Results
5.1 First algorithm
5.2 Second algorithm
5.3 Third algorithm
5.4 Analysis
6 Conclusions
6.1 Achievements
6.2 Future Work
Bibliography
A Functions tested and summarized results
A.1 List of functions tested with the scientists
A.2 Summarized results
B Implementation of the Algorithms
List of Tables
3.1 Primitive recursive functions and their corresponding descriptions
4.1 Elementary functions and their corresponding descriptions
A.1 Summary of the experiments made by the scientist related to the first algorithm
A.2 Summary of the experiments made by the scientist related to the second algorithm
A.3 Summary of the experiments made by the scientist related to the third algorithm
List of Figures
1.1 Common mathematical puzzle seen several times over social media
1.2 Diagram representing the scientific discovery process
3.1 List of all the functions whose descriptions have the referred size
3.2 Lists of descriptions with the referred size and arities
3.3 Code for function with description Z()
3.4 Code for function with description S()
3.5 Code for function with description P(3,1)
3.6 Code for function with description C(P(3,2),[P(1,1),S(),S()])
3.7 Code for function with description R(Z(),P(2,1))
4.1 Lists of descriptions for elementary functions with the referred size and arities
4.2 Code for function with description EA()
4.3 Code for function with description EM()
4.4 Code for function with description ET()
4.5 Code for function with description ED()
4.6 Code for function with description EBS(ES())
4.7 Code for function with description EBP(ES())
5.1 Results for the identity function searched by the first scientist
5.2 Results for the successor function after the projection of the first argument searched by the first scientist
5.3 Results for the second attempt for the addition function searched by the first scientist
5.4 Results for the second attempt for the subtraction function searched by the first scientist
5.5 Results for the repetition of the second attempt for the subtraction function searched by the first scientist
5.6 Results for the second attempt for the product function searched by the first scientist
5.7 Results for the second attempt for the product function searched by the second scientist
5.8 Results for the third attempt for the function f(x, y) = (x + y) ∸ 1 searched by the second scientist
5.9 Results of the fourth attempt for the function pred(x) = x ∸ 1 searched by the third scientist
5.10 Results for the second attempt for the function f(x, y) = (x + y)x searched by the third scientist
5.11 Results regarding Ohm’s Law searched by the third scientist
List of Algorithms
1 Procedure to construct a text for ψ ∈ SD out of a scientist for AEZ
2 Recursive operator Φ used to prove the separation between Ex⋆ and Bc
3 Recursive operator Φ used to prove the separation between Bcn and Bcn+1
4 Function that for a given prefix σ for a function ψ and a value x ∈ N outputs the value that the scientist thinks the function ψ has when applied to x
5 Scientist that Ex-identifies PRIM
6 Search algorithm for a primitive recursive function given the input/output values
7 Construction of a function list composed by functions with a given description size
8 Search algorithm for a primitive recursive function given the input/output values having into account the arity
9 Construction of a function list composed by functions with a given description size and arity
10 Function that indicates if a description is already in a list of descriptions
11 Scientist that Ex-identifies E
12 Search algorithm for an elementary function given the input/output values having into account the arity
13 Construction of a function list composed by elementary functions with a given description size and arity
Notation
Notation for undefined
The concatenation operation.
∸ The subtraction operation defined over the natural numbers, i.e. x ∸ y = max(x − y, 0).
ψ A recursive function.
χC The characteristic map of the set C.
χpC The characteristic function of the set C.
P A computer program.
e The computer program with code e.
We The recursively enumerable set with code e.
R The set of recursive functions.
PRIM The set of the primitive recursive functions.
E The set of the elementary functions.
Pn,i The n-ary projection function that returns the i-th element of the input tuple.
φ An enumeration of the recursive functions.
φ(e), φe The recursive function with code e.
ρ An enumeration of the primitive recursive functions.
ρ(e), ρe The primitive recursive function with code e.
π An enumeration of the elementary functions.
π(e), πe The elementary function with code e.
T A text for a generic function ψ.
T (n), Tn The n + 1-th element of text T .
T [n] The n first elements of text T , i.e. T0, . . . , Tn−1.
T The set of all texts T .
σ A prefix of a text T .
σ(k), σk The k + 1-th element of the prefix σ.
content(σ) The set of pairs ⟨n,ψ(n)⟩ in σ.
σ A partial function that, for given n, returns m such that ⟨n,m⟩ is in prefix σ.
SEG The set of all prefixes for all texts.
INIT The set of the prefixes for texts in the canonical order.
M A scientist for functions.
=n A relation between two functions such that they differ in n points.
=⋆ A relation between two functions such that they differ in finitely many points.
Ex The class of Ex-identifiable sets of functions.
Exn The class of Exn-identifiable sets of functions.
Ex⋆ The class of Ex⋆-identifiable sets of functions.
Bc The class of Bc-identifiable sets of functions.
Bcn The class of Bcn-identifiable sets of functions.
Bc⋆ The class of Bc⋆-identifiable sets of functions.
m(i) ≤x Output of applying the program with code m to input i when it halts within x steps of
computation.
Chapter 1
Introduction
The process of scientific discovery has been present at every scientific breakthrough, be it big
or small. It is this process that, starting from observable facts, allows mankind to express the laws
that govern nature as theories that explain them. These theories, when worked upon, produce
predictions about what will happen in the future, which can then be used either to reinforce the
theories with which we began or to refute them, making room for different and more accurate theories
to appear. Scientific discovery is thus a never-ending cycle of inducing theories from
data, deducing predictions from theories and validating or rejecting those theories by checking
whether the predictions hold.
Even though this process is cyclical, it has a beginning: the inductive step, which consists of
identifying patterns in information collected from observations. This induction process is
constantly present in our daily life: for example, if our shirt has a stain, we can only guess
(that is, infer) what the stain might be from its colour, texture and/or smell (and, with that
theory, predict the best way to clean it); and if our car has a dent in the morning, all we can do
is imagine (i.e. infer) the possible scenarios that could have taken place overnight by analysing
the dent. Sometimes we are even presented with more “mathematical” situations. When browsing social
media it is common to come across a mathematical puzzle like the one in Figure 1.1. From this image
we retain the following data: three apples sum to 30, one apple plus eight bananas gives 18, and
four bananas minus two coconuts gives 2. This is the data we use to identify patterns and formulate
our theory, which is the following: one apple equals 10, one banana equals 1 and one coconut equals 1.
With this theory, we make our prediction for the operation in the fourth line: one coconut plus one
apple plus three bananas sums to 14. In this case, since we have no more information, we cannot
verify whether our prediction is correct (as a matter of fact, this problem reduces to a system of
three linear equations in three variables, which here has exactly one solution. This does not
mean that this is the end; for example, if a new operation were added with more fruits, i.e. variables,
we would need to change our theory in order to include possible values for them).
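The inductive step in the puzzle can be mimicked by a very small brute-force search. The sketch below is only an illustration of the idea, not one of the scientists developed in this work: the encoding of each line of the puzzle as a tuple of coefficients, the function names and the search bound of 30 are all assumptions made for this example.

```python
from itertools import product

# Observations read off the puzzle, written as (coefficients, result)
# for the unknowns (apple, banana, coconut). This encoding is an assumption.
observations = [
    ((3, 0, 0), 30),   # three apples sum to 30
    ((1, 8, 0), 18),   # one apple plus eight bananas gives 18
    ((0, 4, -2), 2),   # four bananas minus two coconuts gives 2
]

def consistent(theory, data):
    """Return True if the candidate values reproduce every observation."""
    return all(
        sum(c * v for c, v in zip(coeffs, theory)) == result
        for coeffs, result in data
    )

def induce(data, bound=30):
    """Brute-force search over all natural-valued theories up to `bound`."""
    return [t for t in product(range(bound + 1), repeat=3) if consistent(t, data)]

theories = induce(observations)
apple, banana, coconut = theories[0]

# Deductive step: predict the fourth line (coconut + apple + three bananas).
prediction = coconut + apple + 3 * banana
print(theories, prediction)
```

Only one theory survives the three observations, which is why the prediction for the fourth line is unambiguous here; a new observation involving a fourth fruit would force the search space, and possibly the theory, to change.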
Figure 1.1: Common mathematical puzzle seen several times over social media
Figure 1.2: Diagram representing the scientific discovery process
This apparently simple process has been followed by scientists in order to discover theories that
explain what surrounds us, at both the microscopic and the macroscopic level. To do so, they rely on a
very useful tool that allows them to write what they find in a clear and objective way: Mathematics.
Mathematics is a universal language in which scientific theories are formalized in order to be understood
and worked upon. In fact, Mathematics and mathematical propositions do not need facts to be tested
or validated; pure reason suffices (see [22]). That property is the greatest advantage of Mathematics
from a scientific point of view: since it does not need facts, scientists can work with it beyond the
empirical data observed and thus make predictions that will once again fall within the realm of the
observable. Figure 1.2 shows a scheme from [22] that illustrates this process. The goal of a scientist
is then to find a theory that will not change through the course of this cyclical process, i.e. a theory
that will not be refuted by any upcoming observable facts, only reinforced by them (such a theory is
often called a Theory of Everything or Unified Field Theory). This is obviously an extremely difficult
and maybe even impossible task, since it is probably not possible to observe every fact in the universe
and, even for the facts that we can observe, it is not obvious how some of them are related. This means
that usually the best a scientist can do is describe conjectural theories that explain at least a part
of the observable universe, in the hope that more universal theories will be formalized in the future.
A good example of this (seen in detail in [22]) is Newton’s Laws of Physics: they are extremely useful
to explain the physical motion of macroscopic bodies; moreover, from them we can deduce other more
specific theories, like Kepler’s Laws of planetary motion, Galileo’s Laws of Motion or the Law of Tides.
However, they cannot predict or explain the motion of light rays. This means that Newton’s Laws can be
considered an unchangeable theory for a scientific discovery process concerned only with the motion of
macroscopic bodies, but they are not a universally valid theory for Science.
Until recently, the scientific discovery process was more a question of philosophical reflection than of
scientific action: scientists did it without thinking about it, philosophers thought about it without putting
it into action. However, that changed a few years ago when some scientists started to study this process
with a scientific disposition, trying to formalize it through a clear set of rules. They realized that if this
is possible, then scientific discovery could be algorithmically structured and we would be able to
achieve scientific advances and find theories much faster while obtaining much more complex results. If
this formalization is possible, then we need to understand whether machines can learn these processes
and use them to discover scientific laws themselves. However, this will only be possible if natural
laws are computable (as defended by Kelly in [21] and Szudzik in [38] with the computable universe
hypothesis) and, beyond that, are simple enough to be discovered by these kinds of processes. This
would mean that these laws would have to be algorithmic themselves and, consequently, the expected
behaviour of the world, if not its exact behaviour, would also have to be algorithmic (even if infeasibly
computable). A brief discussion of this problem can be found in [6]. One example was given
by Gold in [16], where, through the analysis of children learning a language, he presented, in his own
words, a construction of a “precise model for the intuitive notion ‘able to speak a language’ in
order to be able to investigate theoretically how it can be achieved artificially”. A more practical and
more recent example can be seen in [17], where the learning of arithmetic is studied, formalized and
then taught to a neural network.
In [25] we can observe the construction of considerably general systems that are capable of achieving
significant scientific discoveries, for example the BACON programs. In this book, we can see that
mathematical expressions that translate some very important and groundbreaking scientific laws were
learned using these systems, like the ideal-gas law (which relates the pressure P and the volume V
of an ideal gas with n moles at temperature T in Kelvin: PV = nRT, where R is the ideal gas
constant, which is the same for all ideal gases), the law of gravitation (which states that the gravitational
force between two objects is directly proportional to the product of the masses of both objects and
inversely proportional to the square of the distance between them: F = G m1 m2 / d², where G is the
gravitational constant) or Kepler’s third law (stating that the cube of the distance of a planet to the sun is
directly proportional to the square of the period of the planet’s orbit: D³/P² = k). This means that
these laws, presented to us as core laws of the theories that allow us to understand the workings of
the universe, are actually algebraically simple enough to be learned by a program with simple heuristics.
In fact, these laws are of the algebraic form x^a y^b z^c ⋯ = const ∈ ℝ, with a, b, c, … ∈ ℤ, or some sort
of linear combination of such terms. Even laws with trigonometric functions can be learned by these
procedures by using the trigonometric relations in a right triangle (for example, Snell’s law of optics,
which relates the sines of the incidence and refraction angles θ1, θ2 to the indices of refraction n1, n2
of the two media in question, n1 sin θ1 = n2 sin θ2, was learned using the definition of the sine of an
angle in a right triangle as the length of the opposite leg over the length of the hypotenuse).
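To see how little machinery a law of the form x^a y^b = const requires, consider the following minimal sketch. It is not the heuristic search used by the BACON programs; it is a plain exhaustive search over small integer exponents, written only to illustrate the algebraic form above. The data points are synthetic values chosen so that D³ = P² holds exactly (they are not real measurements), and the function name and exponent bound are assumptions.

```python
from fractions import Fraction
from itertools import product

# Synthetic, illustrative (distance D, period P) pairs, NOT real measurements.
# They are chosen so that D**3 == P**2 holds exactly.
data = [(1, 1), (4, 8), (9, 27)]

def constant_products(samples, max_exp=3):
    """Return exponent pairs (a, b), not both zero, for which
    D**a * P**b takes the same value on every sample."""
    found = []
    for a, b in product(range(-max_exp, max_exp + 1), repeat=2):
        if a == 0 and b == 0:
            continue  # the trivial product is constant for any data
        # Exact rational arithmetic avoids floating-point comparisons.
        values = {Fraction(d) ** a * Fraction(p) ** b for d, p in samples}
        if len(values) == 1:
            found.append((a, b))
    return found

laws = constant_products(data)
print(laws)
```

On this data the search returns the exponent pairs (3, −2) and (−3, 2), i.e. D³/P² and its reciprocal, which is the form of Kepler's law quoted above.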
In more recent years there have been considerable advances in this field, one of the most
important being the development of the robot scientists ADAM and EVE. The first was able to make
a scientific discovery about genetic encoding in the yeast Saccharomyces cerevisiae entirely on its own,
including the formulation of a hypothesis and its subsequent verification, while the second performs
experiments in chemical genetics and drug design, having already found a possible relation between
triclosan, a common toothpaste ingredient, and the fight against malaria (see [3], [23] and [37]). Another
idea that shows how far scientific discovery has advanced is the dream of a scientist able to make,
all by itself, discoveries so advanced that they would be worthy of a Nobel Prize (see [24]).
In this dissertation, we propose to develop a scientist that can identify the mathematical
expressions that explain the natural laws that surround us and return each such expression as
a computer program written in Python. At first we might be tempted to think that the
ideal way to develop such a scientist is to design an algorithm that works upon the entire class
of recursive functions, taken as real functions of a real variable. However, we
only need to consider recursive functions over the natural numbers, since real numbers are not
directly measurable in nature and it is possible to transpose rational data to natural data, at least in
computable models (see [38]). Moreover, we acknowledge the simplicity of natural laws (as seen above),
which we use to our advantage by developing a scientist only for the learning of primitive recursive
functions (we do this because we believe that the functions outside the class of primitive recursive
functions are too complex to explain natural laws, and also because it is not possible to identify the
whole class of recursive functions by a brute-force algorithm while it is possible for the primitive
recursive ones, as seen in [9] and explored further ahead in our work, which simplifies the construction
of our scientist).
To construct said scientist there is a need to learn as much as we can about Learning Theory (using
notions of computer science, since we will be working on the basis of the computable universe hypothe-
sis), so that we can understand the correct way for executing this construction and the possible learning
capabilities of the scientist.
Next, we need to study the primitive recursive functions to learn how we can enumerate them. To
do so, we will define them through their descriptions and use the number of symbols in each description
to perform the enumeration. This is a very complex problem; addressing it completely would require
an extensive work devoted solely to this subject (for example, the doctoral
thesis by Rogério Reis on the enumeration of automata in [33]). We will therefore
dwell on this problem only as far as needed to carry out the work we propose. The
last step regarding the primitive recursive functions is to generate the code of
a function from its description. Since the functions with which we are working are only the primitive
recursive ones, we know that the function in question can be encoded by a program whose loops are
restricted to nested and/or sequential for-loops (see [30]).
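As a rough illustration of what "restricted to for-loops" means, the hand-written sketch below computes addition, multiplication and truncated subtraction using only loops whose number of iterations is fixed before the loop starts. It is not the code generated from descriptions later in this work; the function names are assumptions chosen for the example.

```python
def successor(x):
    """The basic successor function S(x) = x + 1."""
    return x + 1

def add(x, y):
    """Addition by recursion on y: add(x, 0) = x, add(x, y+1) = S(add(x, y))."""
    result = x
    for _ in range(y):            # bounded loop: exactly y iterations
        result = successor(result)
    return result

def multiply(x, y):
    """Multiplication by recursion on y, with addition as the step function."""
    result = 0
    for _ in range(y):            # outer bounded loop
        result = add(result, x)   # runs the inner bounded loop of `add`
    return result

def monus(x, y):
    """Truncated subtraction x ∸ y = max(x - y, 0), by iterating the predecessor."""
    result = x
    for _ in range(y):
        result = result - 1 if result > 0 else 0
    return result

print(add(3, 4), multiply(3, 4), monus(3, 5), monus(5, 3))
```

No `while` loop or unbounded search appears anywhere, so each of these programs is guaranteed to halt on every input.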
Moreover, we will perform the same work done with the primitive recursive functions for a subset of
this class: the elementary functions. We do so, because we also believe that this subset of functions is
enough to express the natural laws we are trying to identify.
Regarding the outline of this dissertation, Chapter 2 presents the theoretical knowledge about
Learning Theory that we need in order to construct the scientist. In Chapter 3 we study
the primitive recursive functions and present a form of identification, two ways of listing this set and,
for each of them, a search procedure that will be the basis for the construction of a scientist; lastly, we
present a way of transforming a description of a primitive recursive function into code written in
Python. Chapter 4 is reserved for the study of the elementary functions: a method to describe them,
their enumeration, a search procedure constructed for finding these functions and a description of how to
transform one into a program written in Python. Lastly, in Chapter 5 we present and discuss the results
achieved by testing our scientist, while in Chapter 6 we draw the conclusions
of our work.
Chapter 2
Learning Theory
We can define Learning Theory as “the study of systems that map evidence into hypotheses” [32]. The
main goal of Learning Theory is to find the circumstances under which these hypotheses
stabilize to an accurate representation of the environment from which the evidence is drawn, in which
case learning is said to be successful.
The following concepts and definitions come from Osherson et al. [32], on whom the paper [16] had
a big influence. It is assumed that learning involves the following four concepts:
1. A learner, or scientist;
2. A subject to be learned;
3. An environment, in which the thing to be learned is exhibited to the learner;
4. The conjecture that occurs to the learner about the subject to be learned on the basis of the
environment.
A learning paradigm is a specification of these four concepts. This means that Learning Theory
can also be defined as the study of learning paradigms. One of these is the identification of recursive
functions by a scientist; in a more concrete way, the problem of understanding which recursive functions
can be identified by which scientists and under which conditions that identification is made. It is through
this learning paradigm that we will try to identify the empirical laws, which we will do under
the computationalist hypothesis (see Kelly in [21] and Case in [6]), where we assume that both the
empirical laws and scientific methods are recursive relations. This means that it is reasonable to accept
that a law written in standard fashion (i.e. as an algebraic expression) and a computer program are
interchangeable and so we can conclude that the identification of computable functions is a way to
identify those empirical laws. Thus, by understanding how to identify recursive functions we will be
understanding also how we can be able to discover the empirical laws. That problem will be solved
by attacking the computational limits of what is learnable by a scientist and the rigidity of the learning
criteria of said scientist within this paradigm.
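The four concepts of a learning paradigm can be made concrete with a toy example. The sketch below is only an illustration under strong assumptions: the hypothesis space is a short, hand-picked list of total functions (the name `HYPOTHESES` and its contents are hypothetical), the environment is a finite list of input/output pairs, and the scientist simply conjectures the first hypothesis consistent with the evidence seen so far.

```python
# Hypothetical hypothesis space: a short, hand-picked list of total functions.
# This is an assumption made for illustration only.
HYPOTHESES = [
    ("zero", lambda n: 0),
    ("identity", lambda n: n),
    ("successor", lambda n: n + 1),
    ("double", lambda n: 2 * n),
    ("square", lambda n: n * n),
]

def scientist(prefix):
    """Map a finite list of (input, output) pairs to the name of the first
    hypothesis consistent with all of them, or None if there is none."""
    for name, f in HYPOTHESES:
        if all(f(n) == m for n, m in prefix):
            return name
    return None

# The environment reveals the graph of the subject to be learned pair by pair.
text = [(0, 0), (1, 1), (2, 4), (3, 9)]

# One conjecture for each finite amount of evidence.
conjectures = [scientist(text[:k]) for k in range(len(text) + 1)]
print(conjectures)
```

The sequence of conjectures changes while the evidence still admits simpler hypotheses and then stabilizes on "square"; this stabilization on a correct hypothesis is the informal picture of successful learning described above.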
2.1 Computability
Since our study will fall under the identification of (computable) functions, we first need to recall some
computability theory notions that can be found in [10], [11] and [35].
Generally we can encode any abstract objects into natural numbers through the concept of Gödelization.
Definition 2.1.1. Gödelization
Let W be a set. A computable¹ one-to-one total function g ∶ W → N is called a Gödelization if:
a) g(W) is a decidable set in N.
b) g−1 ∶ g(W)→W is also a computable total function.
A Gödelization can be defined for sets, lists of numbers, finite graphs, etc., which means that all of these structures can be provided as input to a program P that receives only natural numbers as input. It is even possible to provide to a program P a natural number n that is first decoded into a number m, the code of another program P′, and a number k, such that P with input n returns the output of P′ for input k. A wide variety of objects, including tuples of numbers, can thus be encoded as natural numbers and provided as input to a program that receives natural numbers. This means that we can use the unary functions as notation for the entire set of functions.
Furthermore, the programs that compute these functions, for a given input, can either halt at some point or run forever. However, there is no program H that receives a number n, decodes it into a program P of code m and a number k, and outputs 1 if P halts with input k and 0 otherwise (undecidability of the halting problem). This means that we cannot, in general, know whether a program, for a given input, will run forever or just needs more time to terminate the computation, which makes it difficult to determine whether a program computes a partial or a total function, concepts we define below.
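As a concrete illustration of the encoding idea above, the following is a minimal sketch of a Gödelization of pairs and finite lists of naturals. The Cantor pairing function and the names `pair`, `unpair`, `encode_list` and `decode_list` are choices made for this example only; they are not taken from the references.

```python
from math import isqrt


def pair(m: int, k: int) -> int:
    """Cantor pairing: a computable bijection from N x N onto N."""
    return (m + k) * (m + k + 1) // 2 + k


def unpair(n: int) -> tuple[int, int]:
    """Inverse of pair; also total and computable."""
    w = (isqrt(8 * n + 1) - 1) // 2      # index of the diagonal that contains n
    k = n - w * (w + 1) // 2
    return w - k, k


def encode_list(xs: list[int]) -> int:
    """Code of a finite list of naturals; 0 is the code of the empty list."""
    code = 0
    for x in reversed(xs):
        code = pair(x, code) + 1
    return code


def decode_list(code: int) -> list[int]:
    """Recover the list from its code (the inverse of encode_list)."""
    xs = []
    while code != 0:
        x, code = unpair(code - 1)
        xs.append(x)
    return xs
```

Under this encoding every natural number is the code of exactly one list, so the image of the encoding is trivially decidable, and a program that expects a single natural number can receive a pair (m, k) as `pair(m, k)`.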
Definition 2.1.2. Partial Recursive Function
A partial recursive function ψ can be defined by its graph, i.e. the set of input/output pairs (n, ψ(n)) such that if P is a program that computes ψ then for all values of n on which P halts, it returns ψ(n). For the values on which P does not halt, we say ψ is undefined. If e is the code of the program P in question, then ψ will be denoted as φe and e will denote P.
Definition 2.1.3. Domain of a partial recursive function
Let ψ be a partial recursive function, computable by a program P. Then the set of numbers n such that there is a pair (n, ψ(n)) in the graph of ψ is called the domain of ψ. In other words, the domain of ψ is the set of numbers on which P halts.
Definition 2.1.4. (Total) Recursive function
A function ψ is recursive if it has as domain the set of natural numbers N, i.e. is total; in other words,
the graph (n,ψ(n)) that defines ψ has an element for each value n ∈ N. If P is a program that computes
ψ then for all values of n, P halts and it returns ψ(n). The set of all recursive functions is denoted by R.
¹ In this definition, we understand the concept of computable function in the Church-Turing sense.
Definition 2.1.5. Characteristic map and characteristic function
Let C be a subset of N. Then the characteristic map of C, χC ∶ N → N, is the total function defined as
$$\chi_C(x) = \begin{cases} 1, & x \in C \\ 0, & x \notin C \end{cases}$$
The characteristic function of C, χ^p_C, is the partial function on N with domain C defined as
$$\chi^{p}_{C}(x) = \begin{cases} 1, & x \in C \\ \text{undefined}, & x \notin C \end{cases}$$
Definition 2.1.6. Recursively Enumerable Set and Recursive Set
A set S is said to be recursively enumerable if there is a program P that computes its characteristic
function. A set S is said to be recursive if there exists a program P that computes its characteristic map.
In either case, if e is the code for P , then we can denote S as We.
There are a few situations where it will be useful to consider not only the unary programs for unary
functions but also the programs for functions ψ that explicitly receive two input numbers n,m ∈ N and
return the value for ψ(n,m). For those situations, the following theorems are of special importance.
Theorem 2.1.1. s-1-1 theorem for binary functions
For any binary partial recursive function ψ, there is a computable total function g such that, for all m, n ∈ N, ψ(m, n) = φg(m)(n). This means that for any arbitrary m, g(m) is the code of the unary function that maps n to ψ(m, n).
Theorem 2.1.2. Kleene’s Theorem
For any binary partial recursive function ψ there is a number e ∈ N such that {e}(x) = ψ(e, x) for all x ∈ N. In other words, φe(x) = ψ(e, x).
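The s-1-1 theorem can be made tangible with a toy model in which the code of a program is simply its source text. The sketch below is an analogy under that assumption, not a proof: `s11`, `run` and `PSI` are hypothetical names introduced here, and Python source strings stand in for the natural-number codes of the text.

```python
def s11(psi_source: str, m: int) -> str:
    """Return the code (here: source text) of the unary function n -> psi(m, n).

    The construction is pure text manipulation: psi is never executed,
    which is why the map m -> s11(psi_source, m) is total and computable.
    """
    return f"{psi_source}\ndef phi(n):\n    return psi({m!r}, n)\n"


def run(code: str, n: int) -> int:
    """Toy universal program: execute the source, then call the function phi."""
    env: dict = {}
    exec(code, env)
    return env["phi"](n)


# Source of an example binary function psi(m, n) = m * n + 1.
PSI = "def psi(m, n):\n    return m * n + 1\n"
```

For instance, `run(s11(PSI, 3), 5)` evaluates ψ(3, 5) through the specialised code. Kleene's Theorem goes one step further and guarantees a code e that can refer to itself; that self-reference is not modelled in this sketch.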
2.2 Scientific methods
We will now begin defining important concepts of Learning Theory, present in [9], [11], [16] and [32].
Definition 2.2.1. Text for a function
A text T for a function ψ is a total function T ∶ N → N² such that for every a, b ∈ N, (a, b) ∈ range(T) ⇔ ψ(a) = b.
The set of all texts for functions is denoted by 𝒯. Tn denotes the pair T(n), while T[n] denotes the sequence of pairs T0 … Tn−1. A text allows repetitions and is sensitive to the order of its pairs, which means that there is an uncountable number of texts for a function ψ. A text is thus a function whose domain serves to give an order to the pairs contained in its range.
Definition 2.2.2. Text in canonical form
Let T be a text for a function ψ. T is said to be in the canonical form if T (i) = (i, ψ(i)) for any i ∈ N.
Definition 2.2.3. SEG = {T[n] ∶ T ∈ 𝒯, n ∈ N} is called the set of prefixes of recursive functions. INIT ⊂ SEG is the subset of prefixes of texts in canonical form.
Let σ be an element of SEG. Then, content(σ) provides the set of pairs in σ. This sequence can be seen as a partial function, also denoted by σ and defined as
$$\sigma(m) = \begin{cases} n, & \text{if } (m,n) \in \mathrm{content}(\sigma) \\ \text{undefined}, & \text{otherwise} \end{cases}$$
Since σ is a prefix of a text for a function, content(σ) cannot contain two pairs (m, n1) and (m, n2) with n1 ≠ n2, because no text for a function can have such pairs in its range, and so the partial function σ is well defined.
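The notions of prefix, content and prefix-as-partial-function can be sketched as follows. The representation of a prefix as a tuple of pairs, the use of `None` for "undefined", and all function names are assumptions made for this illustration.

```python
from typing import Callable, Optional

Pair = tuple[int, int]
Prefix = tuple[Pair, ...]            # an element of SEG


def canonical_prefix(psi: Callable[[int], int], n: int) -> Prefix:
    """T[n] for the canonical text of psi: (0, psi(0)), ..., (n-1, psi(n-1))."""
    return tuple((i, psi(i)) for i in range(n))


def content(sigma: Prefix) -> set[Pair]:
    """The set of pairs occurring in sigma; order and repetitions are forgotten."""
    return set(sigma)


def as_partial_function(sigma: Prefix, m: int) -> Optional[int]:
    """sigma(m): the value paired with m, or None where sigma is undefined."""
    for a, b in sigma:
        if a == m:
            return b
    return None
```

A non-canonical prefix such as `((2, 4), (0, 0), (2, 4))` is still a valid element of SEG for the squaring function: it repeats a pair and skips the argument 1, on which the associated partial function is undefined.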
Definition 2.2.4. Scientist
A scientist for functions M is itself a function M ∶ SEG → N.
The essential feature of a scientist is that it turns finite information into a hypothesis that covers
infinite values.
Definition 2.2.5. Convergence of scientist
A scientist for functions M converges to i ∈ N on a text T for a function if there exists p ∈ N such that, for all t > p, M(T[t]) = i.
Definition 2.2.6. Ex-identification of functions
A scientist for functions M Ex-identifies a function ψ for a text T if, when provided text T, it converges to a conjecture i ∈ N such that φi = ψ.
A scientist for functions M Ex-identifies a function ψ if, for every text T for ψ, it converges to a conjecture that is a code for ψ, i.e. for any text T for ψ the conjecture i ∈ N to which it converges, which can differ depending on T, is such that φi = ψ.
A scientist for functions M Ex-identifies a set of functions Ψ if M Ex-identifies every function ψ ∈ Ψ.
The class of all the sets of recursive functions Ex-identifiable by a scientist is denoted Ex.
Remark: Ex comes from Explaining.
Let us see that the class Ex is not empty. We define the set AEZ (Almost Everywhere Zero) as the set of total recursive functions that take the value 0 for all but finitely many values and the set SD (Self-Describing) as the set of recursive functions ψ such that ψ(0) is the code of a program P for ψ. Both these sets are Ex-identifiable:
• For AEZ we build a scientist that, on input a prefix σ of a text for a function ψ, builds an ordered list µ of the pairs of σ with non-zero values and outputs the code of the function that receives as input a natural x and executes the instruction If x ∈ dom(µ) Then µ(x) Else 0. Since a function in AEZ is zero for all but finitely many points, for a sufficiently large prefix σ all the values of ψ that are not in µ will be 0, and so the code output by the scientist will be a code for ψ. Thus AEZ ∈ Ex.
• For SD we build a scientist that searches for the pair (0, ψ(0)) and outputs the value ψ(0), which by definition is a code for ψ. For a sufficiently large prefix, that pair will be in the prefix, and so from that point on the scientist will always return a code for ψ. Thus SD ∈ Ex.
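The two scientists just described can be sketched in a few lines. This is an illustration under explicit assumptions: prefixes are tuples of pairs, and for AEZ the sorted table µ itself stands in for the natural-number code of the program "If x ∈ dom(µ) Then µ(x) Else 0". The names `aez_scientist`, `run_aez_conjecture` and `sd_scientist` are hypothetical.

```python
Pair = tuple[int, int]
Prefix = tuple[Pair, ...]


def aez_scientist(sigma: Prefix) -> tuple[Pair, ...]:
    """Conjecture for AEZ: the sorted table of non-zero values seen so far."""
    return tuple(sorted({(a, b) for a, b in sigma if b != 0}))


def run_aez_conjecture(mu: tuple[Pair, ...], x: int) -> int:
    """Evaluate the program described by mu: mu(x) if x is in dom(mu), else 0."""
    return dict(mu).get(x, 0)


def sd_scientist(sigma: Prefix) -> int:
    """Conjecture for SD: psi(0) once the pair (0, psi(0)) has appeared, else 0."""
    for a, b in sigma:
        if a == 0:
            return b
    return 0
```

Once a prefix contains every non-zero point of a function in AEZ, `aez_scientist` returns the same table on all longer prefixes, which is the convergence required by Ex-identification. Likewise, `sd_scientist` never changes its conjecture after the pair with first component 0 has been read.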
Definition 2.2.7. Total scientist
A scientist M is total on a recursive function ψ if it provides conjectures for any prefix of any text
regarding ψ given to the scientist.
M is total on a set of functions Ψ if it is total on every function ψ ∈ Ψ.
M is total if it is total on the whole set of recursive functions R.
Proposition 2.2.1. (see [32]) For each scientist M for functions, there exists another scientist N for
functions, algorithmically obtainable from M, such that N is total and if M identifies any recursive
function ψ then so does N .
Proposition 2.2.2. (see [32]) Let M be a method that Ex-identifies the recursive function ψ. If M converges to a conjecture e on the canonical text for ψ, then there exists a scientist M′ that converges to e on all texts for ψ.
The last two propositions are of special relevance because they mean that it is possible to go from any scientist to one that identifies the same set of functions and is canonical and total. In other words, a scientist that has to cope with texts in arbitrary order achieves exactly what a scientist that only receives texts in canonical order achieves, which allows us to introduce concepts and results constructed only by presenting texts in canonical order. Furthermore, being able to use only texts in canonical order means that, from now on, every time we write about a text T for ψ we can simplify notation and write only ψ, since the T in question is canonical and the order of the elements of a text in canonical order coincides with the order of the values of the function ψ.
We have been presenting definitions and results regarding scientists for natural functions of a natural variable. Experimental measurements, however, are hardly ever of this nature. Nevertheless, by rescaling and encoding the values of physical magnitudes, we can define physical laws as relations between natural numbers. In a world of experimental error, convergence can be addressed with a different definition:
Definition 2.2.8. We say that the scientist M identifies ψ ∈ R if there exist an e ∈ N and numbers p, l ∈ N such that, for t ≥ p, M(ψ[t]) = e and, for all t ∈ N, $|\overline{\varphi_e}(t) - \overline{\psi}(t)| \le 2^{-l}$, where the line over the functions means the decoding of natural numbers into rational numbers.²
Even though our motivation is the identification of empirical laws, that are mainly not natural functions,
we will develop our work by assuming natural values for empirical observations, for simplification.
Another important result comes from the Nonunion Theorem, explored below.
Theorem 2.2.1. (see [4]) Nonunion Theorem
If a scientist for functions M Ex-identifies a set of functions E1 and another scientist for functions M′ Ex-identifies a set of functions E2, then there need not exist a scientist that Ex-identifies the set E1 ∪ E2. In other words, the class Ex is not closed under union.
² In the standard context of learning theory, we take l = +∞, and we have M converging to e on ψ[t] and, for all t ∈ N, φe(t) = ψ(t).
Let us see an example of two classes of functions that are Ex-identifiable but whose union is not. We have already seen that both AEZ and SD are in Ex. To observe that AEZ ∪ SD ∉ Ex we will show that a scientist that Ex-identifies AEZ cannot Ex-identify all of SD. Let M be an arbitrary scientist that Ex-identifies AEZ and ψ a function of SD such that ψ(n) is either 0 or 1, for n > 0. For any sequence of (canonical) observations σ of ψ, there always exists a strictly longer extension τ of σ such that M conjectures differently on τ and σ. We know this happens because otherwise it would not be possible for the scientist M to distinguish between σ ⋄ (n, 0) and σ ⋄ (n, 1)³, although they are prefixes of texts for different recursive functions. To construct a text for this function ψ, we follow the procedure presented in Algorithm 1. By Kleene’s Theorem we know that there is a value p ∈ N such that Γ(p, x) = φp(x), with p = ψ(0) by construction of σ. We know as well that Γ(e, x) = ψ(x), since σ is a partial subfunction of ψ, also by construction. If we provide the value p to Γ, then we have that Γ(p, x) = φp(x) = ψ(x), and so ψ belongs to SD and M cannot Ex-identify lim σ = ψ. Since M is an arbitrary scientist for AEZ, there is no scientist for AEZ that Ex-identifies SD. Thus AEZ ∪ SD ∉ Ex.
Algorithm 1 Procedure to construct a text for ψ ∈ SD out of a scientist for AEZ
function Γ(e, x ∶ N) ∶ N
    σ ∶= (0, e)
    for i ∶= 1 to +∞ do
        τ0 ∶= σ ⋄ (i, 0)
        τ1 ∶= σ ⋄ (i, 1)
        if M(σ) ≠ M(τ0) then
            σ ∶= τ0
        else
            σ ∶= τ1
        end if
        if x ≤ i then
            return σ(x)
        end if
    end for
end function
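The procedure of Algorithm 1 can be run against any concrete scientist. The sketch below assumes that a scientist is given as a Python callable on prefixes (tuples of pairs) whose conjectures can be compared with `!=`; the name `gamma` and this representation are choices made for the example. The fixed point supplied by Kleene's Theorem is not modelled, so the first value e is simply a parameter.

```python
from typing import Callable

Pair = tuple[int, int]
Prefix = tuple[Pair, ...]


def gamma(M: Callable[[Prefix], object], e: int, x: int) -> int:
    """Value at x of the function built against scientist M, starting from (0, e)."""
    sigma: Prefix = ((0, e),)
    i = 0
    while x >= len(sigma):               # extend until position x is defined
        i += 1
        tau0 = sigma + ((i, 0),)
        tau1 = sigma + ((i, 1),)
        # Follow tau0 if M changes its conjecture on it, otherwise follow tau1.
        sigma = tau0 if M(sigma) != M(tau0) else tau1
    return sigma[x][1]                   # sigma is canonical, so index x holds (x, value)
```

With a scientist that conjectures the table of non-zero pairs (as in the AEZ strategy), appending a zero never changes the conjecture, so the constructed function takes the value 1 at every positive argument and the scientist changes its conjecture at every step.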
The following proposition is a corollary of the previous result.
Proposition 2.2.3. R ∉ Ex.
Therefore, there is no scientist that Ex-identifies all the recursive functions. However, if we weaken the identification criterion we will be able to capture larger collections of functions, eventually arriving at R. We will start by allowing the occurrence of anomalies in the identification of functions.
Definition 2.2.9. n-variant
A partial recursive function ξ is an n-variant of a function ψ ∈ R if it coincides with ψ in all but at most n points. We write ψ =n ξ.
Definition 2.2.10. ⋆-variant
A partial recursive function ξ is a ⋆-variant of a function ψ ∈ R if it coincides with ψ in all but finitely many points. We write ψ =⋆ ξ.
³ ⋄ is the symbol for the concatenation operation.
Definition 2.2.11. Exn-identification
A scientist M Exn-identifies a function ψ ∈ R if there exists an order p ∈ N such that, for every t ≥ p, M(ψ[t]) = e and we have φe =n ψ. A set of functions is said to be Exn-identifiable if there exists a scientist that Exn-identifies every function in that set.
Definition 2.2.12. Ex⋆-identification
A scientist M Ex⋆-identifies a function ψ ∈ R if there exists an order p ∈ N such that, for every t ≥ p, M(ψ[t]) = e and we have φe =⋆ ψ. A set of functions is said to be Ex⋆-identifiable if there exists a scientist that Ex⋆-identifies every function in that set.
Definition 2.2.13. Exn and Ex⋆ are the corresponding classes of sets that are Exn- or Ex⋆-identifiable.
Remark: Ex0 = Ex
We will now see that the following sets of functions are in the previously defined classes Exn and
Ex⋆:
Definition 2.2.14. ASDn, for n ∈ N, is the set of all ψ ∈ R such that φψ(0) =n ψ, i.e. ψ(0) is an index of an n-variant of ψ. Respectively, ASD⋆ is the set of all ψ ∈ R such that φψ(0) =⋆ ψ, i.e. ψ(0) is an index of a ⋆-variant of ψ.
It is easily observed that for any n ∈ N, ASDn ∈ Exn: we construct a scientist that outputs 0 until it finds the value ψ(0), at which point it outputs that value from then on (in fact, since the scientist is defined for texts in canonical order, this value will be the first in the text provided). Using the same reasoning, it follows that ASD⋆ ∈ Ex⋆.
However, it is possible to show that ASDn+1 is not in Exn. Let us suppose that ASDn+1 is, in fact, Exn-identifiable. Then there is a scientist that Exn-identifies all the functions in ASDn+1. Let M be an arbitrary scientist in that condition. The proof is made by providing a function in ASDn+1 such that it is possible to extend a prefix σ for that function by concatenating suitable segments τ ∈ SEG that will make the scientist change its mind every time we repeat this process, i.e., M(σ) ≠ M(σ ⋄ τ). Then, by Kleene’s Theorem, we have that the scientist that supposedly Exn-identifies ASDn+1 does not converge when presented the text lim σ, which is a text for a function in ASDn+1. If it is not possible to extend the prefix in such a way that the scientist keeps changing its mind, then the construction is made in a way that prevents M from distinguishing between n + 1 different functions in ASDn+1. This proof can be seen in detail in [7, 8] and [10].
This means that ASDn+1 ∈ (Exn+1 ∖ Exn). It also allows us to infer that ASD⋆ ∈ (Ex⋆ ∖ ⋃n∈N Exn), because otherwise there would be an n ∈ N such that ASD⋆ ∈ Exn. By definition, we know that for any k ∈ N, ASDk ⊆ ASD⋆, and so we would have that ASDn+1 ∈ Exn, which we have already seen cannot happen. We can then make the following statement regarding the hierarchy of the classes Exn:
Proposition 2.2.4. (Case and Smith [7, 8]) Ex ⊂ Ex1 ⊂ Ex2 ⊂ ⋅ ⋅ ⋅ ⊂ Exn ⊂ Exn+1 ⊂ ⋅ ⋅ ⋅ ⊂ Ex⋆.
This means that the cognitive power of a scientist increases with the number of errors the scientist is allowed to make, as long as that number is finite. To understand that R is not in Ex⋆, we need
to relax the identification criterion further by not demanding that the canonical scientist converge to a single conjecture from a certain order on, but allowing it to change that conjecture, provided that the scientist’s outputs are always appropriate ones. We will later see what appropriate means in this case. This form of identification is called Bc-identification:
Definition 2.2.15. Bc-identification
A scientist for functions M Bc-identifies a recursive function ψ ∈ R if there exists an order p ∈ N such that, for any t ≥ p, we have that φM(ψ[t]) = ψ.
A scientist for functions M Bc-identifies a set of functions Ψ if M Bc-identifies every function ψ ∈ Ψ.
The class of all the sets of functions that are Bc-identifiable is denoted by Bc.
Remark: Bc comes from Behaviourally Correct.
The next step will be to show that the hierarchy of identification does not stop at Ex⋆ and continues with the Bc class, i.e. every function that can be identified syntactically with finitely many errors can be identified semantically without any error. To show that, let S ∈ Ex⋆, ψ ∈ S, M a generic scientist that witnesses the inclusion of S in Ex⋆ and σ ∈ SEG a prefix for ψ. The idea is to build a scientist M′ that, on input σ, simulates M to obtain a code e = M(σ) and constructs an ordered look-up list µ of the different pairs (i, σ(i)) in σ. With these elements, it outputs the code of a function that uses said list µ in the following way: given input x ∈ N, it checks whether x is the first element of a pair in µ. If it is, then it outputs the second element of that pair; if it is not, then it outputs the result of the program e applied to x, i.e. {e}(x), where e is the code returned by scientist M on input σ. The code returned by M′ is not always the same, because it depends on the list µ, which depends on σ. Thus, as the scientist reads new information in σ, the code returned by M′ will necessarily change, but from some order on it is always a code for ψ, and so M′ Bc-identifies ψ and consequently S ∈ Bc. So, we have the following proposition:
Proposition 2.2.5. Ex⋆ ⊆ Bc.
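The patching construction behind this proposition can be sketched as a higher-order function. In this illustration a conjecture is represented directly by a Python callable instead of a natural-number code, and `patch`, `M_prime` and `patched` are hypothetical names; the sketch only shows how the look-up list µ overrides the finitely many errors of the conjecture of M.

```python
from typing import Callable

Pair = tuple[int, int]
Prefix = tuple[Pair, ...]
Program = Callable[[int], int]           # stands in for a program code


def patch(M: Callable[[Prefix], Program]) -> Callable[[Prefix], Program]:
    """Build M' from M: answer from the observed data, else defer to M's conjecture."""
    def M_prime(sigma: Prefix) -> Program:
        e = M(sigma)                     # conjecture of the original scientist
        mu = dict(sigma)                 # look-up table of the pairs seen so far

        def patched(x: int) -> int:
            return mu[x] if x in mu else e(x)

        return patched

    return M_prime
```

If the conjecture of M is wrong only at finitely many points, every sufficiently long canonical prefix covers all of them, so each later output of `M_prime` computes the target function even though the outputs keep changing with σ.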
The next step is to show that Bc ∖ Ex⋆ ≠ ∅. To understand the separation proof of the classes Ex⋆ and Bc we need to introduce the concept of operator. Let F be the class of all partial functions f ∶ N → N. Then an operator is a total function Φ ∶ F → F. From this point on, σ ∈ F represents a finite function (i.e. a function with finite domain) and σ̂ denotes the natural number encoding of the function σ. We say that an operator Φ is recursive if there is a binary partial recursive function δ such that for every function ψ ∈ F and for all x, y ∈ N we have Φ(ψ)(x) = y if and only if there is a finite function σ such that σ is a subfunction of ψ (i.e. the graph of σ is included in the graph of ψ) and δ(σ̂, x) = y. With these in mind, we can state the following proposition:
Proposition 2.2.6. If Φ ∶ R → R is a recursive operator, then there is a recursive monotone increasing
function h ∶ N→ N such that for all n,x ∈ N, we have φh(n)(x) = Φ(h)((n,x)).
For the separation proof we will also need to define the following set of functions:
Definition 2.2.16. S is the set of all ψ ∈ R such that for all but finitely many i ∈ N, φψ(i) = ψ, i.e. for all
but finitely many i ∈ N, ψ(i) is an index of ψ.
It is easy to see that S ∈ Bc. We just need to consider the scientist that outputs the last value it has seen so far while reading the input. By definition, from a certain point on that value will be a code of the function of which the input is a prefix, and thus any function in S can be Bc-identified by this scientist.
To show that the inclusion between the classes is strict, let us consider the recursive operator defined in Algorithm 2. Suppose that M is a scientist that witnesses the Ex⋆-identification of S. Also, let hk denote h(k). We will use the operator in question to show that there are functions in S that cannot be Ex⋆-identified by M.
Algorithm 2 Recursive operator Φ used to prove the separation between Ex⋆ and Bc
function Φ(h ∶ N → N; (k, x) ∶ N) ∶ N
    var σ ∶ N → SEG; m, y, s ∶ N
    y ∶= 0
    σ0 ∶= (0, h0);
    if (k, x) = 0 then return h0;
    end if
    for m ∶= 0 to +∞ do
        σ2m+1 ∶= σ0;
        σ2m+2 ∶= σ0;
        while M(σ2m+1) = M(σ0) and M(σ2m+2) = M(σ0) do
            y ∶= y + 1;
            σ2m+1 ∶= σ2m+1 ⋄ (y, h2m+1);
            σ2m+2 ∶= σ2m+2 ⋄ (y, h2m+2);
            if k ∈ {2m + 1, 2m + 2} and (k, x) = y then return hk
            end if
        end while
        if M(σ2m+1) ≠ M(σ0) then σ0 ∶= σ2m+1
        else σ0 ∶= σ2m+2;
        end if
        for s ∶= 1 to 2m do
            σs ∶= σs ⋄ (∣σs∣, σ0(∣σs∣)) ⋄ ⋯ ⋄ (y, σ0(y));
        end for
        if k ≤ 2m and (k, x) ≤ y then return σk((k, x))
        end if
    end for
end function
Let us consider the function h whose existence is stated by Proposition 2.2.6. By applying the algorithm to this function h we can conclude some results.
First, all the sequences σ0, σ1, . . . , σ2m, σ2m+1, σ2m+2 are prefixes of potential graphs of total functions.
This is true due to the fact that, if the while loop halts for every m, then the domain of each one of the
functions σk,1 ≤ k ≤ 2m is updated in the internal for-loop in order to follow the values of σ0; in the limit,
these functions would be total. Then, we have to consider two cases:
1. The while guard fails to be true once for every m. This means that, for every k ∈ N, lim σk is a total
function. Moreover, and since φhk(x) = Φ(h)((k, x)) for every k ∈ N, hk is an index of h in all but
finitely many points. This happens because for i, j ∈ N such that i ≠ j either φhi and φhj coincide
in all points or differ only in a finite number of points. The first case happens when σ0 follows some
σk, for some k even or odd, until some order l, and then σk follows σ0; thus, σk and σ0 coincide in
every point. The second case happens because when σ0 follows a σk for a certain k, let us say
an odd k without loss of generality, until an order l, it does not follow σk+1, and so both σ0 and σk differ from σk+1 up to order l; however, from order l onwards both σk and σk+1 follow σ0, and so the differences between σ0 and σk+1 and between σk and σk+1 occur only at finitely many points. In conclusion, we have that φhk ∈ S for every k ∈ N. This means that the scientist M should be able to converge to a single code on φh0(x) for sufficiently large x, which would mean that the scientist should only change its mind finitely many times. However, this does not happen, since the guard of the while loop keeps failing and the prefixes are updated in the subsequent if clauses. In summary, M fails to Ex⋆-identify a function in S.
2. The while loop does not terminate for a certain value m. Then both lim σ2m+1 and lim σ2m+2 are total functions. According to Proposition 2.2.6, we have that φh(2m+1)(x) = h(2m + 1) and φh(2m+2)(x) = h(2m + 2), for every x ≥ x0 and for some order x0, and so by definition we have that φh(2m+1), φh(2m+2) ∈ S. However, they will not be distinguished by the scientist M, since M(σ2m+1) = M(σ0) = M(σ2m+2). This equality can be achieved for a value of m as big as we want it to be, and thus M is a scientist that does not distinguish between two functions of S.
We then conclude that M cannot Ex⋆-identify the set S, and so S ∉ Ex⋆ and consequently R ∉ Ex⋆. This means that we still have not reached an identification class that contains R, which obliges us to develop the Bc hierarchy even further to reach R. This can be achieved by joining the two ways in which the identification criterion is weakened:
Definition 2.2.17. Bcn-identification
A scientist M Bcn-identifies a recursive function ψ ∈ R if there exists an order p ∈ N such that, for any t ≥ p, we have that φM(ψ[t]) =n ψ, i.e. from a certain order on the scientist outputs a code for an n-variant of ψ, but not necessarily the same one.
Definition 2.2.18. Bc⋆-identification
A scientist M Bc⋆-identifies a recursive function ψ ∈ R if there exists an order p ∈ N such that, for any t ≥ p, we have that φM(ψ[t]) =⋆ ψ, i.e. from a certain order on the scientist outputs a code for a ⋆-variant of ψ, but not necessarily the same one.
Definition 2.2.19. Bc, Bcn and Bc⋆ are the classes of sets that are Bc-, Bcn- or Bc⋆-identifiable, respectively.
To show that there exists a hierarchy between these classes of sets just like in the Exn classes, we
will first define the following sets of functions:
Definition 2.2.20. The set Sn with n ∈ N is the set of functions ψ ∈ R such that for all but finitely many i ∈ N, φψ(i) =n ψ, i.e. for all but finitely many i ∈ N, ψ(i) is the code of an n-variant of ψ.
The set S⋆ is the set of functions ψ ∈ R such that for all but finitely many i ∈ N, φψ(i) =⋆ ψ, i.e. for all but finitely many i ∈ N, ψ(i) is the code of a ⋆-variant of ψ.
The reasoning used to observe that S ∈ Bc can also be applied to show that Sn ∈ Bcn; we just need to consider the scientist that outputs the last value read in the input. To show that the set
Sn+1 is not Bcn-identifiable, we will make use of the operator defined in Algorithm 3 to show that there are functions in Sn+1 that a (total) scientist M that witnesses the Bcn-identification of functions cannot identify. Let Ln+3(y) = {(q, ℓ0, …, ℓn, z) ∈ N^(n+3) ∶ y < q < ℓ0 < ⋯ < ℓn < z} be a set of ordered (n + 3)-tuples of positive integers. The value of q refers to a step extension of the domain of the functions under construction; the values ℓ0, ..., ℓn are tentative points of convergence of the function with code M(σj), for σj ∈ SEG; finally, the value z is the number of steps of computation of the program with code M(σj) allowed in the current attempt at convergence on the inputs ℓ0, ..., ℓn.
Algorithm 3 Recursive operator Φ used to prove the separation between Bcn and Bcn+1
function Φ(h ∶ N → N; (k, x) ∶ N) ∶ N
    var σ ∶ N → SEG; i, j, q, y, y′, z, s, ℓ0, …, ℓn ∶ N;
    y ∶= 0;
    σ0 ∶= ε;
    for[1] j ∶= 0 to +∞ do
        for[2] (q, ℓ0, …, ℓn, z) ∈ Ln+3(y) in lexicographical order do
            σj ∶= σ0 ⋄ (y + 1, hj) ⋄ ⋯ ⋄ (y + q, hj);
            if[1] j = k and y + 1 ≤ (k, x) ≤ y + q then return hj;
                if[2] {M(σj)}(ℓ0)↓z and ... and {M(σj)}(ℓn)↓z then exit for[2]
                end if[2]
            end if[1]
        end for[2]
        y′ ∶= max{y + q, ℓn};
        σj ∶= σj ⋄ (y + q + 1, hj) ⋄ ⋯ ⋄ (y′, hj);
        for[3] i ∶ y < i ≤ y′ do
            if[3] i ∉ {ℓ0, …, ℓn} then σ0(i) ∶= σj(i)
            else if φM(σj)(i) ≠ h0 then σ0(i) ∶= h0
            else σ0(i) ∶= σj(i);
            end if[3]
        end for[3]
        for[4] s ∶= 1 to j − 1 do
            σs ∶= σs ⋄ (y + 1, σ0(y + 1)) ⋄ ⋯ ⋄ (y′, σ0(y′));
        end for[4]
        if[4] (k, x) ≤ y′ and k < j then return σk((k, x));
        end if[4]
        y ∶= y′;
    end for[1]
end function
By providing as input to the operator the function h whose existence is guaranteed by Proposition 2.2.6, the recursive operator has some characteristics that allow us to conclude some results. In the first place, σ0, σ1, ..., σj, ... ∈ SEG are all prefixes of graphs of functions. Whenever the for[2] loop is interrupted, each prefix σk (from k = 0 to k = j) is extended in the external for[1] loop in the following way: the program with code M(σj) is executed for z steps on inputs ℓ0, ..., ℓn inside the for[2] loop; eventually, for some tuple (q, ℓ0, …, ℓn, z), the search is successful and the loop is interrupted. After the for[2] loop has finished, the domain of the function σ0 is extended in the for[3] loop, from {0, …, y} to {0, …, max{y + q, ℓn}}. The for[4] loop makes all the functions σk defined so far (for k = 1 to k = j − 1) follow the values of σ0. Note that, whenever the for[2] loop is non-terminating at the final step j, σ0, σ1, ..., σj−1 are graphs of functions with finite domain, but lim σj will always be a text for a total function. With this in mind we have two cases to consider:
1. There is a value j ∈ N such that the for[2] loop does not halt. Then the function φhk = h0 h1 … hj−1 hj hj hj ⋯ = lim σj is a total function and φhk ∈ S ⊂ Sn+1. Moreover, the scientist M is not able to Bcn-identify φhk, since the conjecture it provides fails to converge on at least n + 1 input values. Thus the scientist is not able to Bcn-identify a function in Sn+1.
2. The for[2] loop in the algorithm terminates for all j ∈ N. This means that for every j ∈ N, lim σj is a total function. Furthermore, lim σ0, which is constructed over the successive for[3] loops, is a text for the function Φ(h)((k, x)). In this case, all the values of the total increasing function h described in Proposition 2.2.6 are codes of (n + 1)-variants of h. This means that for some order x0, for every x ≥ x0, M(lim σ0[x]) provides codes hj of the function lim σ0 such that φhj = lim σj differs from h in n + 1 values, and thus the scientist M cannot Bcn-identify lim σ0 ∈ Sn+1, identifying instead lim σj.
We thus conclude that for any n ∈ N, Sn+1 ∉ Bcn. It follows that S⋆ ∈ (Bc⋆ ∖ ⋃n∈N Bcn): if S⋆ ∈ ⋃n∈N Bcn then there would exist a value of n such that S⋆ ∈ Bcn and in particular we would have Sn ∈ Bcn and Sn+1 ∈ Bcn, which is not possible. This allows us to conclude that R ∉ Bcn for all n ∈ N. We can also make the following statement:
Proposition 2.2.7. (Case and Smith [7, 8]) Bc ⊂ Bc1 ⊂ Bc2 ⊂ ⋅ ⋅ ⋅ ⊂ Bcn ⊂ Bcn+1 ⊂ ⋅ ⋅ ⋅ ⊂ Bc⋆.
In the same way as in Proposition 2.2.4, we see that as we increase the number of errors allowed (keeping it finite) we enhance the learning capability of the scientists. The question remaining to be answered is whether the hierarchy ends at Bc⋆ and R ∈ Bc⋆. Let us consider a binary function f that receives a prefix σ ∈ SEG for a function ψ and a value x ∈ N and outputs a conjecture for the value of ψ(x). For that function, let {m}(i)↓≤x denote the output of the program with code m on input i when it halts within x steps of computation; if it does not halt within x steps, then the result is undefined.
Algorithm 4 Function that, for a given prefix σ for a function ψ and a value x ∈ N, outputs the value that the scientist thinks the function ψ has when applied to x.
function f(σ ∶ SEG; x ∶ N) ∶ N
    for m ∶= 0 to ∣σ∣ do
        τ ∶= ∅
        for i ∶= 0 to ∣σ∣ − 1 do
            τ ∶= τ ⋄ (i, {m}(i)↓≤x)
        end for
        if τ = σ then return {m}(x)
        end if
    end for
    return 0
end function
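A runnable approximation of Algorithm 4 needs some model of "running program m for at most x steps". The sketch below makes several assumptions that are not in the text: programs are Python generator functions in which each `yield` counts as one step, the enumeration of all programs is replaced by a small hypothetical list `programs`, a canonical prefix is given as the list of its values, and the final evaluation of the selected program is also bounded by x steps (falling back to 0) so that `f` always terminates.

```python
from typing import Callable, Generator, Optional

Program = Callable[[int], Generator[None, None, int]]


def run_bounded(prog: Program, i: int, steps: int) -> Optional[int]:
    """Run prog on input i for at most `steps` steps; None if it has not halted."""
    gen = prog(i)
    for _ in range(steps + 1):
        try:
            next(gen)
        except StopIteration as stop:
            return stop.value
    return None


def f(programs: list[Program], sigma: list[int], x: int) -> int:
    """Guess psi(x) from the canonical prefix sigma = [psi(0), ..., psi(len - 1)]."""
    for m in range(min(len(sigma) + 1, len(programs))):
        tau = [run_bounded(programs[m], i, x) for i in range(len(sigma))]
        if tau == sigma:                 # program m reproduces the data within x steps
            guess = run_bounded(programs[m], x, x)
            return guess if guess is not None else 0
    return 0


def loop_forever(n: int):
    while True:
        yield


def zero(n: int):
    return 0
    yield  # unreachable; only marks this function as a generator


def slow_double(n: int):
    for _ in range(n):                   # takes n steps before answering
        yield
    return 2 * n
```

With `programs = [loop_forever, zero, slow_double]` and the prefix `[0, 2, 4]` of the doubling function, `f` answers correctly for large x but returns 0 for small x, where the step budget is too short to confirm any program. This matches the observation that the resulting conjectures are only ⋆-variants of ψ.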
For a big enough value of ∣σ∣, it is certain that a code for the wanted ψ with prefix σ is between 0 and ∣σ∣. However, convergence is only guaranteed if we have a large enough value for x, because otherwise we risk having situations in which {m}(i)↓≤x does not halt. So, for large values of x and ∣σ∣, generally x ≫ ∣σ∣, it is possible to find a value m such that {m}(i)↓≤x converges in every entry i between 0 and ∣σ∣ − 1. This means that there exists an order p such that for x ≫ p, f(σ, x) converges to
ψ(x). By applying the s-1-1 Theorem, we then have that there is a computable function s such that f(σ, x) = φs(σ)(x) = ψ(x) for all x greater than p. This means that φs(σ) =⋆ ψ, i.e. s(σ) is a computable code of a ⋆-variant of ψ. This code is not unique and may vary depending on the value of ∣σ∣: as ∣σ∣ increases, the value of m such that the program m converges for all values of i in x steps may change, and so does the code s(σ). By constructing a scientist M that receives σ and outputs the value s(σ), we have a scientist that Bc⋆-identifies ψ. Since this reasoning is valid for all ψ ∈ R, all recursive functions can be Bc⋆-identified.
Proposition 2.2.8. R ∈ Bc⋆.
We have thus reached the full power of identification by scientists. This means that the entire set of recursive functions R can only be identified by permitting a scientist to change its conjecture an unlimited number of times and by allowing each conjecture to have finitely many errors.
In the next chapter, we will focus on a subclass of R, the primitive recursive functions, that we will
see is easier to identify and upon which we will develop a scientist.
Chapter 3
The Search Procedure
Now that we know that every recursive function can be identified (at least semantically and allowing finitely many errors in its identification), we return to the study of empirical laws and ask how far we need to go to identify the expressions that represent these laws.
3.1 Primitive Recursive Functions
By analyzing the format of the empirical laws known to this date, we observe that it is extremely difficult to conceive a natural law that cannot be represented by a primitive recursive function, a subclass of the class of functions R. The examples given for functions that are recursive but not primitive recursive, like the Ackermann function (see [12]), are extremely complex to define and, in practice, have never been a subject of study by the natural sciences, only by the theoretical mathematical field of computability. So, if we want to study the possibility of the natural laws being learned by a computational scientist, we only need to focus our attention on this subclass of functions.
We now present a formal definition for the class of primitive recursive functions.
Definition 3.1.1. Primitive Recursive Functions
The primitive recursive functions are those inductively defined by the following rules:
1. The 0−ary constant function 0 is primitive recursive.
2. The 1-ary successor function S, defined by the expression S(x) = x + 1, is primitive recursive.
3. For any n ∈ N and for i ∈ N such that 1 ≤ i ≤ n, we have that the function Pn,i defined by the
expression Pn,i(x1, . . . , xn) = xi is primitive recursive.
4. Given a k−ary primitive recursive function f and k many m−ary primitive recursive functions
g1, . . . , gk, the function h resulting from the composition of these functions, defined by the ex-
pression h(x1, . . . , xm) = f(g1(x1, . . . , xm), . . . , gk(x1, . . . , xm)), is primitive recursive.
5. For a k−ary primitive recursive function f and a (k+2)−ary primitive recursive function g, the (k+1)−ary
function h defined as
\[
h(x_1, \ldots, x_k, y) =
\begin{cases}
f(x_1, \ldots, x_k), & y = 0 \\
g(x_1, \ldots, x_k, y - 1, h(x_1, \ldots, x_k, y - 1)), & \text{otherwise}
\end{cases}
\]
is primitive recursive.
In fact, if we add a sixth rule to the ones in Definition 3.1.1 we can obtain the set R of the recursive
functions. That rule is the following:
6. Let f be a (k+1)−ary recursive function. Then the k−ary function g defined by the expression
g(x1, . . . , xk) = µy f(x1, . . . , xk, y), where µy f(x1, . . . , xk, y) = z if f(x1, . . . , xk, z) = 0 and
f(x1, . . . , xk, i) ≠ 0 for all i < z, is also recursive.
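A minimal Python sketch of rule 6 may help to see why it differs from rules 1 to 5. The names mu and ceil_sqrt are illustrative and do not come from this work; f is assumed to be an ordinary Python callable returning natural numbers.

```python
from itertools import count


def mu(f):
    """Return g with g(x1, ..., xk) = least y such that f(x1, ..., xk, y) == 0.

    The search has no upper bound: if no such y exists, the call never
    returns. This is what separates rule 6 from rules 1 to 5.
    """
    def g(*xs):
        for y in count():              # y = 0, 1, 2, ...
            if f(*xs, y) == 0:
                return y
    return g


# Illustrative example: the least y with y * y >= x.
ceil_sqrt = mu(lambda x, y: 0 if y * y >= x else 1)
print([ceil_sqrt(x) for x in (0, 1, 2, 9, 10)])  # → [0, 1, 2, 3, 4]
```

The unbounded loop is the only ingredient that a for-loop with a bound fixed in advance cannot express.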
There is another way to identify the class of primitive recursive functions. To do so we first need
to understand how a program P in a language X (whatever that language is) is built. It is a
sequence of instructions that can be simple assignments (for example x ∶= 0, x ∶= y, x ∶= y+1),
conditionals (if guard then ... else ...), for-loops (for i = 1, . . . , y do ... where i never resets) or
while cycles (while guard do ...). We call a program a loop-program if it can be built using only
assignments, conditionals and for-loops. With this in mind, we present the following statement:
Theorem 3.1.1. (see [30] and [18]) The primitive recursive functions are exactly those computed by
loop-programs, i.e. the programs that can be written without while cycles.
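One direction of this correspondence can be sketched in Python: rule 5 of Definition 3.1.1 translates directly into a single for-loop whose bound is fixed before the loop starts. The helper name primitive_recursion and the lambdas are assumptions made for this illustration only.

```python
def primitive_recursion(f, g):
    """Build h from a k-ary f and a (k+2)-ary g using one for-loop."""
    def h(*args):
        *xs, y = args
        acc = f(*xs)                 # h(x, 0) = f(x)
        for i in range(y):           # the bound y is known before the loop starts
            acc = g(*xs, i, acc)     # h(x, i + 1) = g(x, i, h(x, i))
        return acc
    return h


# Addition and multiplication as loop-programs (no while cycles involved).
add = primitive_recursion(lambda x: x, lambda x, i, acc: acc + 1)
mul = primitive_recursion(lambda x: 0, lambda x, i, acc: add(acc, x))

print(add(3, 4), mul(3, 4))  # → 7 12
```

Here mul nests the loop of add inside its own loop, so it uses two nested for-loops in the sense discussed next.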
A loop-program built to compute a primitive recursive function can use sequenced and/or
nested for-loops. In fact, the nesting depth of the for-loops is one way to measure the structural
complexity of a loop-program: by defining recursively the classes Ln such that L0 is the class of loop-free
straight-line programs and Ln+1 is the class of loop-programs in which every for-loop is of the form for
i = 1, . . . , y do P , where P ∈ Lm with m ≤ n, we obtain a hierarchy {Ln ∶ n ∈ N} of loop-programs. In [18]
we can see that this hierarchy allows us to understand the power of nested for-loops; in fact, the class
L2, i.e. the class of functions computed by loop-programs with at most two nested for-loops, corresponds
to the so-called class of elementary functions. These functions can be described as the ones obtained by
iteration of the operations of ordinary arithmetic, which means that, although simple in terms of structural
complexity (since at most two nested for-loops are needed to compute them), they are a very important
subset of the primitive recursive functions. We can define this class as follows:
Definition 3.1.2. (see [12]) Elementary Functions
The set E of elementary functions is the smallest class such that:
1. the functions x + 1, Pn,i (1 ≤ i ≤ n), x .− y ¹, x + y and xy are in E ;
2. E is closed under composition;
3. E is closed under the operations of forming bounded sums and bounded products (i.e. if f(x, y) is
in E then so are the functions ∑z<y f(x, z) and ∏z<y f(x, z)).
¹ The operator .− corresponds to subtraction for the natural numbers, i.e. we have that x .− y = max(x − y, 0).
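As a small illustration of the closure operations in Definition 3.1.2, the following Python sketch implements truncated subtraction and the bounded sum and product operators. The names monus, bounded_sum, bounded_product and power are chosen here for illustration and are not part of the original definition.

```python
def monus(x, y):
    """Truncated subtraction on the natural numbers: max(x - y, 0)."""
    return max(x - y, 0)


def bounded_sum(f):
    """Map f(x, z) to the function (x, y) -> sum of f(x, z) for z < y."""
    return lambda x, y: sum(f(x, z) for z in range(y))


def bounded_product(f):
    """Map f(x, z) to the function (x, y) -> product of f(x, z) for z < y."""
    def h(x, y):
        result = 1                   # the empty product is 1
        for z in range(y):
            result *= f(x, z)
        return result
    return h


# Exponentiation x**y as a bounded product of the projection f(x, z) = x.
power = bounded_product(lambda x, z: x)
print(monus(3, 5), power(2, 5), bounded_sum(lambda x, z: x + z)(2, 3))  # → 0 32 9
```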
3.2 Notation for identification
Now that we have seen how to define a primitive recursive function, we introduce some notation to
facilitate the identification of these functions. We introduce the concept of description, which can be
used to denote any recursive function.
Definition 3.2.1. Description
A description of a recursive function is an expression inductively defined by the following rules:
1. The symbol Z() is a 0-ary description that describes the constant 0.
2. The symbol S() is a 1-ary description that describes the successor, i.e., the 1-ary function with the
expression S(x) = x + 1.
3. The symbol P(n,i), for any n and i such that 1 ≤ i ≤ n is an n-ary description that describes the
i-th projection, i.e., the n-ary function defined by the expression Pn,i(x1, . . . , xn) = xi.
4. If G is a k-ary description, with k ≥ 0, that describes the function g and if H 1, . . . , H k are n-
ary descriptions that describe the functions h1, . . . , hk respectively, with n ≥ 0, then C(G,[H 1,
..., H k]) is an n-ary description that describes the n-ary function f defined by the expres-
sion f(x1, . . . , xn) = g(h1(x1, . . . , xn), . . . , hk(x1, . . . , xn)). We say that f is obtained from g and
h1, . . . , hk by composition.
5. If G is an n-ary description with n ≥ 0 that describes the function g and H is an (n + 2)-ary de-
scription that describes the function h then R(G,H) is an (n + 1)-ary description that describes the
(n + 1)-ary function f recursively defined by the expressions f(x1, . . . , xn,0) = g(x1, . . . , xn) and
f(x1, . . . , xn, y + 1) = h(x1, . . . , xn, y, f(x1, . . . , xn, y)). We say that f is obtained from g and h by
primitive recursion.
6. If G is an (n + 1)-ary description that describes the function g then M(G) is an n-ary description
that describes the function f defined by the expression f(x1, . . . , xn) = µy.g(x1, . . . , xn, y). This
function outputs the least value for y such that g(x1, . . . , xn, y) = 0 and for z < y, g(x1, . . . , xn, z) > 0.
We say that f is obtained from g by minimization.
Definition 3.2.2. The size of a description D is given by the number of occurrences of all the symbols Z,
S, P, C, R and M in D.
Each n-ary description describes a unique n-ary recursive function. However, several descrip-
tions can describe the same recursive function; for example if D describes a recursive function then
C(P(1,1),[D]) describes the exact same function.
In order to identify a primitive recursive function we can observe its description: a recursive function
is primitive recursive if it has a description built using the rules 1 to 5 defined in Definition 3.2.1; in
other words, a primitive recursive function is a recursive function that has a description defined by an
expression written without the symbol M.
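The notions of size (Definition 3.2.2) and of a description written without the symbol M can be sketched in Python. The encoding of descriptions as nested tuples, such as ("P", n, i) or ("C", G, [H1, ..., Hk]), and the function names size and uses_minimization are assumptions of this sketch, not the representation used in this work.

```python
def size(d):
    """Number of occurrences of the symbols Z, S, P, C, R and M in d."""
    tag = d[0]
    if tag in ("Z", "S", "P"):
        return 1
    if tag == "C":
        return 1 + size(d[1]) + sum(size(h) for h in d[2])
    if tag == "R":
        return 1 + size(d[1]) + size(d[2])
    if tag == "M":
        return 1 + size(d[1])
    raise ValueError(f"unknown symbol {tag!r}")


def uses_minimization(d):
    """True if the symbol M occurs anywhere in the description d."""
    tag = d[0]
    if tag == "M":
        return True
    if tag == "C":
        return uses_minimization(d[1]) or any(uses_minimization(h) for h in d[2])
    if tag == "R":
        return uses_minimization(d[1]) or uses_minimization(d[2])
    return False


add = ("R", ("P", 1, 1), ("C", ("S",), [("P", 3, 3)]))
same_function = ("C", ("P", 1, 1), [add])   # C(P(1,1),[D]) describes the same function as D
print(size(add), size(same_function), uses_minimization(add))  # → 5 7 False
```

Note that the check is purely syntactic: a description without M certifies that the function is primitive recursive, while a description containing M may still describe a primitive recursive function.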
We will now analyze a few basic primitive recursive functions, in order to deduce a description (with-
out the symbol M) for each one.
Example 3.2.1. First we will look into a simple function. Let zero ∶ N → N be the function defined by the
expression zero(x) = 0. The first step is to write zero recursively. This definition can be achieved through
the functions g ∶ N^0 → N and h ∶ N^2 → N defined by the respective expressions:
g ≡ 0 (3.1)
h(x, y) = y (3.2)
These functions are used to define zero recursively as follows: zero(0) = g ≡ 0 and
zero(x + 1) = h(x, zero(x)) = zero(x). We can easily observe through equation (3.1) that g has as description the
expression Z() and, by equation (3.2), h has as description the expression P(2,2). Thus, using rule 5
of Definition 3.2.1, we obtain for zero the description R(Z(),P(2,2)).
Example 3.2.2. Let pred ∶ N → N be the predecessor function. This function is defined by the expression
pred(x) = x .− 1, where the operator .− corresponds to subtraction for the natural numbers.
To make the description of this function easier to write, we define it recursively through the functions
g ∶ N^0 → N and h ∶ N^2 → N, respectively defined by the expressions
g ≡ 0 (3.3)
h(x, y) = x (3.4)
We already know a description for equation (3.3), Z(). For equation (3.4) it is easily observed that h
is the projection function of the first element of the input pair and thus its description is given by P(2,1).
Consequently, the description for the function pred, which is recursively defined by the expressions
pred(0) = g ≡ 0 and pred(x + 1) = h(x, pred(x)) = x, is given by R(Z(),P(2,1)).
Example 3.2.3. We will show that the addition function add ∶ N^2 → N defined by the expression
add(x, y) = x + y is primitive recursive. First, we need to define the function add recursively, which can be
done through the functions g ∶ N → N and h ∶ N^3 → N defined by the expressions
g(x) = x (3.5)
h(x, y, z) = z + 1 (3.6)
We can now write a description for both functions in equations (3.5) and (3.6):
• For equation (3.5), it is easily understood that the expression that describes the function g is
P(1,1).
• For equation (3.6), we can also see that h can be defined as the successor of the third element of
the triple given as input, and thus its description is the expression C(S(),[P(3,3)]).
By rule 5 of Definition 3.2.1, we obtain a description for the function add: since it is recursively
defined by add(x, 0) = g(x) = x and add(x, y + 1) = h(x, y, add(x, y)) = add(x, y) + 1, a description is
given by the expression R(P(1,1),C(S(),[P(3,3)])).
Example 3.2.4. We will now analyze the difference function natminus ∶ N^2 → N defined by the
expression natminus(x, y) = x .− y, where .− is the operation of subtraction for the natural numbers. Let us,
once again, write the expressions of the functions g ∶ N → N and h ∶ N^3 → N that will be used to define
natminus recursively:
g(x) = x (3.7)
h(x, y, z) = z .− 1 (3.8)
We can now write a description for both functions in equations (3.7) and (3.8):
• A description for equation (3.7) is P(1,1).
• From the definition of h, this function is obtained by applying the predecessor to the third
element of its argument. Thus, using the description of pred from Example 3.2.2, a description of h is
C(R(Z(),P(2,1)),[P(3,3)]).
By rule 5 once more, and since natminus(x, 0) = g(x) = x and natminus(x, y + 1) = h(x, y, natminus(x, y)) =
natminus(x, y) .− 1, a description for this function is given by
the expression R(P(1,1),C(R(Z(),P(2,1)),[P(3,3)])).
Example 3.2.5. Let prod ∶ N^2 → N be the product function defined by the expression prod(x, y) = x × y.
The functions g ∶ N → N and h ∶ N^3 → N that will define prod recursively are given by the
following expressions:
g(x) = 0 (3.9)
h(x, y, z) = z + x (3.10)
Now we write a description for both functions in equations (3.9) and (3.10):
• For equation (3.9), we observe that the function g is the 1-ary zero function, for which we already
deduced a description in Example 3.2.1: R(Z(),P(2,2)).
• For equation (3.10), h is the addition of the first and the third
elements of its argument. We already have a description for the addition from Example 3.2.3. Thus,
we obtain a description for h: C(R(P(1,1),C(S(),[P(3,3)])),[P(3,1),P(3,3)]).
By rule 5 of Definition 3.2.1, and because we can define prod recursively using the expressions
prod(x, 0) = g(x) = 0 and prod(x, y + 1) = h(x, y, prod(x, y)) = prod(x, y) + x, we deduce a description for prod:
R(R(Z(),P(2,2)),C(R(P(1,1),C(S(),[P(3,3)])),[P(3,1),P(3,3)])).
Example 3.2.6. Finally, we define the distance function dist ∶ N^2 → N as dist(x, y) = ∣x − y∣. To understand
how to write a description for this function, we will write it in a different way: dist(x, y) = (x .− y) + (y .− x).
This means that we can define the distance as the addition of two subtractions. To write
the description for this function, we first need the description of the function d ∶ N^2 → N defined
by the expression d(x, y) = y .− x, which is the difference function with its arguments switched.
Composing the description of natminus with the projections P(2,2) and P(2,1), a description for d is
C(R(P(1,1),C(R(Z(),P(2,1)),[P(3,3)])),[P(2,2),P(2,1)]).
Then, by rule 4 of Definition 3.2.1, a description for the distance function is the expression
C(R(P(1,1),C(S(),[P(3,3)])),[R(P(1,1),C(R(Z(),P(2,1)),[P(3,3)])),C(R(P(1,1),C(R(Z(),P(2,1)),[P(3,3)])),[P(2,2),P(2,1)])]).
We present a table summarizing the previous functions and their respective descriptions (Table 3.1).
By Theorem 3.1.1 (see [30]), for each of these functions there is a program,
written using only assignments, if-else conditionals and sequential and/or nested for-loops,
that computes it.
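The descriptions deduced in Examples 3.2.1 to 3.2.6 can be checked mechanically. The sketch below is one possible evaluator, assuming descriptions are encoded as nested Python tuples; the helper names ev, P, C and R are chosen for this illustration and are not the implementation used in this work. Primitive recursion is evaluated with a single for-loop, in line with Theorem 3.1.1.

```python
def ev(d, args):
    """Evaluate the M-free description d on the tuple of naturals args."""
    tag = d[0]
    if tag == "Z":
        return 0
    if tag == "S":
        return args[0] + 1
    if tag == "P":                       # ("P", n, i): i-th of n arguments
        return args[d[2] - 1]
    if tag == "C":                       # ("C", G, (H1, ..., Hk))
        return ev(d[1], tuple(ev(h, args) for h in d[2]))
    if tag == "R":                       # ("R", G, H)
        *xs, y = args
        acc = ev(d[1], tuple(xs))        # f(x, 0) = g(x)
        for i in range(y):
            acc = ev(d[2], (*xs, i, acc))  # f(x, i + 1) = h(x, i, f(x, i))
        return acc
    raise ValueError(f"unknown symbol {tag!r}")


Z, S = ("Z",), ("S",)
P = lambda n, i: ("P", n, i)
C = lambda g, hs: ("C", g, tuple(hs))
R = lambda g, h: ("R", g, h)

zero = R(Z, P(2, 2))
pred = R(Z, P(2, 1))
add = R(P(1, 1), C(S, [P(3, 3)]))
natminus = R(P(1, 1), C(pred, [P(3, 3)]))
prod = R(zero, C(add, [P(3, 1), P(3, 3)]))
dist = C(add, [natminus, C(natminus, [P(2, 2), P(2, 1)])])

print(ev(add, (3, 4)), ev(natminus, (3, 5)), ev(prod, (3, 4)), ev(dist, (3, 7)))  # → 7 0 12 4
```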
3.3 The search algorithm
We now have a good definition of the primitive recursive functions and a proper notation to identify them,
so we can proceed to implement the search algorithm. Before doing so, however, we need a
result about the enumeration of the primitive recursive functions: namely, that such an
enumeration is actually possible. This result is stated and proved below.
Proposition 3.3.1. The set of the primitive recursive functions (PRIM) is recursively enumerable.
Proof. Let us start with the notation used when defining descriptions. We can define a correspondence
between descriptions written with those symbols and the set of natural numbers. Let e be that
correspondence. We have the following definition for e (from [1]):
• e(Z()) = ⟨0⟩.
• e(S()) = ⟨1⟩.
• e(P(n,i)) = ⟨2, n, i⟩.
• e(C(G,[H1, . . . ,Hk])) = ⟨3, k, l, e(G), e(H1), . . . , e(Hk)⟩, where l is the arity of the description.
• e(R(G,H)) = ⟨4, l, e(G), e(H)⟩, where l is the arity of the description.
To understand why each description corresponds to one natural number, we note that there exists a
bijective function that encodes a tuple of natural numbers into one and only one natural number: the
function τ ∶ ⋃k>0 N^k → N in [12] such that
\[
\tau(a_1, \ldots, a_k) = 2^{a_1} + 2^{a_1 + a_2 + 1} + 2^{a_1 + a_2 + a_3 + 2} + \cdots + 2^{a_1 + a_2 + \cdots + a_k + k - 1} - 1.
\]
However, it is not enough to perform the inverse correspondence to obtain an enumeration, since not
every natural number will be obtained from this e, i.e. this correspondence is injective but not bijective.
To resolve this problem we define that for every natural number not in the range of e, the output of the
enumeration is the constant function 0, i.e. the one denoted by Z(). This way, we have a well defined
and exhaustive enumeration for the primitive recursive functions.
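The tuple encoding τ used in the proof and its inverse can be sketched in Python as follows. The function names tau and tau_inverse are chosen for this illustration; the inverse reads the tuple off the positions of the 1-bits in the binary expansion of n + 1.

```python
def tau(*a):
    """Encode a non-empty tuple of naturals as a single natural number."""
    if not a:
        raise ValueError("tau is defined for tuples of length k > 0")
    total, exponent = 0, -1
    for a_i in a:
        exponent += a_i + 1          # exponents a1, a1+a2+1, a1+a2+a3+2, ...
        total += 2 ** exponent
    return total - 1


def tau_inverse(n):
    """Decode n: the gaps between the 1-bits of n + 1 give the tuple."""
    m = n + 1
    bits = [b for b in range(m.bit_length()) if (m >> b) & 1]
    return tuple(b - prev - 1 for prev, b in zip([-1] + bits, bits))


print(tau(2, 0, 1), tau_inverse(43))  # → 43 (2, 0, 1)
```

Since every positive integer has exactly one binary expansion, each natural number decodes to exactly one tuple, which is the bijectivity used in the proof.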
Function                    Description
zero(x) = 0                 R(Z(),P(2,2))
pred(x) = x .− 1            R(Z(),P(2,1))
add(x, y) = x + y           R(P(1,1),C(S(),[P(3,3)]))
natminus(x, y) = x .− y     R(P(1,1),C(R(Z(),P(2,1)),[P(3,3)]))
prod(x, y) = x × y          R(R(Z(),P(2,2)),C(R(P(1,1),C(S(),[P(3,3)])),[P(3,1),P(3,3)]))
dist(x, y) = ∣x − y∣         C(R(P(1,1),C(S(),[P(3,3)])),[R(P(1,1),C(R(Z(),P(2,1)),[P(3,3)])),C(R(P(1,1),C(R(Z(),P(2,1)),[P(3,3)])),[P(2,2),P(2,1)])])
Table 3.1: Primitive recursive functions and their corresponding descriptions
Proposition 3.3.2. PRIM ∈ Ex.
Proof. To prove that the primitive recursive functions are Ex-identifiable we simply present a scientist
that Ex-identifies this set of functions (Algorithm 5, in [10] and [9]). Let ρ ∶ N → PRIM be an enumer-
ation for the primitive recursive functions, ρi denote the function obtained from ρ(i) and ψ the primitive
recursive function explained by a text in the canonical form with prefix σ.
Algorithm 5 Scientist that Ex-identifies PRIM
function M(σ ∶ SEG) ∶ N
    i ∶= 0; k ∶= 0; n ∶= length of σ
    while k < n do
        if ρi(k) = ψ(k) then
            k ∶= k + 1
        else
            k ∶= 0; i ∶= i + 1
        end if
    end while
    return i
end function
This algorithm proceeds in the following way: it searches for the least i ∈ N such that the output of
ρi applied to every k < n has the same value as the output of ψ applied to k (which is obtainable in σ).
Since ρ is an enumeration for the set PRIM, then this algorithm will not overlook any primitive recursive
function and since σ is a prefix of a text that explains a primitive recursive function, this algorithm will
eventually halt.
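Algorithm 5 can be tried out on a toy enumeration. In the sketch below, sigma is assumed to be the list of values ψ(0), ..., ψ(n − 1), and rho is a hypothetical enumeration of the functions x ↦ a·x + b with a < 4; it merely stands in for an enumeration of PRIM and is not the enumeration built in this chapter.

```python
def scientist(sigma, rho):
    """Return the least index i such that rho(i) agrees with sigma on 0..n-1."""
    i, k, n = 0, 0, len(sigma)
    while k < n:
        if rho(i)(k) == sigma[k]:
            k += 1
        else:
            k, i = 0, i + 1          # restart the comparison with the next index
    return i


def rho(i):
    """Hypothetical toy enumeration: index i encodes the function a*x + b."""
    a, b = i % 4, i // 4
    return lambda x: a * x + b


psi = lambda x: 2 * x + 3
for n in (1, 2, 3, 4):
    print(scientist([psi(k) for k in range(n)], rho))  # prints 12, 14, 14, 14
```

The conjecture changes once, from 12 to 14, and then stabilizes as more of the text is read, which is the convergence behaviour required by Ex-identification.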
We will then focus our attention on the class of primitive recursive functions and on developing a
scientist that Ex-identifies said class.
Since the primitive recursive functions can be enumerated, we can build the search program on top
of such an enumeration. That program is given in Algorithm 6, which we now explain.
Let Functions be a function that, given a natural number i, lists the descriptions with size i. Then,
for a certain size, we list the descriptions with said size and, for each of these descriptions, we check
whether the function defined by the description, applied to each element in the inp list, equals
the corresponding element in out (both inp and out are provided as input to the search procedure
in Algorithm 6). If it is equal for all these elements, then we have found our function and we terminate
the program by returning that description. If one of the comparisons fails, we proceed to the
following description in the list, and if no description passes every comparison, we construct another list
whose descriptions have the size of the previous ones incremented by one. We thus have a well-defined
search algorithm whose computability depends on the implementation of the listing function Functions.
Algorithm 6 Search algorithm for a primitive recursive function given the input/output values
Input: inp as the list of tuples with the input values; out as the list of integers with the corresponding output values
Output: a description of a primitive recursive function
procedure SEARCH(inp, out)
    for i ∶= 1, 2, 3, . . . do
        for f in Functions(i) do
            if f.arity != length(inp[0]) then
                t ∶= False
                continue
            else
                t ∶= True
                for j ∶= 0 to length(inp) − 1 do
                    t ∶= (f(inp[j]) == out[j])
                    if t is False then
                        break
                    end if
                end for
                if t is True then
                    res ∶= f
                    break
                end if
            end if
        end for
        if t is True then
            break
        end if
    end for
    return res
end procedure
3.4 A first enumeration
An implementation of the function Functions, which given a size lists the descriptions of that size, and
of the necessary auxiliary functions is presented in Algorithm 7 (adapted from [13]):
• Functions is the main function. It receives the descriptions’ size and outputs a list with all the
possible descriptions with that size. If the size is 0 or less, it yields nothing. If the size is 1, it yields
the descriptions Z(), S() and the descriptions for all projections with arity up to 3. For any larger size,
it calls two auxiliary functions to construct the descriptions: Compositions and Recursions,
which we will describe later.
• Functions With Maxsize will return a list with all the descriptions with every size from 1 to the
input value, together with the difference between the input value and the size of each description.
• Composition Function Lists will receive the length that the output list of descriptions should
have, the maximum size that all those descriptions combined will be able to have and, optionally,
the arity for the descriptions in the list; then it will construct all the combinations of descriptions that
Algorithm 7 Construction of a function list composed by functions with a given description size
function FUNCTIONS(size)
    if size ≤ 0 then
        pass
    else if size == 1 then
        yield Z(); yield S()
        for i ∶= 1 to 3 do
            for j ∶= 1 to i do
                yield P(i,j)
            end for
        end for
    else
        for composition in Compositions(size) do
            yield composition
        end for
        for recursion in Recursions(size) do
            yield recursion
        end for
    end if
end function

function FUNCTIONS WITH MAXSIZE(size)
    for subsize ∶= 1 to size do
        for func in Functions(subsize) do
            yield func, size − subsize
        end for
    end for
end function

function COMPOSITION FUNCTION LISTS(length, size, arity = None)
    if length == 0 then
        if size == 0 then
            yield []
        else
            pass
        end if
    else
        for function, remaining_size in Functions With Maxsize(size) do
            if arity == function.arity or arity == None then
                for sublist in Composition Function Lists(length − 1, remaining_size, function.arity) do
                    yield [function] + sublist
                end for
            end if
        end for
    end if
end function

function COMPOSITIONS(size)
    for g, after_g_size in Functions With Maxsize(size − 1) do
        if g.arity > 0 then
            for function_list in Composition Function Lists(g.arity, after_g_size) do
                yield C(g, function_list)
            end for
        end if
    end for
end function

function RECURSIONS(size)
    for function, size2 in Functions With Maxsize(size − 1) do
        for function2 in Functions(size2) do
            if function2.arity == function.arity + 2 then
                yield R(function, function2)
            end if
        end for
    end for
end function
satisfy the given length and size and such that their arity is the same (if an input for arity is given
then only the functions with that arity will be output).
• Compositions will yield every possible description whose first symbol is C that has the given size.
These descriptions will be constructed using the lists obtained with Composition Function Lists
as the second argument of the description and such that the arity of the function in the first argu-
ment is the same as the length of the list. Moreover, the sum of the sizes of all the descriptions
in both arguments must be the size given decremented by one unit. In this case, we do not need
to worry about the arity of the main description since this procedure will cover every description of
every possible arity for the given size.
• Recursions will yield every possible description whose first symbol is R that has the given size and
where the second description in its argument has as arity the arity of the first description in the
argument plus two. Moreover, the sum of the sizes of these two descriptions must be the given size minus
one. Once again, we do not need to be concerned about the arity of the main description, for the
same reason: this procedure will cover all descriptions of every possible arity for the given size.
With this in mind, it is easy to understand the behaviour of Functions and how it will yield the
descriptions of a given size. However, it still remains to explain the peculiarity of the projection function.
Since the arity of the projection symbol does not affect its size, we needed to establish an upper bound
for this parameter in order to be certain that the computation of the list of descriptions halts. It was
decided, for now, that this bound would be 3, because our goal was to explore this algorithm
mainly with unary and binary functions, and it is possible to express the most basic functions with said
arities using only projections up to arity 3 (as can be seen in Table 3.1); if needed, this bound
can be raised or lowered. This decision aims to keep the length of the list of descriptions
manageable (since its construction is combinatorial), thus reducing the time needed to construct
said list.
In Figure 3.1 we can see examples of the lists of functions whose descriptions have size from 1 to 4.
The construction process for the descriptions is visible: smaller descriptions are combined to
produce bigger ones. We can also see that the size of these lists increases very fast, which is explained
by the fact that the descriptions are constructed combinatorially. Note that there are no
descriptions with size 2. This happens because the symbols used to construct descriptions by
combining other descriptions (namely C and R) always need at least two more symbols,
and thus the descriptions we construct using these symbols have at least size 3.
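The enumeration of Algorithm 7 maps naturally onto Python generators. The following is a sketch under the assumption that descriptions are encoded as nested tuples; the helper arity and the constant MAX_PROJECTION_ARITY are names introduced here for illustration, not the implementation used in this work.

```python
MAX_PROJECTION_ARITY = 3   # the bound on projection arity discussed above


def arity(d):
    tag = d[0]
    if tag == "Z":
        return 0
    if tag == "S":
        return 1
    if tag == "P":
        return d[1]
    if tag == "C":
        return arity(d[2][0])      # arity of the inner descriptions
    return arity(d[1]) + 1         # tag == "R"


def functions(size):
    if size <= 0:
        return
    if size == 1:
        yield ("Z",)
        yield ("S",)
        for n in range(1, MAX_PROJECTION_ARITY + 1):
            for i in range(1, n + 1):
                yield ("P", n, i)
    else:
        yield from compositions(size)
        yield from recursions(size)


def functions_with_maxsize(size):
    for subsize in range(1, size + 1):
        for func in functions(subsize):
            yield func, size - subsize


def composition_function_lists(length, size, ar=None):
    if length == 0:
        if size == 0:
            yield ()
        return
    for func, remaining in functions_with_maxsize(size):
        if ar is None or arity(func) == ar:
            for sub in composition_function_lists(length - 1, remaining, arity(func)):
                yield (func,) + sub


def compositions(size):
    for g, after_g_size in functions_with_maxsize(size - 1):
        if arity(g) > 0:
            for hs in composition_function_lists(arity(g), after_g_size):
                yield ("C", g, hs)


def recursions(size):
    for g, size2 in functions_with_maxsize(size - 1):
        for h in functions(size2):
            if arity(h) == arity(g) + 2:
                yield ("R", g, h)


print([len(list(functions(s))) for s in range(1, 5)])  # → [8, 0, 24, 36]
```

The counts show both the gap at size 2 and the fast growth of the lists with the size.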
We now have a functional scientist for the primitive recursive functions with arity up to 2 and thus are
within the conditions to experiment with the search algorithm.
3.5 An improved enumeration
Having a functioning enumeration is not all that we want. In fact, we want our search to be as efficient
as possible. Thus, our next step is to develop an enumeration that streamlines the search procedure.
Figure 3.1: List of all the functions whose descriptions have the referred size
It is possible to define an enumeration of the primitive recursive functions without repetition starting
from one that is exhaustive (proofs can be found in [20] and [26]); however, constructing this enumeration is
“highly inefficient” (see [20]), so we will just focus on improving the previously implemented
enumeration into one that is more efficient.
In our work, we will try to identify a primitive recursive function from input data and the corresponding
output data, which makes it possible for us to know the arity of the function we want a priori. This means
that we can make some changes in the search algorithm (Algorithm 6) and, more importantly, in the listing
of the primitive recursive functions (Algorithm 7) to take into account the arity of the function we want to
discover, listing only the functions of said arity and thus performing the search only in that set of functions.
This search procedure can be observed in Algorithm 8.
If we compare the Algorithms 6 and 8, we see that the changes are simple to identify: instead of
checking if the arity is the correct one after we have the list of functions (and discarding those that
don’t have it), we introduce this parameter into the arguments of the enumeration function Functions so
that the functions we are going to search through are already those with the correct arity. This implies a
bigger change in the algorithm that performs the enumeration of the functions, which we see in Algorithm
9. Besides this improvement, we can still reduce some redundancies in the enumeration of the primitive
recursive functions. If we prevent the existence of repetitions in the list of functions in the second
argument of a description with main symbol C, we will still have exhaustiveness when it comes to listing
descriptions for every primitive recursive function (within a certain arity, since that restriction is already
being taken into account). That prevention of repetition is made by the function inList used as auxiliary
function in the definition of Composition Function Lists, and defined in Algorithm 10. Analyzing it, we
Algorithm 8 Search algorithm for a primitive recursive function given the input/output values, taking into account the arity
Input: inp as the list of tuples with the input values; out as the list of integers with the corresponding output values
Output: a description of a primitive recursive function
procedure SEARCH(inp, out)
    a ∶= length(inp[0])
    for i ∶= 1, 2, 3, . . . do
        for f in Functions(i, a) do
            t ∶= True
            for j ∶= 0 to length(inp) − 1 do
                t ∶= (f(inp[j]) == out[j])
                if t is False then
                    break
                end if
            end for
            if t is True then
                res ∶= f
                break
            end if
        end for
        if t is True then
            break
        end if
    end for
    return res
end procedure
can see that this function will compare the description we want to add to a list of descriptions with every
description already in that list. That specific comparison will be made primarily through the main symbol
of the descriptions we are comparing:
• if they are not the same, then the descriptions are different;
• if they are the same and the main symbol is S or Z, then they are equal;
• if they are the same and the main symbol is P, then we need to check if the arguments n and i are
the same. If they both are, then the descriptions are equal but if one of them is not the same then
they are different;
• if they are the same and the main symbol is R, then we will apply recursively the function same to
the first elements of the two descriptions and then to the second ones as well;
• if they are the same and the main symbol is C, then we will apply the function same to the first
arguments of the description, we will check if the length of the list of functions in the second
argument of both descriptions is the same and we will apply the function same to the elements of
both lists pair by pair.
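The case analysis above can be written compactly in Python. This sketch assumes the nested-tuple encoding ("Z",), ("S",), ("P", n, i), ("C", G, (H1, ..., Hk)) and ("R", G, H) for descriptions, and the names same and in_list mirror the functions of Algorithm 10 only loosely.

```python
def same(d1, d2):
    """Syntactic equality of two descriptions, following the cases above."""
    if d1[0] != d2[0]:                   # different main symbols
        return False
    tag = d1[0]
    if tag in ("Z", "S"):
        return True
    if tag == "P":                       # compare n and i
        return d1[1] == d2[1] and d1[2] == d2[2]
    if tag == "R":                       # compare first and second arguments
        return same(d1[1], d2[1]) and same(d1[2], d2[2])
    if tag == "C":                       # compare g, list lengths, then pairwise
        return (same(d1[1], d2[1])
                and len(d1[2]) == len(d2[2])
                and all(same(a, b) for a, b in zip(d1[2], d2[2])))
    raise ValueError(f"unknown symbol {tag!r}")


def in_list(d, descriptions):
    """True if d is syntactically equal to some description in the list."""
    return any(same(d, other) for other in descriptions)


pred = ("R", ("Z",), ("P", 2, 1))
zero = ("R", ("Z",), ("P", 2, 2))
print(in_list(pred, [zero, pred]), in_list(pred, [zero]))  # → True False
```

The comparison is syntactic only: two different descriptions of the same function, such as D and C(P(1,1),[D]), are reported as different.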
Returning to Algorithm 9, we will describe the changes we made from Algorithm 7. A general change
happens in the arguments of the functions: every function will have as argument the arity of the function
Algorithm 9 Construction of a function list composed by functions with a given description size and arity
function FUNCTIONS(size, ar)
    if size ≤ 0 then
        pass
    else if size == 1 then
        if ar == 0 then
            yield Z()
        else if ar == 1 then
            yield S()
        end if
        for i ∶= 1 to ar do
            yield P(ar,i)
        end for
    else
        for composition in Compositions(size, ar) do
            yield composition
        end for
        for recursion in Recursions(size, ar) do
            yield recursion
        end for
    end if
end function

function FUNCTIONS WITH MAXSIZE(size, ar)
    for subsize ∶= 1 to size do
        for func in Functions(subsize, ar) do
            yield func, size − subsize
        end for
    end for
end function

function COMPOSITION FUNCTION LISTS(length, size, ar)
    if length == 0 then
        if size == 0 then
            yield []
        else
            pass
        end if
    else
        for function, remaining_size in Functions With Maxsize(size, ar) do
            for sublist in Composition Function Lists(length − 1, remaining_size, ar) do
                if not inList(function, sublist) then
                    yield [function] + sublist
                end if
            end for
        end for
    end if
end function

function COMPOSITIONS(size, ar)
    for i ∶= 1 to size − 2 do
        for j ∶= 1 to size − 2 do
            for function_list in Composition Function Lists(i, j, ar) do
                for g in Functions(size − j − 1, i) do
                    yield C(g, function_list)
                end for
            end for
        end for
    end for
end function

function RECURSIONS(size, ar)
    for function, size2 in Functions With Maxsize(size − 1, ar − 1) do
        for function2 in Functions(size2, ar + 1) do
            yield R(function, function2)
        end for
    end for
end function
Algorithm 10 Function that indicates if a description is already in a list of descriptions
function INLIST(obj, objlist)
    b ∶= False
    for obja in objlist do
        b ∶= same(obj, obja)
        if b == True then
            break
        end if
    end for
    return b
end function

function SAME(obj1, obj2)
    if type(obj1) != type(obj2) then
        return False
    else
        if obj1 is S or Z then
            return True
        else if obj1 is P then
            if obj1.i != obj2.i or obj1.n != obj2.n then
                return False
            else
                return True
            end if
        else if obj1 is R then
            return same(obj1.g, obj2.g) and same(obj1.h, obj2.h)
        else if obj1 is C then
            if not same(obj1.g, obj2.g) or length(obj1.h) != length(obj2.h) then
                return False
            end if
            for j ∶= 0 to length(obj1.h) − 1 do
                if not same(obj1.h[j], obj2.h[j]) then
                    return False
                end if
            end for
            return True
        end if
    end if
end function
we want to find. This will allow us to eliminate any operation regarding the verification of conditions
concerning the arity of the functions we are listing. We will now move on to analyzing the changes
specific to each function:
• In Functions we observe that, since we know the arity of the function we want, we don’t need
to list every description with size 1. Thus, if the arity is 0 we yield Z() and if it is 1 we yield S().
Besides this, but still for size 1, we will yield every description of every projection with the arity
given, which means that we will no longer need a constant boundary for the arity of the projection
function like we needed in Algorithm 7 (the boundary was 3 as explained before).
• In Functions With Maxsize there was no need to make further changes.
• For Composition Function Lists, the arity argument is now mandatory instead of optional. That
means that when we make the recursive call of the function, we know which value for the arity
we will provide, and thus we do not need to check whether the arity of the function resulting from applying
Functions With Maxsize is the correct one. Furthermore, we also introduced the function
inList to prevent the redundancies, as explained previously.
• Compositions is the function that demanded the most changes. Because Functions With Maxsize
now needs as input a value for the arity of the function, we can no longer find the g
function first: the arity of g depends on the length of the list of functions h, which we still do not
know, so we need to construct that list first. For that we call the function
Composition Function Lists; however, that function demands an input for the length and the size of the list.
We can deduce upper bounds for those attributes: the maximum size for that list is the
size of the main description (which from now on we will simply call size) minus the size of the symbol
C which stands at the head of the description (1) and minus the minimum size of the description of
function g (also 1). Thus, the maximum size of the list of functions is size − 2.
Regarding the length of the function list, since its maximum size is size − 2, its maximum length
occurs when every description has size 1, and so the maximum length is also size − 2. Each
list of descriptions obtained through the previous procedure is combined with every possible
and adequate description for function g. That is done by calling the function Functions, giving as
size input size − j − 1, where j is the sum of the sizes of the descriptions in the list (since
size − j − 1 is the size the description for g must have so that the sum of the sizes corresponds
to the initial size given as input), and as arity input the length of the list. In the end, it yields the
description obtained by combining all these parts.
• In Recursions, knowing a priori the arity of the description also simplified this procedure, because
we know that the arity of the first description will be the arity of the main description decremented
by one while the arity of the second description will be that of the main description incremented
by one, and so we do not need to check the arities of the descriptions generated by
Functions With Maxsize and Functions.
As an example, Figure 3.2 shows a few lists of descriptions with small size and arity.
Analyzing these lists, and comparing them with the lists in Figure 3.1, there are some things to
observe. Firstly, both enumerations agree on the non-existence of descriptions with size
2 (the reason for which has already been explored previously). Additionally, if we group the descriptions
by size, in order to be comparable with the lists in Figure 3.1, we see that there are more descriptions
for a given size in this enumeration than in the previous one. That is a result of the bound on the arity of the
projection description in the first enumeration; since the bound in the second enumeration is the arity
of the description, we have a higher bound (or an equal one for arities less than or equal to 3), which
results in more descriptions. The fact that there are more descriptions in the second enumeration
for each size does not make the subsequent search less efficient, as it may seem at first glance. In fact,
since we allow projections with arity bigger than 3, we will, in theory and in some cases, arrive at a
description of a function with smaller size than the one we would obtain from performing the search with
the first enumeration. Moreover, it is noticeable that there are no repetitions among the descriptions in
each list belonging to a composition description. This is a direct result of applying the function inList in
the enumeration algorithm presented in Algorithm 9.
We now have a (theoretically) improved enumeration with which to perform the search for a primitive recursive
function through the corresponding search algorithm (Algorithm 8).
Figure 3.2: Lists of descriptions with the referred size and arities. Panels: (a) size 1, arity 1 to 4; (b) size 2, arity 1 to 4; (c) size 3, arity 1 to 3; (d) size 3, arity 4; (e) size 4, arity 1 to 3; (f) size 4, arity 4.
3.6 From description to code
Finding a description that correctly identifies a function relating the given inputs to the corresponding
outputs is not the end of our work. Our goal is not only to find a function that relates the inputs with
the outputs but also to be able to predict outputs for other input values. To do so, we will need to find the
code for a program that computes the function described by the description found.
We have seen before in Theorem 3.1.1 that the set PRIM is exactly the set of functions that have
loop-programs (see [30]). This means that for every primitive recursive function it is possible to write a
program using only a sequence of assignments, if-clauses and sequential and nested for-loops. Thus,
our next step is to obtain such a program, in Python, for the description found. We developed a
program that does this. It uses an auxiliary list of variables so that it is possible to
perform the assignments needed for the generated program to work correctly. By construction, these
variables will always be of one of two types: tuples or integers. This will be important to keep in mind later
when we define how the assignments and the operations on the variables are made. Getting
back to the program, it begins by writing to a text file the line that defines a function in Python,
“def function(x):”, and then, on the next line and with the correct indentation, the assignment of
the input of the function to the first element of the list of variables: “a0 = x”. This list of variables is
updated throughout the execution of the program so that the correct variable is used every time; line
breaks and indentation are likewise emitted as required. Then,
the program’s execution depends on which symbol of the description it reads:
• if the symbol is Z(), then it assigns the value 0 to the correct variable.
• if the symbol is S(), then a new variable is assigned the successor of the
adequate variable. However, before doing this operation we need to make sure that the variable is
an integer and not a tuple; if it is a tuple then it has exactly one element (due to the arities of the
operations in question) and the new variable is the successor of the first element
of the tuple.
• if the symbol is P(n,i), then the value of a new variable will be the i-th element of the appropriate
variable. Conversely to the previous case, we need to make sure that the variable
in question is a tuple; if it is not, then it is an integer and so, before performing the projection, we
need to transform the integer variable into a tuple variable and only then perform the projection.
• if the symbol is C(G,[H_1, ..., H_k]), then the program writes the code for computing
the outputs of the functions with descriptions H_1,...,H_k, assigns those values to the correct
variables, creates a tuple variable composed of the outputs of those k functions and
then assigns to the correct variable the value computed by applying the function with description G to that
tuple of values.
• if the symbol is R(G,H), then the program first makes sure that the input variable is a tuple, then
assigns to a new variable the tuple that results from deleting the last element of the input tuple. It
then writes the code that computes the function with description G. Finally, it starts a
for-loop that is executed as many times as the value of the last element of the
input tuple, and inside it writes the code of the function with description H with the correct input: the
tuple obtained by deleting the last element of the input tuple, with the number of the current iteration of
the for-loop and another variable appended (this variable starts as the output of the function
with description G and is updated with the result of each execution of the loop body).
The last two cases are, of course, recursive, in the sense that when writing the lines of code for
the functions that are arguments of the symbols in question the program calls its main function. In the end, the
program writes a line of code that returns the correct variable.
We present some examples of codes obtained from descriptions that are developed using this pro-
gram. Note that the descriptions given as example do not necessarily define one of the primitive re-
cursive functions that we identified previously and summarized in Table 3.1; we present very simple
descriptions that allow us to easily understand the operation of the program.
Example 3.6.1. We begin with the simplest of the examples: the code for the description Z(). The code
output by the program is given in Figure 3.3.
Figure 3.3: Code for function with description Z()
We can see that the first two lines are the common beginning for the codes of all descriptions. Line
3 performs the attribution related to the symbol Z in the description and line 4 corresponds to the return
of the correct variable, in this case a1.
Example 3.6.2. The next example is also very simple, since it is the code of a program that computes
the function with description S(). That code can be seen in Figure 3.4.
Figure 3.4: Code for function with description S()
Once again, the first two lines of code correspond to the standard beginning of every program written
following this method. In line 3 we see the check on the type of the variable a0: if it is not an integer
then it is a (unary) tuple and thus the instruction is to add one to the first element of the variable. If it
is an integer, then we see in line 4 the instruction to add one to the variable a0. Lastly, line 5 has the
instruction to return the variable a1.
Example 3.6.3. We will now see the code of a program that computes the function with description
P(3,1), which is present in Figure 3.5.
Figure 3.5: Code for function with description P(3,1)
Here, we see that in line 3, the instruction is to see if a0 is not an integer and then, if it isn’t, to
attribute to the new variable a1 the first element of the variable a0, which has the same value as the
function’s input. If a0 is an integer, then we transform it into a tuple and only then select the intended
element.
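Figures 3.3 to 3.5 are not reproduced in this transcript. As a rough guide only, the following is a hypothetical reconstruction of what the generated code could look like, based solely on the line-by-line descriptions in Examples 3.6.1 to 3.6.3. The generator names every program `function`; the names `function_Z`, `function_S` and `function_P31`, and the use of `isinstance` for the type check, are assumptions introduced here so that the three sketches can live in one file.

```python
# Hypothetical reconstructions; the code in the original figures may differ in detail.

def function_Z(x):
    a0 = x
    a1 = 0                                       # line 3: symbol Z()
    return a1                                    # line 4: return the correct variable

def function_S(x):
    a0 = x
    if not isinstance(a0, int): a1 = a0[0] + 1   # line 3: a0 is a unary tuple
    else: a1 = a0 + 1                            # line 4: a0 is an integer
    return a1                                    # line 5

def function_P31(x):
    a0 = x
    if not isinstance(a0, int): a1 = a0[0]       # line 3: first element of the tuple
    else: a1 = (a0,)[0]                          # integer input: wrap it in a tuple first
    return a1
```

For instance, under these assumptions `function_S(3)` and `function_S((3,))` both evaluate to 4, which is the reason for the type check in line 3.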
Example 3.6.4. Let us now analyze the resulting code for a function with description
C(P(3,2),[P(1,1),S(),S()]) (Figure 3.6).
Figure 3.6: Code for function with description C(P(3,2),[P(1,1),S(),S()])
We can check that lines 3 to 8 correspond to the instructions to perform the computation for the
functions with description P(1,1), S() and S(), which are the descriptions in the list in the second part
of the argument of the symbol C in the original description. Then in line 9 we join these three variables
in a single tuple so that we can use it as input for computing the function whose description is in the first
part of the argument of the symbol C of the description (P(3,2)), and then performing said computation
in lines 10 and 11. Finally, we have the instruction that returns the value of the correct variable (a5) in
line 12.
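Since Figure 3.6 is not reproduced here, the sketch below is a hypothetical reconstruction of the twelve lines described above. The variable numbering follows the text (the result is a5); the function name `function_C` and the exact form of the type checks are assumptions.

```python
# Hypothetical reconstruction of the code for C(P(3,2),[P(1,1),S(),S()]).

def function_C(x):
    a0 = x
    if not isinstance(a0, int): a1 = a0[0]       # lines 3-4: P(1,1)
    else: a1 = (a0,)[0]
    if not isinstance(a0, int): a2 = a0[0] + 1   # lines 5-6: first S()
    else: a2 = a0 + 1
    if not isinstance(a0, int): a3 = a0[0] + 1   # lines 7-8: second S()
    else: a3 = a0 + 1
    a4 = (a1, a2, a3)                            # line 9: join the three outputs
    if not isinstance(a4, int): a5 = a4[1]       # lines 10-11: P(3,2)
    else: a5 = (a4,)[1]                          # never reached here: a4 is always a tuple
    return a5                                    # line 12
```

The described function maps x to the second component of (x, x + 1, x + 1), so it behaves like the successor.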
Example 3.6.5. In Figure 3.7 we have the code output for the description R(Z(),P(2,1)).
Figure 3.7: Code for function with description R(Z(),P(2,1))
In line 3 we make sure that from that point on the input is a tuple. Next, in line 4, a
new variable is assigned the tuple with the elements of the input tuple except the last one. In line 5, we perform the
assignment corresponding to the function with description Z(), which is the first argument of the
symbol R. Then we begin a for-loop that is executed as many times as the value of the last element of the
input tuple. Inside the loop, in line 7 we merge the tuples in question, and in lines 8 and
9 we compute the function whose description is the second argument of
the original description, with the update of the correct variable in line 10. Lastly, we return the value
of the correct variable in line 11.
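As Figure 3.7 is not available in this transcript, the following hypothetical reconstruction mirrors the eleven lines just described. The name `function_R`, the loop variable `i` and the variable numbering are assumptions; only the structure comes from the text.

```python
# Hypothetical reconstruction of the code for R(Z(),P(2,1)).

def function_R(x):
    a0 = x
    if isinstance(a0, int): a0 = (a0,)           # line 3: make sure the input is a tuple
    a1 = a0[:-1]                                 # line 4: input tuple without its last element
    a2 = 0                                       # line 5: base case, description Z()
    for i in range(a0[-1]):                      # line 6: loop as many times as the last element
        a3 = a1 + (i, a2)                        # line 7: merge arguments, iteration number, previous value
        if not isinstance(a3, int): a4 = a3[0]   # lines 8-9: P(2,1)
        else: a4 = (a3,)[0]
        a2 = a4                                  # line 10: update the accumulated value
    return a2                                    # line 11
```

With f(0) = 0 and f(y + 1) = P_{2,1}(y, f(y)) = y, this description computes the predecessor function.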
Chapter 4
A Restriction to E
We will now explore another way of attacking the problem of searching for expressions that describe
natural laws. So far we have used the primitive recursive functions as the learning environment of
our scientist. However, an environment of this type may not be necessary, since we suspect
that every natural law can be expressed in a much simpler way: as an elementary function.
4.1 Elementary functions
We have already defined the set E of elementary functions in Definition 3.1.2. In fact, there are functions
that do not need to be included in the basis set of E because they can themselves be obtained by
applying the composition, bounded sum and/or bounded product operators to some simpler functions;
these functions are the addition and product operations. They are constructed from other
elementary functions in the following way:
• $\mathrm{prod}(x, y) = xy = \sum_{z < y} x$
• $\mathrm{add}(x, y) = x + y = \sum_{z < y} \big(((x \dot{-} z) \dot{-} xz) + 1\big)$
For the product it is easy to understand what happens: we sum x a total of y times. The addition
is a little trickier; we want to add 1 to x a total of y times. To do so, we perform a sum bounded by y such
that for $z = 0$ the term is $((x \dot{-} 0) \dot{-} 0) + 1 = x + 1$ and for $0 < z < y$ the term is $((x \dot{-} z) \dot{-} xz) + 1 = 0 + 1 = 1$, since
$xz \geq x \dot{-} z$ for any $z \geq 1$. This way, for $y \geq 1$, we obtain $(x + 1) + (y - 1) = x + y$.
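The two bounded-sum constructions can be checked numerically. The sketch below is only an illustration: the helper names `monus`, `prod_by_sum` and `add_by_sum` are ours, and `monus` stands for the truncated subtraction on the naturals.

```python
def monus(a, b):
    """Truncated subtraction on the naturals: max(a - b, 0)."""
    return max(a - b, 0)

def prod_by_sum(x, y):
    # Bounded sum of the constant term x over z < y.
    return sum(x for _ in range(y))

def add_by_sum(x, y):
    # Bounded sum over z < y of ((x monus z) monus x*z) + 1.
    return sum(monus(monus(x, z), x * z) + 1 for z in range(y))

for x in range(6):
    for y in range(6):
        assert prod_by_sum(x, y) == x * y
    for y in range(1, 6):          # the sum is empty when y = 0
        assert add_by_sum(x, y) == x + y
```

The check for the addition is restricted to y ≥ 1: when y = 0 the bounded sum has no terms and evaluates to 0, so that case is not covered by the expression as written.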
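We show that the following functions are elementary by proving that they are the composition of
elementary functions:
• $\exp(x, y) = x^y = \prod_{z < y} x$
• $\mathrm{pred}(x) = x \dot{-} 1 = x \dot{-} ((x + 1) \dot{-} x)$
• $\mathrm{sg}(x) = \begin{cases} 0, & x = 0 \\ 1, & x \neq 0 \end{cases} = x \dot{-} (x \dot{-} 1)$
• $\overline{\mathrm{sg}}(x) = 1 \dot{-} \mathrm{sg}(x)$
• $\mathrm{dist}(x, y) = \lvert x - y \rvert = (x \dot{-} y) + (y \dot{-} x)$
• $\mathrm{fact}(x) = x! = \prod_{z < x} (z + 1)$
• $\min(x, y) = x \dot{-} (x \dot{-} y)$
• $\max(x, y) = (x \dot{-} y) + y$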
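A quick numerical sanity check of these expressions, written as a sketch: the Python names below (`monus`, `cosg` for the complement of the sign function, `minimum`, `maximum`) are our own, and the comparison targets are ordinary Python built-ins.

```python
import math

def monus(a, b):
    """Truncated subtraction on the naturals."""
    return max(a - b, 0)

def exp(x, y):     return math.prod(x for _ in range(y))       # bounded product of x
def pred(x):       return monus(x, monus(x + 1, x))
def sg(x):         return monus(x, monus(x, 1))
def cosg(x):       return monus(1, sg(x))                      # complement of sg
def dist(x, y):    return monus(x, y) + monus(y, x)
def fact(x):       return math.prod(z + 1 for z in range(x))   # bounded product of z + 1
def minimum(x, y): return monus(x, monus(x, y))
def maximum(x, y): return monus(x, y) + y

for x in range(8):
    assert pred(x) == max(x - 1, 0)
    assert sg(x) == (1 if x != 0 else 0)
    assert cosg(x) == (1 if x == 0 else 0)
    assert fact(x) == math.factorial(x)
    for y in range(8):
        assert exp(x, y) == x ** y
        assert dist(x, y) == abs(x - y)
        assert minimum(x, y) == min(x, y)
        assert maximum(x, y) == max(x, y)
```

Note that the empty product is 1, so `exp(x, 0)` and `fact(0)` both evaluate to 1, as expected.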
There is another operator we need to define, the bounded minimalisation.
Definition 4.1.1. Bounded Minimalisation
Let f be an (n + 1)-ary function such that $f \in \mathcal{R}$. Let g be a new (n + 1)-ary function defined by the
expression
$$g(x_1, \dots, x_n, y) = \mu z < y\,(f(x_1, \dots, x_n, z) = 0) = \begin{cases} \text{the least } z < y \text{ s.t. } f(x_1, \dots, x_n, z) = 0, & \text{if such } z \text{ exists} \\ y, & \text{otherwise} \end{cases}$$
Then $g \in \mathcal{R}$. We call the operator $\mu z < y$ a bounded minimalisation.
Lemma 4.1.1. (see [12]) Let f be an (n + 1)-ary function such that $f \in \mathcal{E}$. Let g be a new function, also
(n + 1)-ary, defined by the expression $g(x_1, \dots, x_n, y) = \mu z < y\,(f(x_1, \dots, x_n, z) = 0)$. Then $g \in \mathcal{E}$, i.e. $\mathcal{E}$ is
closed under bounded minimalisation.
In fact, we can write the bounded minimalisation as follows:
$$\mu z < y\,(f(x_1, \dots, x_n, z) = 0) = \sum_{v < y} \prod_{u \leq v} \mathrm{sg}(f(x_1, \dots, x_n, u)) = \sum_{v < y} \prod_{u < v+1} \mathrm{sg}(f(x_1, \dots, x_n, u)).$$
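To see why the sum of products counts exactly the position of the first zero, the identity can be compared against a direct search. This is a minimal sketch; the function names and the test function `f` are assumptions chosen for illustration.

```python
import math

def sg(x):
    """Sign function on the naturals: 0 at 0, 1 elsewhere."""
    return 1 if x != 0 else 0

def bounded_mu_direct(f, args, y):
    # Least z < y with f(args, z) = 0, or y if there is none.
    for z in range(y):
        if f(*args, z) == 0:
            return z
    return y

def bounded_mu_sumprod(f, args, y):
    # Each product is 1 exactly when f has no zero at any u <= v,
    # so the sum counts the values v that lie before the first zero.
    return sum(
        math.prod(sg(f(*args, u)) for u in range(v + 1))
        for v in range(y)
    )

# Illustrative test function: f(x, z) = x monus z*z, zero as soon as z*z >= x.
f = lambda x, z: max(x - z * z, 0)
for x in range(30):
    for y in range(10):
        assert bounded_mu_direct(f, (x,), y) == bounded_mu_sumprod(f, (x,), y)
```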
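With this in mind, we can prove that the quotient function for integers is also elementary:
$$\begin{aligned}
\mathrm{qt}(x, y) &= \begin{cases} \lfloor y/x \rfloor, & x \neq 0 \\ 0, & x = 0 \end{cases} \\
&= \mu z \leq y\,(x = 0 \text{ or } x(z + 1) > y) \\
&= \mu z \leq y\,(x = 0 \text{ or } \overline{\mathrm{sg}}(x(z + 1) \dot{-} y) = 0) \\
&= \mu z \leq y\,(x \times \overline{\mathrm{sg}}(x(z + 1) \dot{-} y) = 0) \\
&= \sum_{v < y+1} \prod_{u < v+1} \mathrm{sg}\big(x \times \overline{\mathrm{sg}}(x(u + 1) \dot{-} y)\big)
\end{aligned}$$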
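The closed form for the quotient can be checked against Python's integer division. In this sketch the inner sign function is taken to be the complement 1 ∸ sg (called `cosg` below), which is what the condition x(z + 1) > y requires, and the bound variable of the product is called `u`; all names are ours.

```python
import math

def monus(a, b): return max(a - b, 0)
def sg(x):       return 1 if x != 0 else 0
def cosg(x):     return 1 - sg(x)      # complement of the sign function

def qt(x, y):
    # Sum over v <= y of the product over u <= v of sg(x * cosg(x*(u+1) monus y)).
    return sum(
        math.prod(sg(x * cosg(monus(x * (u + 1), y))) for u in range(v + 1))
        for v in range(y + 1)
    )

for x in range(8):
    for y in range(20):
        expected = y // x if x != 0 else 0
        assert qt(x, y) == expected
```

Note the argument order: as in the text, qt(x, y) divides the second argument by the first.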
We now wonder: if all these functions are elementary, then which primitive recursive functions are
not? To answer that, we present this statement:
Theorem 4.1.1. (see [12]) If $f(x_1, \dots, x_n)$ is an elementary function, then there is a number k such that
for all $x = (x_1, \dots, x_n)$,
$$f(x) \leq 2^{2^{\cdot^{\cdot^{\cdot^{2^{\max(x)}}}}}}$$
where the exponentiation is iterated k times.
Corollary 4.1.1. (see [12]) The function
$$f(x) = 2^{2^{\cdot^{\cdot^{\cdot^{2^{x}}}}}}$$
where the exponentiation is iterated x times is primitive recursive but it is not elementary.
We see that, even though the set of elementary functions is not equal to the set of primitive recursive
functions, it contains the functions/operations that are actually used to express most of the mathematical
equations that explain the natural laws. This means that we can reduce the scope of our search
to a set that still has everything we need but is much easier to define: E. So, our next step will be
to develop a scientist that has the set of elementary functions as its learning environment.
4.2 Notation for representation
Once again, since we already have the set E defined we need to define a notation that represents
its functions. Thus, we have adapted the notion of description of primitive recursive function to be
able to represent only elementary functions in a simpler way. But first, we need to establish which
functions/operators are going to be in the base of the inference rules for this notation. We already
know that, following Definition 3.1.2, we can construct any elementary function from the operations
$x + 1$, $P_{n,i}$, $x \dot{-} y$, $x + y$ and $xy$ using the composition, bounded sum and bounded product operators.
Furthermore, we have proved that a great number of functions are elementary. This means that we can
consider these functions to be explicitly defined in the definition of notation for the elementary functions.
By choosing some key functions, we can expedite our search by a great amount of time. By analysing
the expressions of the functions written using the expressions of other elementary functions, we see that
there is only one that will actually be of great importance to define, since it is written by the composition
of a great amount of elementary functions/operators: the quotient function for integers. Thus we present
the following notation for the elementary functions:
Definition 4.2.1. Description for elementary functions
A description of an elementary function is an expression that is inductively defined as follows:
1. The symbol EZ() is a 0-ary description that describes the constant 0.
2. The symbol ES() is a unary description that describes the successor, i.e., the 1-ary function with
the expression S(x) = x + 1.
3. The symbol EP(n,i), for any n and i such that 1 ≤ i ≤ n, is an n-ary description that describes the
projection, i.e., the n-ary function defined by the expression $P_{n,i}(x_1, \dots, x_n) = x_i$.
4. The symbol EA() is a 2-ary description that describes the addition operation, that is the function
$\mathrm{add}(x, y) = x + y$.
5. The symbol EM() is a 2-ary description that describes the subtraction operation for the natural
numbers, that is the function $\mathrm{natminus}(x, y) = x \dot{-} y = \max\{x - y, 0\}$.
6. The symbol ET() is a 2-ary description that describes the product operation, that is the function
$\mathrm{prod}(x, y) = xy$.
7. The symbol ED() is a 2-ary description that describes the integer division operation, that is the
function $\mathrm{qt}(x, y) = \begin{cases} \lfloor y/x \rfloor, & x \neq 0 \\ 0, & x = 0 \end{cases}$.
8. If G is a k-ary description, with k > 0, that describes the function g and if H_1, ..., H_k are n-ary
descriptions that describe the functions $h_1, \dots, h_k$ respectively, with n ≥ 0, then EC(G,[H_1,
..., H_k]) is an n-ary description that describes the n-ary function f defined by the expression
$f(x_1, \dots, x_n) = g(h_1(x_1, \dots, x_n), \dots, h_k(x_1, \dots, x_n))$. We say that f is obtained from g and
$h_1, \dots, h_k$ by composition.
9. If G is an n-ary description, with n > 0, that describes a function g then EBS(G) is an n-ary description
that describes the function f defined by the expression $f(x_1, \dots, x_n) = \sum_{z < x_n} g(x_1, \dots, x_{n-1}, z)$.
We say that f is a bounded sum obtained from g.
10. If G is an n-ary description, with n > 0, that describes a function g then EBP(G) is an n-ary description
that describes the function f defined by the expression $f(x_1, \dots, x_n) = \prod_{z < x_n} g(x_1, \dots, x_{n-1}, z)$.
We say that f is a bounded product obtained from g.
Just like for the description notation for the primitive recursive functions, the size of a description is
given by the number of symbols (EZ, ES, EP, EA, EM, ET, ED, EC, EBS and EBP) that make up the description.
Also, each description corresponds to only one elementary function but the reverse is not true: for a
single function there are infinitely many descriptions. We present some examples for descriptions of
elementary functions based on the rules written above:
Example 4.2.1. Let us start with the exponential function $\exp(x, y) = x^y$. We have already seen that
$\exp(x, y) = \prod_{z < y} x$, which can also be written as $\exp(x, y) = \prod_{z < y} P_{2,1}(x, z)$, and so a description for this
function is EBP(EP(2,1)).
Example 4.2.2. The next function is the predecessor function $\mathrm{pred}(x) = x \dot{-} 1$, which can also be written
as $\mathrm{pred}(x) = x \dot{-} ((x + 1) \dot{-} x)$. This means we can compose the subtraction for the natural numbers with
the projection $P_{1,1}$ and with the subtraction of the projection $P_{1,1}$ from the successor. In terms of description,
we can write it as EC(EM(),[EP(1,1),EC(EM(),[ES(),EP(1,1)])]).
Example 4.2.3. Let us now analyze the function $\mathrm{sg}(x)$, which we have already seen can be written
as $x \dot{-} (x \dot{-} 1)$. Thus this function can be obtained by applying the subtraction operation for
the natural numbers to x and to its predecessor. In terms of description, we compose the description
EM() with the description EP(1,1) and the one for the predecessor function. So, we obtain
EC(EM(),[EP(1,1),EC(EM(),[EP(1,1),EC(EM(),[ES(),EP(1,1)])])]).
Example 4.2.4. We will now deduce a description for the function $\overline{\mathrm{sg}}(x) = 1 \dot{-} \mathrm{sg}(x)$. We can furthermore
write this function as $\overline{\mathrm{sg}}(x) = ((x + 1) \dot{-} x) \dot{-} \mathrm{sg}(x)$. A description for the constant 1 can be written as
EC(EM(),[ES(),EP(1,1)]). We already have a description for the $\mathrm{sg}(x)$ function, and so we can obtain
the description for the $\overline{\mathrm{sg}}$ function: EC(EM(),[EC(EM(),[ES(),EP(1,1)]),EC(EM(),[EP(1,1),EC(EM(),
[EP(1,1),EC(EM(),[ES(),EP(1,1)])])])]).
Example 4.2.5. Now for the distance function. We know we can write it as $\mathrm{dist}(x, y) = (x \dot{-} y) + (y \dot{-} x)$.
This means that dist is the composition of the addition function with two subtractions, one of them with
swapped arguments. The resulting description is EC(EA(),[EM(),EC(EM(),[EP(2,2),EP(2,1)])]).
Example 4.2.6. The factorial function x! can be written as $\prod_{z < x} (z + 1)$, which means that this function
is the bounded product of the successor function. This translates to the description EBP(ES()).
Example 4.2.7. The next one is the minimum function, given by the expression $\min(x, y) = x \dot{-} (x \dot{-} y)$,
as we have seen before. This means that this function is the composition of the subtraction for natural
numbers with the projection $P_{2,1}$ and with the subtraction for natural numbers of the arguments. Thus, a
description for this function is EC(EM(),[EP(2,1),EM()]).
Example 4.2.8. Our last example is the maximum function, which can be written as $\max(x, y) = (x \dot{-} y) + y$, as we have already seen. This means that this function is the addition of the subtraction $x \dot{-} y$ and the second argument, which results in the description EC(EA(),[EM(),EP(2,2)]).
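The descriptions in these examples can be checked mechanically with a small interpreter for Definition 4.2.1. The sketch below is not the program developed in this work: it assumes a representation of descriptions as nested Python tuples such as `("EP", 2, 1)` or `("EC", G, [H_1, ..., H_k])`, and the function name `evaluate` is ours.

```python
import math

def evaluate(d, args):
    """Evaluate description d (a nested tuple) on a tuple of natural numbers."""
    tag = d[0]
    if tag == "EZ": return 0
    if tag == "ES": return args[0] + 1
    if tag == "EP": return args[d[2] - 1]                 # EP(n, i): i-th argument
    if tag == "EA": return args[0] + args[1]
    if tag == "EM": return max(args[0] - args[1], 0)      # truncated subtraction
    if tag == "ET": return args[0] * args[1]
    if tag == "ED": return args[1] // args[0] if args[0] != 0 else 0
    if tag == "EC":                                       # composition g(h_1(x), ..., h_k(x))
        return evaluate(d[1], tuple(evaluate(h, args) for h in d[2]))
    if tag in ("EBS", "EBP"):                             # bounded sum / product over z < x_n
        *rest, bound = args
        values = (evaluate(d[1], (*rest, z)) for z in range(bound))
        return sum(values) if tag == "EBS" else math.prod(values)
    raise ValueError(f"unknown symbol {tag!r}")

EXP  = ("EBP", ("EP", 2, 1))
ONE  = ("EC", ("EM",), [("ES",), ("EP", 1, 1)])
PRED = ("EC", ("EM",), [("EP", 1, 1), ONE])
SG   = ("EC", ("EM",), [("EP", 1, 1), PRED])
COSG = ("EC", ("EM",), [ONE, SG])
DIST = ("EC", ("EA",), [("EM",), ("EC", ("EM",), [("EP", 2, 2), ("EP", 2, 1)])])
FACT = ("EBP", ("ES",))
MIN  = ("EC", ("EM",), [("EP", 2, 1), ("EM",)])

for x in range(6):
    assert evaluate(PRED, (x,)) == max(x - 1, 0)
    assert evaluate(SG, (x,)) == (1 if x else 0)
    assert evaluate(COSG, (x,)) == (0 if x else 1)
    assert evaluate(FACT, (x,)) == math.factorial(x)
    for y in range(6):
        assert evaluate(EXP, (x, y)) == x ** y
        assert evaluate(DIST, (x, y)) == abs(x - y)
        assert evaluate(MIN, (x, y)) == min(x, y)
```

The tuple representation is chosen only because it makes the recursion on the structure of a description direct; any other tree representation would do.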
All these conclusions are summarised in Table 4.1.
Table 4.1: Elementary functions and their corresponding descriptions
Function | Description
$\exp(x, y) = x^y = \prod_{z < y} x$ | EBP(EP(2,1))
$\mathrm{pred}(x) = x \dot{-} 1 = x \dot{-} ((x + 1) \dot{-} x)$ | EC(EM(),[EP(1,1),EC(EM(),[ES(),EP(1,1)])])
$\mathrm{sg}(x) = x \dot{-} (x \dot{-} 1)$ (0 if x = 0, 1 if x ≠ 0) | EC(EM(),[EP(1,1),EC(EM(),[EP(1,1),EC(EM(),[ES(),EP(1,1)])])])
$\overline{\mathrm{sg}}(x) = 1 \dot{-} \mathrm{sg}(x) = ((x + 1) \dot{-} x) \dot{-} \mathrm{sg}(x)$ | EC(EM(),[EC(EM(),[ES(),EP(1,1)]),EC(EM(),[EP(1,1),EC(EM(),[EP(1,1),EC(EM(),[ES(),EP(1,1)])])])])
$\mathrm{dist}(x, y) = \lvert x - y \rvert = (x \dot{-} y) + (y \dot{-} x)$ | EC(EA(),[EM(),EC(EM(),[EP(2,2),EP(2,1)])])
$\mathrm{fact}(x) = x! = \prod_{z < x} (z + 1)$ | EBP(ES())
$\min(x, y) = x \dot{-} (x \dot{-} y)$ | EC(EM(),[EP(2,1),EM()])
$\max(x, y) = (x \dot{-} y) + y$ | EC(EA(),[EM(),EP(2,2)])
4.3 The search algorithm
Our next step, following the reasoning of Chapter 3, is to define a search algorithm for this set of functions.
However, just like in Section 3.3, we will first need to address the enumerability of E.
Proposition 4.3.1. The set of the elementary functions (E) is recursively enumerable.
Proof. This proof will follow the structure of the one of Proposition 3.3.1. We will construct a correspondence
q between the symbols in Definition 4.2.1 and the natural numbers in the following way:
• q(EZ()) = ⟨0⟩
• q(ES()) = ⟨1⟩
• q(EP(n,i)) = ⟨2, n, i⟩
• q(EA()) = ⟨3⟩
• q(EM()) = ⟨4⟩
• q(ET()) = ⟨5⟩
• q(ED()) = ⟨6⟩
• q(EC(G,[H_1,...,H_k])) = ⟨7, k, l, q(G), q(H_1), . . . , q(H_k)⟩, where l is the arity of the description.
• q(EBS(G)) = ⟨8, l, q(G)⟩, where l is the arity of the description.
• q(EBP(G)) = ⟨9, l, q(G)⟩, where l is the arity of the description.
By using the encoding function τ in Proposition 3.3.1, from [12], we can conclude that each tuple will
be encoded into a different natural number. If we consider the inverse correspondence of q, from the
naturals to the descriptions of the elementary functions, and assign the description EZ() to the numbers
that are not in the range of q, we see that we have an enumeration of E, and so E is
recursively enumerable.
Proposition 4.3.2. E ∈ Ex.
Proof. To perform this proof we only need to present a scientist that Ex-identifies this set of functions
(Algorithm 11, adapted from Algorithm 5). Let π ∶ N → E be an enumeration of the elementary
functions, let π_i denote the function obtained from π(i), and let ψ be the elementary function explained by a text
in the canonical form with prefix σ.
Algorithm 11 Scientist that Ex-identifies E
function M(σ ∶ SEG) ∶ N
    i ∶= 0; k ∶= 0; n ∶= length of σ
    while k < n do
        if π_i(k) = ψ(k) then
            k ∶= k + 1
        else
            k ∶= 0; i ∶= i + 1
        end if
    end while
    return i
end function
This algorithm searches for the least i ∈ N such that the output of πi applied to every k < n has the
same value obtainable from σ for ψ(k). Since π is an enumeration for the set E , then this algorithm
will be exhaustive in the set of elementary functions and since σ is a prefix of a text that explains an
elementary function, this algorithm will eventually halt.
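The behaviour of this scientist can be illustrated with a short sketch. Here a finite Python list of unary functions stands in for the enumeration π, and the list `sigma` holds the values ψ(0), ..., ψ(n − 1); both are simplifying assumptions, as are the names used.

```python
def scientist(sigma, enumeration):
    """Return the least index i such that enumeration[i](k) == sigma[k] for all k < len(sigma)."""
    i, k, n = 0, 0, len(sigma)
    while k < n:
        if enumeration[i](k) == sigma[k]:
            k += 1            # candidate i agrees so far, check the next point
        else:
            k = 0             # disagreement: restart with the next candidate
            i += 1
    return i

# Toy enumeration (hypothetical): zero, successor, square, double.
toy_pi = [lambda k: 0, lambda k: k + 1, lambda k: k * k, lambda k: 2 * k]

print(scientist([0, 1, 4, 9], toy_pi))   # index of the square function
print(scientist([0, 2, 4], toy_pi))      # index of the doubling function
```

With a finite list the loop raises an `IndexError` if no candidate matches; with a genuine enumeration of E and a prefix of a text for an elementary function, a matching index always exists, which is the termination argument given above.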
The implementation of the search algorithm will be based on Algorithm 8. We will assume the
existence of a function ElFunctions (which will be defined in Section 4.4) that, given a size and an arity,
outputs a list of every possible description with that size and arity, using
the rules in Definition 4.2.1. This results in Algorithm 12, which follows the same reasoning as the
search algorithm in Algorithm 8: for an arity known a priori we list all the descriptions with size 1 and check whether
there is a function in that list that, when applied to every value in the inp list, returns the
corresponding value in the out list. If there is no such function with a description of size 1, then we proceed
to check the list of descriptions with size 2, then the one with descriptions with size 3, and so on until we
find a description that correctly explains the values given in the input lists inp and out.
We now have an algorithm that depends on the implementation of the function ElFunctions to be
functional, and so our next step is to implement that function.
Algorithm 12 Search algorithm for an elementary function given the input/output values, taking into account the arity
Input: inp as the list of tuples with the input values; out as the list of integers with the corresponding output values
Output: a description of an elementary function
procedure SEARCH(inp, out)
    a ∶= length(inp[0])
    for i ∈ N, starting at i = 1, do
        for f in ElFunctions(i, a) do
            t ∶= True
            for j ∶= 0 to (length of inp) − 1 do
                t ∶= (f(inp[j]) == out[j])
                if t is False then
                    break
                end if
            end for
            if t is True then
                res ∶= f
                break
            end if
        end for
        if t is True then
            break
        end if
    end for
    return res
end procedure
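The control flow of this search can be sketched in Python independently of the enumeration. In the sketch below, `search` receives the enumerator and the evaluator as parameters; the toy language used to exercise it (strings over "S" and "D", read as successor and doubling) is purely hypothetical and merely stands in for ElFunctions.

```python
from itertools import count, product

def search(inp, out, candidates, evaluate):
    """Return the first description, by increasing size, consistent with all (input, output) pairs."""
    arity = len(inp[0])
    for size in count(1):
        for desc in candidates(size, arity):
            if all(evaluate(desc, x) == y for x, y in zip(inp, out)):
                return desc

# Toy stand-ins (NOT ElFunctions): a description is a string over {"S", "D"},
# applied left to right to a single argument.
def toy_candidates(size, arity):
    if arity == 1:
        for symbols in product("SD", repeat=size):
            yield "".join(symbols)

def toy_evaluate(desc, args):
    value = args[0]
    for symbol in desc:
        value = value + 1 if symbol == "S" else 2 * value
    return value

found = search([(0,), (1,), (2,)], [2, 4, 6], toy_candidates, toy_evaluate)
print(found)   # SD, i.e. x -> 2 * (x + 1)
```

As in the pseudocode, the loop over sizes is unbounded, so the call only returns when some candidate explains all the given values.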
4.4 Enumeration
In Algorithm 13 we see an implementation for ElFunctions and for the necessary auxiliary functions,
which was mainly based on Algorithm 9 with the necessary changes made:
• Besides yielding the descriptions for the zero constant, the successor and the projections with
given arity, for arity 2 the algorithm also yields the descriptions of the addition, the subtraction for
natural numbers, the product and the integer division operations.
• It no longer needs to have a constructor function regarding the recursion operator.
• Functions ElFunctions With Maxsize and ElCompositions are identical and work the same way
as Functions With Maxsize and Compositions in Algorithm 9, respectively.
• ElComposition Function Lists will work identically to Composition Function Lists in Algorithm
9. However, in this case we will not apply the restriction that forbids duplicated descriptions in the
yielded lists, since it is more useful here to allow such repetitions (for
example, to describe the square function $f(x) = x^2$, allowing duplicate descriptions
in these lists gives us a description such as
EC(ET(),[EP(1,1),EP(1,1)]); if we did not allow repetitions, a description for f would be
a lot longer).
• ElBoundedSums and ElBoundedProds are two new functions that work in a similar way: they first
check whether the arity is non-zero and, if so, for every function f with description F in the list of functions
with the same arity but with size one unit smaller, ElBoundedSums yields the description EBS(F)
and ElBoundedProds yields the description EBP(F).
Algorithm 13 Construction of a function list composed of elementary functions with a given description size and arity
function ELFUNCTIONS(size, ar)
    if size ≤ 0 then
        pass
    else if size == 1 then
        if ar == 0 then
            yield EZ()
        else if ar == 1 then
            yield ES()
        else if ar == 2 then
            yield EA(); yield EM(); yield ET(); yield ED()
        end if
        for i ∶= 1 to ar do
            yield EP(ar, i)
        end for
    else
        for composition in ElCompositions(size, ar) do
            yield composition
        end for
        for boundedsum in ElBoundedSums(size, ar) do
            yield boundedsum
        end for
        for boundedprod in ElBoundedProds(size, ar) do
            yield boundedprod
        end for
    end if
end function
function ELFUNCTIONS WITH MAXSIZE(size, ar)
    for subsize ∶= 1 to size do
        for function in ElFunctions(subsize, ar) do
            yield function, size − subsize
        end for
    end for
end function
function ELCOMPOSITION FUNCTION LISTS(length, size, ar)
    if length = 0 then
        if size = 0 then
            yield []
        else
            pass
        end if
    else
        for function, remaining size in ElFunctions With Maxsize(size, ar) do
            for sublist in ElComposition Function Lists(length − 1, remaining size, ar) do
                yield [function] + sublist
            end for
        end for
    end if
end function
function ELCOMPOSITIONS(size, ar)
    for i ∶= 1 to size − 2 do
        for j ∶= 1 to size − 2 do
            for function list in ElComposition Function Lists(i, j, ar) do
                for g in ElFunctions(size − j − 1, i) do
                    yield EC(g, function list)
                end for
            end for
        end for
    end for
end function
function ELBOUNDEDSUMS(size, ar)
    if ar == 0 then
        pass
    else
        for f in ElFunctions(size − 1, ar) do
            yield EBS(f)
        end for
    end if
end function
function ELBOUNDEDPRODS(size, ar)
    if ar == 0 then
        pass
    else
        for f in ElFunctions(size − 1, ar) do
            yield EBP(f)
        end for
    end if
end function
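Algorithm 13 maps naturally onto Python generators. The following is a sketch under our own assumptions, not the implementation used in this work: descriptions are built as plain strings, identifiers are written in snake case, and the two bounded operators share one helper, `el_bounded`.

```python
def el_functions(size, ar):
    """Yield every description (as a string) with the given size and arity."""
    if size <= 0:
        return
    if size == 1:
        if ar == 0:
            yield "EZ()"
        elif ar == 1:
            yield "ES()"
        elif ar == 2:
            yield from ("EA()", "EM()", "ET()", "ED()")
        for i in range(1, ar + 1):
            yield f"EP({ar},{i})"
    else:
        yield from el_compositions(size, ar)
        yield from el_bounded(size, ar, "EBS")
        yield from el_bounded(size, ar, "EBP")

def el_functions_with_maxsize(size, ar):
    # Pair each description of size <= `size` with the size budget left over.
    for subsize in range(1, size + 1):
        for func in el_functions(subsize, ar):
            yield func, size - subsize

def el_composition_function_lists(length, size, ar):
    # Lists of `length` descriptions whose sizes add up to exactly `size`.
    if length == 0:
        if size == 0:
            yield []
    else:
        for func, remaining in el_functions_with_maxsize(size, ar):
            for sublist in el_composition_function_lists(length - 1, remaining, ar):
                yield [func] + sublist

def el_compositions(size, ar):
    for i in range(1, size - 1):          # length of the list: 1 .. size - 2
        for j in range(1, size - 1):      # total size of the list: 1 .. size - 2
            for hs in el_composition_function_lists(i, j, ar):
                for g in el_functions(size - j - 1, i):
                    yield f"EC({g},[{','.join(hs)}])"

def el_bounded(size, ar, symbol):
    if ar == 0:
        return
    for f in el_functions(size - 1, ar):
        yield f"{symbol}({f})"

print(list(el_functions(2, 1)))
```

Generators keep the enumeration lazy, so a search that stops at the first matching description never materialises the full list for that size. Under these assumptions the square-function description EC(ET(),[EP(1,1),EP(1,1)]) appears among the descriptions of size 4 and arity 1.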
In Figure 4.1 we have, as an example, the descriptions with size and arity between 1 and 3. By comparison
with the lists in Figures 3.1 and 3.2 we see that there are now more descriptions with smaller size. This
happens because we have more symbols defined a priori, which gives rise to a greater number of possible
combinations for small size values; in fact, due to the symbols EBS() and EBP() we are now able to
have descriptions with size 2, which did not exist in the previous two enumerations. This,
along with the fact that there are now more operations identified with symbols defined a
priori, makes us expect that with this learning paradigm we will be able to reach a conjecture much
faster than with the ones presented in Chapter 3.
4.5 From description to code
Since our definition of description for elementary functions (Definition 4.2.1) is different from the one
presented in Definition 3.2.1, the reasoning used in Section 3.6 to obtain the code of a program that
computes the function defined by a description cannot be applied directly to these descriptions and so we
will have to make the necessary changes. Since the set E is contained in the set PRIM (in [12] and
already discussed in Chapter 3) and we know that for every primitive recursive function there exists
a program written only with a sequence of assignments, if-clauses and sequential and nested for-loops
(Theorem 3.1.1 and [30]), it is also possible to obtain a program with these characteristics for
the elementary functions (once again, these programs will be written in Python). We will thus
develop a program that transforms a description of an elementary function, as defined in Definition
4.2.1, into a Python program. In its structure, this program is very similar to the one described in
Section 3.6: it uses an auxiliary list of variables in order to make possible the assignments needed for the
generated program to work correctly. These variables will once again always be tuples or integers.
Figure 4.1: Lists of descriptions for elementary functions with the referred size and arities. Panels: (a) size 1, arity 1 to 3; (b) size 2, arity 1 to 3; (c) size 3, arity 1 and 3; (d) size 3, arity 2.
Much like the program in Section 3.6, this one writes to a text file, beginning by writing the
two lines “def function(x):” and “a0 = x”, with the correct indentation. Throughout the program the
auxiliary list of variables is updated, to allow the use of the correct variable every time, and line
breaks and indentation are handled in the same way. We now arrive at the differences between the two programs: the
execution depending on the symbol read. The rules are the following:
• if the symbol is EZ(), then it will attribute to the correct variable the value 0.
• if the symbol is ES(), then it will be attributed to a new variable the value of the successor of the
adequate variable after making sure that the variable is an integer and not a tuple; if it is not an
integer then it is a tuple of only one element (due to the arities of the operations in question) and
then we say that the new variable is the successor of the first element of the tuple.
• if the symbol is EP(n,i), then the value of a new variable will be the i−th element of the appropriate
variable after making sure that that variable is a tuple; if it is not, then it is an integer and so, before
performing the projection, we need to transform the integer variable into a tuple variable and only
then perform the projection.
• if the symbol is EA(), then the value of the new variable will be the sum of the two elements of
the pair that compose the previous variable. In this case we don’t have to worry about the type of
the variable since the description will have arity 2 and so the variable in question will necessarily be
a tuple (a pair, to be precise).
• if the symbol is ET(), then the value of the new variable will be the product of the two elements of
the pair that compose the previous variable. Once again and for the same reasons as for EA() we
don’t have to worry about the type of the variable.
• if the symbol is EM(), then the value of the new variable will be the truncated subtraction of the two
elements of the pair that compose the previous variable: the first minus the second if the first is
bigger than the second, and 0 otherwise. Once again, there is no need to worry about the
type of the variable.
• if the symbol is ED(), then the value of the new variable will be the quotient of the second element
of the pair over the first, if the first is not 0; if it is, then it will attribute to the new variable the value
0 (this is done in order for it to be a total function).
• if the symbol is EC(G,[H_1,...,H_k]) then the program will write the correct code for computing
the output of the functions h_1,...,h_k, attribute those values to the correct variables, create a
tuple variable composed of the different outputs of the previous k functions and then attribute to
the correct variable the value computed by applying function G to that tuple of values.
• if the symbol is EBS(G) the program will first make sure that the input variable is a tuple (if it is not, it
will turn it into one), attribute to a new variable a tuple with the values of the input variable except
the last one and then give to a new variable the value 0, which will be updated throughout the
for-loop that is written next (meaning that when the last element of the input tuple is 0 and
the for-loop does not execute, the returned value will be 0). This for-loop will be executed the
same number of times as the value of the last element of the input variable and it will execute the
following commands: for the i-th execution, a new tuple variable will be created that has the
same values as the input variable but with the last one replaced by i; then it will execute the
code corresponding to description G and update the variable declared before the for-loop began
(that started with the value 0 assigned) by adding to its current value the result of applying
the function defined by G; in the end, it returns this updated variable’s value.
• if the symbol is EBP(G) the program will act very similarly to when the symbol is EBS(G): the only
changes are that the variable that has initially value 0 in the previous case is initialized with the
value 1 (which will make the result 1 when the for-loop is not performed, i.e. when the last element
of the input pair is 0) and that this value is updated by multiplying its value with the value obtained
after the application of the code relative to description G instead of being added.
In the end, this program will write a line of code that allows the correct variable to be returned.
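As an illustration of the rules above, the following is a minimal sketch of such a translator restricted to the four arithmetic symbols EA(), ET(), EM() and ED(). It is not the program developed in this work: the names translate, load and ARITHMETIC_RULES, the use of a string instead of a text file, and the exact text of the generated lines are assumptions made only for this example.

```python
# Sketch only: a translator for the arithmetic symbols EA(), ET(), EM(), ED().
# All names and the exact text of the generated lines are illustrative
# assumptions; the input variable a0 is assumed to be a pair.

ARITHMETIC_RULES = {
    "EA()": ["a1 = a0[0] + a0[1]"],
    "ET()": ["a1 = a0[0] * a0[1]"],
    # Truncated subtraction: 0 when the first element is not the bigger one.
    "EM()": ["if a0[0] > a0[1]: a1 = a0[0] - a0[1]",
             "else: a1 = 0"],
    # Quotient of the second element over the first, 0 when the first is 0.
    "ED()": ["if a0[0] == 0: a1 = 0",
             "else: a1 = a0[1] // a0[0]"],
}


def translate(symbol):
    """Return the source text of a Python function for an arithmetic symbol."""
    lines = ["def function(x):", "    a0 = x"]
    lines += ["    " + line for line in ARITHMETIC_RULES[symbol]]
    lines.append("    return a1")
    return "\n".join(lines) + "\n"


def load(symbol):
    """Compile the generated text and return the resulting function."""
    namespace = {}
    exec(translate(symbol), namespace)
    return namespace["function"]
```

Calling print(translate("ED()")) shows a five-line program with the same shape as the one described for ED(): the two standard lines, the test for a zero divisor, the integer division and the return instruction.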
It is easily seen that this program handles the symbols EZ(), ES(), EP(n,i) and
EC(G,[H_1,...,H_k]) in the same way as the program in Section 3.6 handles Z(), S(), P(n,i)
and C(G,[H_1,...,H_k]), respectively. We will now proceed to demonstrate how this program works
for the other symbols. Just like in Section 3.6, these descriptions are simple ones; their only purpose is to
easily show how the translation from the description to code works.
Example 4.5.1. The first example will be the one with the description EA(). We see the respective code
in Figure 4.2.
Figure 4.2: Code for function with description EA()
In lines 1 and 2 we see the standard beginning for every program obtained this way. In line 3, the
addition is performed and in line 4 the command that returns the adequate variable is written.
Example 4.5.2. Next, in Figure 4.3 we see what happens when the description is EM().
Figure 4.3: Code for function with description EM()
After the common two lines of code, in line 3 we see the test of whether the first element of the input
pair is bigger than the second one and, if it is, the instruction to perform the subtraction. In line 4 we see the
instruction of what to do if the if guard in the previous line fails: it attributes to the new variable the
value 0. In line 5 we have the return of the appropriate variable.
Example 4.5.3. The next example is visible in Figure 4.4 and is related to description ET().
Figure 4.4: Code for function with description ET()
It is a simple one: in line 3 we see the product operation being applied to the two elements of the
input pair and in line 4 the return of that result.
Example 4.5.4. We will now proceed to analyze in Figure 4.5 what happens for description ED().
Figure 4.5: Code for function with description ED()
We have in line 3 the test of whether the first element of the input pair is 0 and the command for what
to do if it is: attribute to the new variable the value 0. In line 4 we have what happens if the test
fails: we simply perform the integer division (represented by //) of the last element of the pair by the
first. In line 5 we have the return of this value.
Example 4.5.5. Our next example will be the one in Figure 4.6 regarding description EBS(ES()).
Figure 4.6: Code for function with description EBS(ES())
In line 3 we see the code that makes sure that the input variable is a tuple. Next, in line 4 a new tuple
is created with the same values as the previous one except the last one. A new variable a2
with value 0 is created in line 5 and then in line 6 we proceed to define the for-loop. This loop will be executed the
same number of times as the value of the last element of the input variable and will perform the following instructions:
in line 7 a variable is created that is composed of the elements of the input tuple with the last one replaced
by i (which is the number of the iteration of the loop). That variable will be the input for the code in lines
8 and 9 relative to the function with description ES() and then, in line 10, the variable initialized as 0 before the
for-loop began will be updated by adding the result of the previous lines of code to the
value it already had. Once again, in the end we see the return instruction for the correct variable.
Example 4.5.6. Lastly we have what happens for description EBP(ES()) in Figure 4.7.
Figure 4.7: Code for function with description EBP(ES())
By comparing this example with Example 4.5.5 we see that what happens is extremely similar, except
in line 5, where the variable is initialized as 1, and in line 10, where the variable is updated by taking
the product of the values of the variables instead of their sum.
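Since the figures are not reproduced here, the following is a hedged reconstruction of what the code for EBS(ES()) and EBP(ES()) could look like, based only on the line-by-line descriptions in Examples 4.5.5 and 4.5.6. The function names, the variable names a3 and a4, the exact syntax of each line and the choice of letting i run from 0 up to the last value minus 1 are assumptions.

```python
# Reconstruction from the line-by-line descriptions; names, syntax and the
# range of the loop index are assumptions, not the code shown in the figures.

def function_ebs(x):                       # description EBS(ES())
    a0 = x
    if type(a0) is int: a0 = (a0,)         # line 3: make sure a0 is a tuple
    a1 = a0[:-1]                           # line 4: all values except the last
    a2 = 0                                 # line 5: value returned if the loop never runs
    for i in range(a0[-1]):                # line 6: as many iterations as the last value
        a3 = a1 + (i,)                     # line 7: last value replaced by i
        if type(a3) is tuple: a3 = a3[0]   # line 8: ES() works on an integer
        a4 = a3 + 1                        # line 9: successor
        a2 = a2 + a4                       # line 10: add the result
    return a2


def function_ebp(x):                       # description EBP(ES())
    a0 = x
    if type(a0) is int: a0 = (a0,)
    a1 = a0[:-1]
    a2 = 1                                 # line 5: initialized as 1 instead of 0
    for i in range(a0[-1]):
        a3 = a1 + (i,)
        if type(a3) is tuple: a3 = a3[0]
        a4 = a3 + 1
        a2 = a2 * a4                       # line 10: multiply instead of add
    return a2
```

Under these assumptions the first function adds the successors of 0, ..., x − 1 and the second multiplies them, so an input of 0 gives 0 and 1, respectively, because the loop is never executed.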
Chapter 5
Results
We will now put our algorithms to the test. To do so, we used a computer with the Windows 10 operating
system and an Intel(R) Core(TM) i5-4210U CPU @ 1.70 GHz 2.40 GHz processor, with Python
version 3.7.0 installed. Appendix B explains in which files the algorithms are implemented and
where they can be found.
In the second (Section 5.2) and the third (Section 5.3) algorithms something peculiar happens: we
noticed that sometimes our algorithms got stuck in some descriptions, i.e. they took a lot of time to
obtain the result of applying the current description to some of the input values provided. To understand what
was happening, we added a line of code that printed the current description next to its size. However,
this operation slowed the computation of the algorithms considerably and thus we only enabled it for
provisional results; every computational time presented was obtained without this functionality enabled.
Our methodology will be the following: we run the algorithm for inp and out values that explain the
functions whose descriptions we already know (Table 3.1). Then we will explore the algorithms beyond
these functions to see their limitations. Furthermore, we start by providing only one element in each of
the inp and out lists and, if the scientist returns a description that does not match the one we want, we
will slowly add information to both lists until we find the intended description.
Before beginning to see the scientists in action we need to present one definition that will help us to
understand and explain our results: the notion of locking sequence.
Definition 5.0.1. (see [32] and [11]) Locking Sequence
Let ψ ∈ R, M a scientist and σ ∈ SEG. We say that σ is a locking sequence for M on ψ if (a)
content(σ) ⊂ ψ, (b) φ_{M(σ)} = ψ and (c) for all τ ∈ SEG, if content(τ) ⊂ ψ then M(σ τ) = M(σ).
This concept is important because from the moment a scientist finds a locking sequence σ, it
converges immediately on all texts for ψ having σ as prefix, and thus if a small locking sequence is found,
then the scientist will converge very fast and will do so for many different inputs.
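To make the notion more concrete, the sketch below uses a toy scientist that returns the first hypothesis, in a fixed enumeration, that explains all the data seen so far. The three hypotheses, their order and the names scientist and looks_locking are hypothetical and chosen only for illustration; they are not the enumerations used by the algorithms of this work. Conditions (b) and (c) can only be approximated here, by comparing names and by trying one-point extensions over a bounded part of the graph of the target function.

```python
from itertools import product

# Hypothetical hypothesis space in a fixed order (an assumption made for this
# example): each entry is (name, function).
HYPOTHESES = [
    ("second argument", lambda x, y: y),
    ("successor of successor of second argument", lambda x, y: y + 2),
    ("addition", lambda x, y: x + y),
]


def scientist(data):
    """Return the name of the first hypothesis that explains every data point."""
    for name, f in HYPOTHESES:
        if all(f(*inp) == out for inp, out in data):
            return name
    return None


def looks_locking(sigma, target_name, target, bound):
    """Finite approximation of conditions (b) and (c): the conjecture on sigma
    names the target and is unchanged by every one-point extension taken from
    the graph of the target restricted to arguments below bound."""
    guess = scientist(sigma)
    if guess != target_name:
        return False
    graph = [((x, y), target(x, y)) for x, y in product(range(bound), repeat=2)]
    return all(scientist(sigma + [point]) == guess for point in graph)


def add(x, y):
    return x + y


sigma_1 = [((2, 3), 5)]               # one data point of the addition function
sigma_2 = [((2, 3), 5), ((1, 5), 6)]  # the same point plus a second one
```

With this toy enumeration, scientist(sigma_1) names the successor of the successor of the second argument, so sigma_1 is not a locking sequence for addition, whereas looks_locking(sigma_2, "addition", add, 5) holds: the second data point already rules out every earlier hypothesis, so further data consistent with addition cannot change the conjecture.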
Let us now proceed to the presentation of results.
5.1 First algorithm
In this section we will analyze the implementation of the search procedure in Algorithm 6 that uses the
enumeration described in Algorithm 7.
Experiment 5.1.1. Our first function is id ∶ N → N defined by the expression id(x) = x. The goal is that
our scientist returns the description P(1,1). In Figure 5.1 we can see that by providing as inp list the list
of tuples [(1, )] and as out list the list of integers [1], the scientist finds the description P(1,1), which we
know corresponds to the identity function, in 0.003996 seconds. This means we only needed a prefix
with length one to find a locking sequence for this scientist on this function. We tested this result for
input values (3, ), (4, ) and (5, ) and, as expected, the results were 3, 4 and 5.
(a) Lists of inp and out (b) Description found and the time it took in the bottom
(c) Code obtained
Figure 5.1: Results for the identity function searched by the first scientist
Experiment 5.1.2. The next experiment consisted of observing the behavior of the algorithm regarding the
function s ∶ N → N defined as s(x) = x + 1. By providing as inp list [(1, )] and as out list [2] the algorithm
returned the description S() (which obviously describes the function at hand) in 0.000996 seconds. To
test the returned description/code we provided as input (2, ), (3, ) and (4, ), which resulted in outputs 3,
4 and 5, as anticipated.
Experiment 5.1.3. The function pred ∶ N → N defined by the expression pred(x) = x ∸ 1 is the next
one. We provide the algorithm the lists [(1, )] (inp list) and [0] (out list) and observe the outcome: the
procedure terminates in approximately 0.004997 seconds and returns a description for the predecessor
function, R(Z(),P(2,1)). We provided as input the values (0, ), (2, ) and (7, ) to perform the test which
resulted in the outputs 0, 1 and 6, as we thought it would.
Experiment 5.1.4. Now we try to find out whether the algorithm can identify the unary zero function. By the
previous experiment, we know that we should not provide the same inp and out lists because that way
the scientist will not return the description of the zero function but the description of the predecessor
function. Thus, by providing as inp list [(2, )] and as out list [0] the procedure returned the description
R(Z(),P(2,2)), which Table 3.1 tells us is a description of the function zero ∶ N → N defined as
zero(x) = 0. This computation terminated in approximately 0.003999 seconds. We experimented with
values (7, ) and (14, ), which resulted in the output values 0 and 0, as we hoped.
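The tests reported in these experiments can be reproduced with a small evaluator for descriptions. The following is a sketch under stated assumptions, not the implementation referred to in Appendix B: descriptions are encoded as nested tuples, and R(g, h) is read as f(x̄, 0) = g(x̄) and f(x̄, y + 1) = h(x̄, y, f(x̄, y)), a reading chosen because it agrees with the outputs reported in this chapter.

```python
# Sketch of an evaluator for descriptions built from Z, S, P, C and R.
# The tuple encoding and the argument order used for R are assumptions.

def Z():      return ("Z",)
def S():      return ("S",)
def P(n, i):  return ("P", n, i)
def C(g, hs): return ("C", g, hs)
def R(g, h):  return ("R", g, h)


def evaluate(d, args):
    """Apply the function described by d to the tuple args."""
    tag = d[0]
    if tag == "Z":
        return 0
    if tag == "S":
        return args[0] + 1
    if tag == "P":
        return args[d[2] - 1]            # i-th element, counting from 1
    if tag == "C":
        return evaluate(d[1], tuple(evaluate(h, args) for h in d[2]))
    # R(g, h): recursion on the last argument, computed with a for-loop.
    *xs, n = args
    value = evaluate(d[1], tuple(xs))
    for y in range(n):
        value = evaluate(d[2], tuple(xs) + (y, value))
    return value


identity = P(1, 1)
pred = R(Z(), P(2, 1))
zero = R(Z(), P(2, 2))
```

For instance, evaluate(pred, (7,)) gives 6 and evaluate(zero, (14,)) gives 0, matching the tests of Experiments 5.1.3 and 5.1.4.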
Experiment 5.1.5. Let us now see what happens with the function sx ∶ N² → N defined as sx(x, y) = x + 1.
Providing as inp list the list [(2,4)] and as out list the list [3], we can see (Figure 5.2) that the scientist
finds an appropriate description associated with this function: C(S(),P(2,1)). This computation took
approximately 0.001956 seconds. By testing with other input values ((3,5) and (7,2), which output
respectively 4 and 8) we have stronger reasons to believe that the description is adequate.
(a) Lists of inp and out (b) Description found and the time it took in the bottom
(c) Code obtained
Figure 5.2: Results for the successor function after the projection of the first argument searched by the first scientist
Experiment 5.1.6. We proceeded to analyze the results for the function add ∶ N² → N defined by the
expression add(x, y) = x + y. By providing the algorithm with [(2,3)] as inp list and [5] as out list, we see
that the algorithm terminates in 0.004942 seconds and returns the description C(S(),C(S(),P(2,2))),
which corresponds to the function defined as the successor of the successor of the second argument.
This was not the outcome we intended (as shown by the fact that the test made with input (9,7)
resulted in output 9 instead of 16). Thus there was a need to provide more information, which was done
by extending the inp and out lists with other data points. This means that, for the first time, the
information we provided to the scientist at the first attempt was not a locking sequence for the scientist on
the function at hand. Our next attempt at this function was made with the lists [(2,3), (1,5)] and [5,6]
as inp list and out list, respectively. This computation terminates in the expected description
R(P(1,1),C(S(),P(3,3))) (see Table 3.1), taking 0.012939 seconds to do so (Figure 5.3). By testing
the input pair (5,8), which resulted in output 13, we obtain more evidence that the description is an
adequate one. Furthermore, we wanted to see the behaviour of the scientist with an inp list with more pairs
and/or with elements of greater value. Thus, we first gave the scientist inp list [(1,4), (2,1), (3,2), (0,6)]
and out list [5,3,5,6], followed by providing as inp list the list [(13,24), (35,41), (133,256), (420,513)]
and as out list the list [37,76,389,933]. The description returned in both cases was the same as in the
last attempt; the first needed 0.015591 seconds to terminate while the second took 0.067953 seconds
to do so.
(a) Lists of inp and out (b) Description found and the time it took in the bottom
(c) Code obtained
Figure 5.3: Results for the second attempt for the addition function searched by the first scientist
Experiment 5.1.7. Next we experiment with the function sub ∶ N² → N defined as sub(x, y) = x ∸ y.
By firstly providing as inp and out lists the lists [(5,2)] and [3], respectively, the procedure returns the
description C(S(),[P(2,2)]) in 0.001001 seconds, which corresponds to the function that calculates
the successor of the second argument. In fact, this function also explains correctly the information
provided to the algorithm, but if we test it for other values it is not what we intended (for example for input
(7,2) it outputs 3 instead of 5) and so, like we did in the previous example, we will need to increase
the data given to the algorithm. Now we want to expand our inp and out lists so that the scientist
understands that this description is not the one we want. By adding to our inp list
the tuple (2,0) and to our out list the element 2, we observe that the scientist returns the description
R(P(1,1),C(S(),[C(S(),[P(3,2)])])) (Figure 5.4), which still is not an adequate description for this
function, as shown by the tests performed: for input tuple (6,4), instead of returning the result 2, it output
the value 5. We then proceeded to a third attempt to find a desired description by providing the
scientist the list [(5,2), (2,0), (4,1)] as inp list and the list [3,2,3] as out list. This time the result we got
was a good one, with the scientist returning the description R(P(1,1),C(R(Z(),P(2,1)),[P(3,3)]))
in 0.432695 seconds, which is the description deduced in Example 3.2.4. After this we tried again with
only two points in order to see if it was possible to find a locking sequence with length two, as opposed to
the previous one, which had length three. To do so, this time our inp and out lists were [(5,2), (4,1)]
and [3,3], respectively. We then observe that the scientist found the intended description in 0.495662
seconds, as we can see in Figure 5.5. Furthermore, we performed another attempt with a larger
number of input pairs with greater elements, so we provided inp list [(34,12), (25,40), (151,72), (627,728)]
and out list [22,0,79,0]. This resulted in the scientist returning the same description as in the previous
attempt, which took it 4.57001 seconds to find.
(a) Lists of inp and out (b) Description found and the time it took in the bottom
Figure 5.4: Results for the second attempt for the subtraction function searched by the first scientist
(a) Lists of inp and out (b) Description found and the time it took in the bottom
(c) Code obtained
Figure 5.5: Results for the repetition of the second attempt for the subtraction function searched by the first scientist
Experiment 5.1.8. We then tried to see if the scientist was able to find the description of the
product function prod ∶ N² → N defined as prod(x, y) = x × y. Our first attempt was to see if it was
possible with the inp list [(2,3)] and the out list [6]. After 0.014799 seconds, the scientist returned
the description R(S(),C(S(),[P(3,3)])), which, besides not being the intended description (see
Table 3.1), does not output the correct value when tested for input (6,4), since it outputs 11 instead of
24, making us conclude that this description is not adequate to describe this function. We then
enlarged the lists provided to the scientist and we gave as inp list [(2,3), (5,2)] and as out list [6,10].
The scientist took 8495.14 seconds (approximately 2 hours and 20 minutes) to find the description
R(R(Z(),P(2,2)),C(R(P(1,1),C(S(),[P(3,3)])),[P(3,1),P(3,3)])), which is an appropriate one
to describe the product function since it is equal to the one deduced in Example 3.2.5. This is
corroborated by performing the test for the input values (5,6) and (7,3) (which returned 30 and 21, respectively).
Remark: Do not worry about the computational time for this experiment; the algorithm used in Section
5.2 will present computational times much more adequate for this search (see Experiment 5.2.3) and
the one in Section 5.3 already has a description for the product function defined a priori.
Due to the results of the last experiment, it is obvious that this algorithm is not very efficient, since
it took almost two and a half hours to find the description of a function as simple as the product. Thus,
since the description for the distance function is much bigger than the one for the product, we will not
proceed to experiment with that function. We will now perform some experiments for functions for which
we do not know descriptions a priori.
(a) Lists of inp and out (b) Description found and the time it took in the bottom
(c) Code obtained
Figure 5.6: Results for the second attempt for the product function searched by the first scientist
Experiment 5.1.9. An experiment was made to see if the scientist would find a description for values
obtained through the function f ∶ N → N defined as f(x) = 2x. We first provided [(0, )] and [0] as inp and
out lists, respectively; it returned the description for the identity function P(1,1), which is obviously not an
intended result. We then added one element to each list in order to provide as inp the list [(0, ), (1, )] and
as out the list [0,2]; it returned the description R(Z(),C(S(),[C(S(),[P(2,1)])])). By testing this
result with other values, we see that this is still not a description for the function in question, since it outputs
7 for input (6, ) and 9 for input (8, ) (in fact, this description describes the function that outputs
0 if the input is 0 and the successor of the input otherwise). We then augmented the inp and out lists one
more time to [(0, ), (1, ), (2, )] and [0,2,4]; this time, it returned R(Z(),C(S(),[C(S(),[P(2,2)])])),
which we tested for other values. For input 12 it returned 24 and for input 25 it returned 50, which gives
us confidence that this is an adequate description for the function f in question. It took the scientist
0.658629 seconds to do so. We then saw what happened if we inserted only one element in each list,
but a bigger one. For example, for the pair of values for inp and out [(3, )] and [6] it returned the
description C(S(),[C(S(),[S()])]), which is simply the successor applied three times and is not what
we intended. However, for inp and out [(12, )] and [24], the description returned was the one we were
looking for: R(Z(),C(S(),[C(S(),[P(2,2)])])); this execution took 0.730134 seconds to halt, which
means that, with a small loss in computational time, we can find a smaller locking sequence for
this experiment.
Experiment 5.1.10. Our next experiment is the one regarding the function f ∶ N² → N which is
defined as f(x, y) = (x + y) ∸ 1. We began by inserting [(1,2)] as inp and [2] as out. As expected, it
returned the description for the projection of the second element of the pair, P(2,2). We then
proceeded to enlarge both the inp and out lists to [(1,2), (2,1)] and [2,2], respectively, which resulted in
the scientist returning the description R(R(Z(),P(2,1)),C(S(),[P(3,3)])). We tested this result for
other input values: for (12,4) it returned 15 and for (35,21) the output was 55, which are the correct
outputs for applying the function in question to said input values. So we conclude that, to the best
of our knowledge, this last description is one that explains the function f(x, y) = (x + y) ∸ 1 adequately
and we observe that the scientist took 0.995186 seconds to find it. Furthermore, we once again tried
to find this description using only one element in each list: we succeeded for inp list [(19,8)] and out
list [26], for a computation time of 0.825140 seconds. This time, finding a smaller locking sequence
actually resulted in a gain when it comes to the computational time of the procedure. We also
intended to observe the result of providing bigger lists and/or lists with greater values to the scientist, and
so we first provided the inp list [(2,0), (5,4), (2,2), (1,3)] and out list [1,8,3,3] and then we provided
[(25,31), (48,57), (237,192), (540,371)] as inp list and [55,104,428,910] as out list. This resulted in the
output of the same description in both cases as in the last two attempts, in computations that took
0.611785 and 2.78675 seconds to terminate, respectively.
We present a summary of the experiments performed in Table A.1.
5.2 Second algorithm
Now we present the results obtained by executing the search algorithm in Algorithm 8, which used the
enumeration procedure in Algorithm 9. We will not repeat the first experiments presented in Section 5.1,
since those results will be extremely similar to the ones presented there; we want to see the differences
between these algorithms in the cases where the first one had more difficulty returning an adequate
description. In this algorithm, we already have descriptions with projections with arity greater than
3, which increases the possibility of the scientist finding appropriate descriptions for the concerned
functions that are different from the ones in Table 3.1.
Experiment 5.2.1. Our first experiment concerns the addition function. We provided as inp list
[(2,3)] and as out list [5], which resulted in the scientist returning C(C(S(),[S()]),[P(2,2)]); although
this is not the same one as the first description obtained in Experiment 5.1.6, it performs the same
operation: the successor of the successor of the second argument of the input pair. Once again, by enlarging
both inp and out lists to [(2,3), (1,5)] and [5,6], respectively, we find, in only 0.002997 seconds, an
adequate description (that is equal to the one found in Experiment 5.1.6): R(P(1,1),C(S(),[P(3,3)])).
Experiment 5.2.2. For the subtraction function for the natural numbers, we began by providing the
same inp and out lists as in Experiment 5.1.7: [(5,2)] and [3], respectively. It returned the same result,
C(S(),[P(2,2)]). We then increased the lists the same way, [(5,2), (2,0)] for inp and [3,2] for out.
This time, the returned description was R(P(1,1),R(P(2,1),P(4,3))). Since we did not know which
function this description corresponds to, we tested it to understand its behaviour. We saw that for pairs (x, y)
such that x > y the obtained result explained correctly the subtraction function: for input (8,3) it returned 5
and for input (15,7) it returned 8. However, when testing for (4,7), for example, it output 2 instead of 0,
leading us to conclude that this description is not an adequate one. To prevent that from happening, we
chose specific values to add to the inp and out lists, for example (4,7) to inp and 0 to out. This resulted
in the return of the description R(P(1,1),C(R(Z(),P(2,1)),[P(3,3)])), which we know is adequate to
explain the function at hand (see Table 3.1). This computation took 0.041919 seconds to terminate. In the
same way, just by giving as inp [(5,2), (4,7)] and out [3,0] the scientist was still able to find the correct
description in approximately the same computational time. Also, we wanted to see what happened
when the inp list provided was composed of more input pairs and of inputs with bigger values. Thus, we
first gave the scientist the list [(2,1), (3,6), (4,0), (5,2)] as inp list and the list [1,0,4,3] as out list; the scientist
returned the same description as the one in the last attempt after 0.031918 seconds of searching. Then,
we provided the scientist with inp list [(20,10), (15,7), (34,57), (60,61)] and with out list [10,8,0,0], which
resulted in the scientist returning the same description as in the last two attempts, in a computation that
took 0.578816 seconds.
Experiment 5.2.3. We now turn to the product function. We also began with inp list [(2,3)] and out list
[6], like in Experiment 5.1.8. This returned the description R(P(1,1),C(R(Z(),P(2,1)),[P(3,3)])),
which we already saw in Experiment 5.1.8 was not an adequate description for this function. We then
proceeded to enlarge the inp and out lists in the same way: to [(2,3), (5,2)] and [6,10]. With this
information, the scientist returned the description R(R(Z(),P(2,2)),R(P(2,1),C(S(),[P(4,4)]))) in
only 1.12452 seconds (Figure 5.7), which contrasts a lot with the time needed to perform the computation
in Experiment 5.1.8. By testing this result with other values, we have no reason to suspect that this is
not an adequate description for the product function: (2,9) returned 18, (6,4) returned 24 and (85,96)
output 8160. Furthermore, we wanted to see what would happen if a larger inp list and an inp list with
bigger values were given to the scientist. Thus, we provided [(5,0), (2,3), (4,3), (6,3)] as inp list and out
list [0,6,12,18], which resulted in the scientist returning the same presumably adequate description as in the
previous attempt in a computation that took 3.42313 seconds. Then we proceeded to give the scientist
[(7,12), (30,14), (126,73), (256,421)] as inp list and [84,420,9198,107776] as out list. With this data, the
scientist did not get past the verification of R(S(),R(P(2,1),R(P(3,1),C(S(),[P(5,5)])))).
Experiment 5.2.4. The next function is the double function f ∶ N → N defined by f(x) = 2x. This time
we will start with inp and out lists different from those in Experiment 5.1.9. We start with inp list
[(2, )] and out list [4]. This resulted in the output of the description C(S(),[S()]), which is obviously
not a description for the double function since it describes the successor of the successor of the input.
We then attempted with [(2, ), (3, )] as inp list and [4,6] as out list, which returned the description
R(Z(),C(C(S(),[S()]),[P(2,2)])) in 0.030945 seconds. This is a description that we know from
Experiment 5.1.9 describes the function at hand, to the best of our knowledge. We also tried to see if we
could find this description with only one element in each list and so we gave to the scientist the inp list
[(5, )] and the out list [10]. This also returned the correct description, but this time in 0.040974 seconds,
which is a slightly worse time than that of the computation that returned the same description with
two elements in each of the inp and out lists.
(a) Lists of inp and out (b) Description found and the time it took in the bottom
(c) Code obtained
Figure 5.7: Results for the second attempt for the product function searched by the second scientist
Experiment 5.2.5. Let us now see what happens regarding the function f ∶ N² → N defined by the
expression f(x, y) = (x + y) ∸ 1. First we provided as inp [(1,2)] and as out [2]; this returned the description
P(2,2), which is obviously not a result we wanted to obtain. We then enlarged inp to [(1,2), (2,1)] and
out to [2,2], to which the scientist returned the description C(S(),[R(P(1,1),R(P(2,1),P(4,3)))]).
By testing this description, we saw that it is still not an adequate one for describing this function
since for input (4,10) it returned the value 5 instead of 13. We thus enlarged again our input lists to
[(1,2), (2,1), (2,4)] (inp) and to [2,2,5] (out). This computation took 0.071959 seconds and returned the
description R(R(Z(),P(2,1)),C(S(),[P(3,3)])) (Figure 5.8), which we believe is an adequate
description for this function from Experiment 5.1.10 and from testing it for other values (for input (8,7) it output
14 and for (4,9) it returned 12, as expected). We tried to see if it was possible to achieve a correct
description using only one element in each list, which we did with inp list [(13,6)] and out list [18] in
0.141918 seconds, taking roughly double the time of the search with inp [(1,2), (2,1), (2,4)]
and out [2,2,5]. Moreover, we also wanted to see if there was any difference in providing bigger lists
or lists with pairs containing greater elements. To do so, we began by providing the scientist the lists
[(2,6), (3,0), (4,2), (1,5)] and [7,2,5,5] as inp and out lists, respectively. This resulted in the scientist
returning the same supposedly appropriate description as in the previous attempt in a computation that
took 0.215976 seconds. Next, we gave the scientist inp list [(16,24), (73,51), (127,245), (318,182)] and
out list [39,123,371,499], which resulted in the scientist returning the same description as in the last attempt,
in 2.17263 seconds.
(a) Lists of inp and out (b) Description found and the time it took in the bottom
(c) Code obtained
Figure 5.8: Results for the third attempt for the function f(x, y) = (x + y) ∸ 1 searched by the second scientist
Experiment 5.2.6. Our next experiment concerned the function f : N² → N defined by the expression
f(x, y) = (x + y) · x. First, we inserted inp list [(3,1)] and out list [12], which resulted in the scientist
returning the description R(S(),R(P(2,2),R(P(3,1),C(S(),[P(5,5)])))) in a computation that
took 1.26955 seconds. We tested this result for the input (5,3); it should have returned 40 but instead
it returned 757, which means that this description is not adequate. We then tried
inp list [(3,1), (2,4)] and out list [12,12]; in this case the scientist spent a very long time on the
verification of the description C(S(),[R(S(),R(P(2,2),R(P(3,3),C(C(S(),[S()]),[P(5,5)]))))]).
This happens because computing the function given by this description requires
several nested for-loops whose number of iterations is very large, even for small input values
such as the ones given. To try to prevent this, we changed the inp list to contain
even smaller values and the out list to contain their corresponding outputs: our inp became
[(0,1), (1,0), (1,2), (2,1), (0,2), (2,0), (2,2)] and our out [0,1,3,6,0,4,8]. In this way, we tried to minimize
the number of times those for-loops are executed, in order to prevent the scientist
from getting stuck on descriptions like these. However, with these inp and out lists the search
still did not get past the description R(P(1,1),R(P(2,1),R(P(3,1),R(P(4,1),R(P(5,1),C(S(),[P(7,7)]))))));
for the first three input pairs, the function given by this description
produces the same outputs as the ones in the out list, but when we reach
the pair (2,1), the fact that there are at least 5 nested for-loops (one for each symbol R) once
again greatly increases the computation time. Another step we took to try to prevent this situation was
to sort the elements of our lists differently, trying to force the scientist to make first the comparisons
that we suspected would execute these loops the fewest times. Thus, we
provided inp list [(0,1), (0,2), (1,0), (1,1), (1,2), (2,0), (2,1), (2,2)] and out list
[0,0,1,2,3,4,6,8]. With this input, the scientist, after more than 24 hours of computation, still had not
found a description of a function that explains the relation between the elements of inp and those of
out, although it was not stuck in the verification of any single description. At this point, we decided to
terminate the computation.
Experiment 5.2.7. We will now see what happens when we provide the algorithm with information about
the distance function, defined by dist(x, y) = |x − y|. We began with inp list [(4,3)]
and out list [1]. This led the scientist to return the description R(P(1,1),R(P(2,1),P(4,3))) in
0.003997 seconds. However, by testing this result we concluded that it is not correct, since
for input (7,9) it returns 6 instead of 2, although it returns the correct output for input (15,9) (it
returned 6). We then changed our inp list to [(4,3), (2,6)] and our out list to [1,4]; the scientist returned the
description C(R(Z(),P(2,1)),[R(S(),P(3,2))]), which when tested with input (23,14) returned the
wrong result of 12 instead of 9. This led us to conclude that this description was also not
correct. Our next attempt was made with inp [(4,3), (2,6), (1,3)] and out [1,4,2], which returned
C(S(),[R(S(),R(R(P(1,1),P(3,2)),R(P(3,3),P(5,4))))]). For the test input (10,2), which should
have returned 8, the outcome was 6, and so we concluded that this description is not adequate. We
then tried inp list [(4,3), (2,6), (1,3), (3,7)] and out list [1,4,2,4]. The scientist output the same description
as in the last attempt, C(S(),[R(S(),R(R(P(1,1),P(3,2)),R(P(3,3),P(5,4))))]). We then appended
the last test input to the inp list and its corresponding correct output to the out list, providing the
scientist with the lists [(4,3), (2,6), (1,3), (3,7), (10,2)] and [1,4,2,4,8]; it did not get past the
verification of the description R(S(),R(P(2,1),R(P(3,1),C(C(S(),[S()]),[P(5,5)])))). At this point, we
made a drastic change in the inp and out lists provided: we adopted the same reasoning as in
Experiment 5.2.6 and inserted as inp the list [(0,1), (0,2), (1,0), (1,1), (1,2), (2,0), (2,1), (2,2)] and as
out the list [1,2,1,0,1,2,1,0]; with these lists as input, the description returned by the scientist was
C(R(S(),P(3,1)),[R(P(1,1),R(P(2,2),P(4,3))),P(2,1)]). When this result was tested with input
(2,6) the returned value was 5 which, once again, is wrong, since the output should have been
4. That means that this description still is not the correct one. We then added this test
pair to the inp list, resulting in an inp list of [(0,1), (0,2), (1,0), (1,1), (1,2), (2,0), (2,1), (2,2), (2,6)],
with respective out [1,2,1,0,1,2,1,0,4]. With this information, the scientist returned the description
R(P(1,1),C(R(S(),R(P(2,2),P(4,3))),[P(3,2),P(3,1)])), which when tested with
input (4,3) returned 3 instead of 1; once again, the scientist did not return an appropriate
description. Once again, we proceeded to add the previous test pair to the inp list, providing the scientist with
[(0,1), (0,2), (1,0), (1,1), (1,2), (2,0), (2,1), (2,2), (2,6), (3,4)] as inp and [1,2,1,0,1,2,1,0,4,1] as out.
The outcome was R(P(1,1),C(R(S(),R(P(2,2),P(4,3))),[P(3,2),P(3,1)])). We tested it for inputs
(6,2) and (4,3), for which it returned 5 and 1, respectively, thus being correct for the second input pair but
not for the first, where the result should have been 4. We then added the pair (6,2) to
the inp list and its correct output to the out list, which resulted in a computation of over 24 hours
that returned no conjecture, although it kept testing and verifying different descriptions. At this
point, we terminated the procedure without any result to present.
Experiment 5.2.8. Our last experiment with this algorithm was based on the exponential function,
defined as exp(x, y) = x^y. We started by giving the scientist the lists [(2,3)] as inp and [8] as out. The
scientist quickly found the description R(P(1,1),C(C(S(),[S()]),[P(3,3)])), which explains the relation
between these values. However, by testing it for other inputs, like (3,3) and (1,6), we see that although it
outputs the correct result for the first one (9), the output for the second one, which should be 1, is 16, and so
we conclude that this description is not the correct one. We then proceeded to enlarge our
lists: inp to [(1,3), (2,3)] and out to [1,8]. The search procedure did not get past the description
R(S(),R(P(2,1),R(P(3,1),C(R(Z(),P(2,2)),[P(5,4)])))). We then used the same technique
as in Experiment 5.2.6 and changed our inp to the list [(0,1), (0,2), (1,0), (1,1), (1,2), (2,0), (2,1), (2,2)]
and our out to [0,0,1,1,1,1,2,4]. As a result, the scientist returned the description
C(R(Z(),R(S(),P(3,1))),[R(P(1,1),R(P(2,1),R(P(3,3),C(S(),[P(5,5)]))))]), which took
13770.8 seconds to compute (approximately 3 hours and 50 minutes). However, we can see that this
description is not the correct one by testing it with input (4,1): it should have output 4 but it returned 9.
We then included this input pair in our inp list and its corresponding output in the out list; after 24
hours of computation, the scientist had not returned any conjecture, although it was not stuck in the verification
of any description, and thus we aborted the search procedure.
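The small-value lists used in Experiments 5.2.6 to 5.2.8 can be regenerated mechanically. The following is a minimal sketch, assuming the pairs are simply all (x, y) with values up to 2 in lexicographic order; the helper names small_grid and make_lists are ours and do not appear in the implementation, and the pair (0,0) is left out only because it is absent from the lists above.

```python
from itertools import product

def small_grid(limit=2):
    """All pairs (x, y) with 0 <= x, y <= limit, in lexicographic order, without (0, 0)."""
    return [p for p in product(range(limit + 1), repeat=2) if p != (0, 0)]

def make_lists(target, pairs):
    """Return the inp list together with the matching out list for a target function."""
    return pairs, [target(x, y) for x, y in pairs]

inp = small_grid()
targets = {
    "(x + y) * x": lambda x, y: (x + y) * x,   # Experiment 5.2.6
    "|x - y|": lambda x, y: abs(x - y),        # Experiment 5.2.7
    "x ** y": lambda x, y: x ** y,             # Experiment 5.2.8
}
for name, target in targets.items():
    _, out = make_lists(target, inp)
    print(name, out)
# (x + y) * x [0, 0, 1, 2, 3, 4, 6, 8]
# |x - y| [1, 2, 1, 0, 1, 2, 1, 0]
# x ** y [0, 0, 1, 1, 1, 1, 2, 4]
```

Sorting the pairs this way puts the cheapest inputs first, which is the ordering used in the last attempt of each of these experiments.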
We present a summary of all the experiments in Table A.2.
5.3 Third algorithm
Lastly, we look at the results of the search procedure in Algorithm 12, which uses the enumeration
described in Algorithm 13. Since we are now using the elementary functions as the scientist’s learning
environment, with a distinct definition of description (and consequently an entirely different way
of listing the functions), we perform experiments with the most basic functions again
(except for those that form the basis of the construction of descriptions in Definition 4.2.1: the addition,
the product, the subtraction and the division, since we know those are described by EA(), ET(), EM()
and ED(), respectively).
Experiment 5.3.1. We began the study of this algorithm’s behaviour with the predecessor function,
defined by the expression pred(x) = x ∸ 1. We started by giving as inp the list [(0, )] and as out the
list [0]. As expected, it returned the description EP(1,1), which describes the identity function. We proceeded
with [(0, ), (1, )] as inp and [0,0] as out; this resulted in the scientist returning the description
EBS(EP(1,1)), which describes the sum of all the natural numbers smaller than the one given as input. This
result fails to describe the function we want: when tested for inputs (4, ) and (13, ) it returns 6 and
78, instead of 3 and 12. We then tried the lists [(0, ), (1, ), (2, )] and [0,0,1], which resulted in the
same description being output. For the fourth attempt, we provided the list [(0, ), (1, ), (2, ), (3, )] as our inp
and [0,0,1,2] as our out. This computation, which can be seen in Figure 5.9, took 0.015645 seconds
and resulted in the description EBS(EBS(EBP(EP(1,1)))), which when tested for other input values always
returned the right output. And so, we are confident that the description found is adequate for
the predecessor function. We also tried to understand whether there was any penalty in providing more and/or
bigger values; to do so we first gave the scientist inp list [(1, ), (3, ), (5, ), (15, )] and out list [0,2,4,14],
followed by an attempt performed with [(1, ), (3, ), (5, ), (15, ), (260, )] as inp list and [0,2,4,14,259] as
out list. The result in both cases was the same apparently adequate description for this function, which was
found in 0.011996 seconds in the first case and in 2.74342 seconds in the second case.
Figure 5.9: Results of the fourth attempt for the function pred(x) = x ∸ 1 searched by the third scientist. Panels: (a) lists of inp and out; (b) description found, with the time it took shown at the bottom; (c) code obtained.
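To make the descriptions of Experiment 5.3.1 easier to read, the following sketch evaluates them with three small Python combinators. The semantics used here (bounded sum and bounded product ranging over all values strictly below the last argument, with the empty product equal to 1) are our reading of the outputs reported above, not a transcription of Definition 4.2.1, and the Python functions only borrow the names EP, EBS and EBP for convenience.

```python
import math

def EP(n, i):
    """Projection: return the i-th of n arguments (1-indexed)."""
    return lambda *xs: xs[i - 1]

def EBS(f):
    """Bounded sum: add f(..., i) for every i below the last argument."""
    return lambda *xs: sum(f(*xs[:-1], i) for i in range(xs[-1]))

def EBP(f):
    """Bounded product: multiply f(..., i) for every i below the last argument (1 if empty)."""
    return lambda *xs: math.prod(f(*xs[:-1], i) for i in range(xs[-1]))

second_guess = EBS(EP(1, 1))        # description returned for inp [(0, ), (1, )]
pred = EBS(EBS(EBP(EP(1, 1))))      # description returned in the fourth attempt

print([second_guess(x) for x in (4, 13)])         # [6, 78]
print([pred(x) for x in (0, 1, 2, 3, 4, 13)])     # [0, 0, 1, 2, 3, 12]
```

Under this reading, the inner EBS(EBP(EP(1,1))) is 0 at 0 and 1 everywhere else, so summing it below x counts the positive numbers smaller than x, which is x ∸ 1.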
Experiment 5.3.2. Our next experiment concerned the function with expression zero(x) = 0.
We provided the inp list [(3, )] and the out list [0], which resulted in the description EBP(EP(1,1)). This
description corresponds to the function that computes the product of all the numbers smaller than the input
given; however, for input (0, ) the bounded product is empty and, by definition, it outputs
1, which means that this description is not adequate for this function. With this in mind, we
added (0, ) to our inp list and 0 to the out list, resulting in [(0, ), (3, )] and [0,0]. For these inputs the
scientist returned the description EC(EBP(EP(1,1)),[ES()]) in 0.031248 seconds, which returned the
value 0 in several tests with different inputs. This makes sense, since the description
describes the function that computes the product of every natural number smaller than the successor of
the input value; the value 0 is always among these numbers, and so the output of this function is
always 0. We therefore conclude that this description is adequate, to the best of our knowledge.
Experiment 5.3.3. Now, let’s see what happens with the function with expression f(x) = 2x. Our first
attempt was made with inp [(1, )] and out [2]; that resulted in the description for the successor, ES(), which
is obviously not the correct one. We then tried inp [(1, ), (2, )] and out [2,4], to which the scientist
returned the description EC(ES(),[EBS(ES())]). By testing this result with input (5, ) we observe that it
returns the wrong value: 16 instead of 10, and so we conclude that the description is not adequate. The
next attempt was made with the list [(1, ), (2, ), (3, )] as inp and [2,4,6] as out. The returned description,
in a computation that took 0.015624 seconds, was EC(EA(),[EP(1,1),EP(1,1)]), which describes the
function that adds the input value to itself; this description is clearly adequate, and so we
do not need to make another attempt.
Experiment 5.3.4. We go on to the function defined by the expression f(x, y) = (x + y) ∸ 1. We begin
with inp list [(2,3)] and out list [4]; the result of this attempt is the description of the function
that returns the successor of the second element of the input pair: EC(ES(),[EP(2,2)]). This is
not the intended description, and so we need to run the search again. We thus extend our inp
list to [(2,3), (1,4)] and the out list to [4,4]. The description returned is EBS(EBP(EBP(EP(2,1)))),
which we tested for other values. For input pair (9,1) it returned 1 instead of 9, and so we
conclude that this description is not the correct one. For the next attempt, we provided the scientist with
inp list [(2,3), (1,4), (3,1)] and out list [4,4,3], which returned EC(EBS(EBS(EBP(EP(1,1)))),[EA()]) in
0.093745 seconds. Testing this with other values leads us to believe that this description is adequate,
since, for example, for input (14,3) it outputs 16 and for input (23,14) it returns 36.
Experiment 5.3.5. The next experiment concerned the function defined by the expression
f(x, y) = (x + y) · x. Our first attempt was made by providing the scientist with the lists inp [(1,3)] and out
[4], which returned the description of the addition function, EA(). We then advanced to another attempt
with inp list [(1,3), (2,5)] and out list [4,14]; the returned description was EC(ET(),[EA(),EP(2,1)]), found in
0.031219 seconds (Figure 5.10), which can easily be confirmed to be an adequate description
for this function, since it describes the product of the sum of the two elements of the pair with the
first one, which is exactly the behaviour of this function. This conclusion was corroborated by
testing the input values (14,3) and (7,3), which output 238 and 70, respectively. Furthermore,
we performed another two attempts, with more and greater input values, in order to compare results:
with [(3,1), (4,1), (0,6), (2,4)] as inp list and [12,20,0,12] as out list the returned description was the
same as the one returned in the last attempt, in 0.007997 seconds; likewise, with inp list
[(2,4), (10,2), (3,15), (20,98), (50,120)] and out list [12,120,54,2360,8500] the scientist also returned
the same description, in 0.016994 seconds.
Figure 5.10: Results of the second attempt for the function f(x, y) = (x + y) · x searched by the third scientist. Panels: (a) lists of inp and out; (b) description found, with the time it took shown at the bottom; (c) code obtained.
Experiment 5.3.6. We will now look at the scientist’s behaviour regarding the square function, defined
as sq(x) = x². At the beginning we defined inp list as [(2, )] and out list as [4], which resulted in the
output of the description EC(ES(),[ES()]), which describes the successor of the successor function. This
is clearly not the description we want to find, and so we proceeded to a second attempt. To do so, we
enlarged inp to [(2, ), (3, )] and out to [4,9]. For these input lists, the scientist returned in 0.046871
seconds the description EC(ET(),[EP(1,1),EP(1,1)]), which describes the product of the input value
with itself; this is the definition of the square function, and so we can conclude that this
description is a correct one.
Experiment 5.3.7. Next, we tried the exponential function, defined as exp(x, y) = x^y. For the
first attempt, inp list was [(2,3)] and out list [8]; this resulted in the scientist returning the description
EBP(EP(2,1)) in 0.005502 seconds. The test with input pair (3,2) output 9 and the one with input
(5,3) returned 125, which are the correct results for these inputs. Furthermore, this is the way we
defined the exponential function in Chapter 4, and so we concluded that this is an adequate description
for this function. Moreover, we performed another attempt with more input values, some of them
large. With inp list [(3,5), (9,3), (15,2), (20,6)] and out list [243,729,225,64000000], the result
was the same adequate description as before, in 0.006994 seconds.
Experiment 5.3.8. Advancing to the factorial function, we began by defining the inp list as [(2, )] and
the out list as [2]; unsurprisingly, the scientist returned EP(1,1), which is not the description
we want to find. Then, we tried the lists [(2, ), (3, )] and [2,6]. The scientist output the
description EBP(ES()) in 0.015621 seconds, which we tested for inputs (6, ) and (8, ); the results of
these tests were 720 and 40320, which are correct. Besides, this description corresponds to the product of the
successors of all the numbers smaller than the input value, which is the definition of the factorial function,
and so we can conclude that this description is adequate for the function at hand.
Experiment 5.3.9. Regarding the binary max function, which given a pair of elements outputs the larger
of the two, we started by providing the scientist with inp list [(3,4)] and out list [4]. It returned
the description of the projection onto the second element of the pair, EP(2,2), which is obviously not the
intended outcome. We then appended to the inp list the pair (5,2) and to the out list its corresponding
value 5. This resulted in the scientist returning the description EC(EA(),[EM(), EP(2,2)]) in a
computation that took 0.015625 seconds. This expression describes the addition of the
subtraction of the two elements of the input pair to the second one. If we go to Chapter 4, we see that this
is one of the ways to define the max function, and so we deduce that this description is correct for
this function, which is corroborated by the following tests: for input (45,20) it output 45 and for (4,10)
the result was 10. We then performed another attempt with an inp list with more and greater values. Thus
we provided the scientist with inp list [(15,20), (136,59), (420,767), (520,10)] and out list [20,136,767,520],
which resulted in the scientist returning the same appropriate description as before, in 0.021094 seconds.
Experiment 5.3.10. For the binary min function, which given a pair of elements outputs the smaller
of the two, we began with inp list [(3,4)] and out list [3]; as expected, it returned the description of the projection onto the
first element of the pair, EP(2,1). With [(3,4), (5,2)] as inp list and [3,2] as out list, we obtained, in
0.007996 seconds, the description EC(EM(),[EP(2,1),EM()]), which subtracts from
the first element of the pair the difference between the two elements of the input pair. If we go to the
expression for the min function in Chapter 4, we see that this is exactly the expression used to define
this function, and so, with the help of the tests with input pairs (60,2) and (6,14), which returned 2 and
6 respectively, we conclude that the scientist found an appropriate description for this function.
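The descriptions found in Experiments 5.3.9 and 5.3.10 rest on two identities for truncated subtraction, max(x, y) = (x ∸ y) + y and min(x, y) = x ∸ (x ∸ y). The sketch below checks them on the test values reported above; the function names are ours, chosen only for illustration.

```python
def monus(x, y):
    """Truncated subtraction on the naturals: x - y if x > y, otherwise 0."""
    return x - y if x > y else 0

def max_via_monus(x, y):
    """Reading of EC(EA(),[EM(), EP(2,2)]): (x monus y) + y."""
    return monus(x, y) + y

def min_via_monus(x, y):
    """Reading of EC(EM(),[EP(2,1),EM()]): x monus (x monus y)."""
    return monus(x, monus(x, y))

print(max_via_monus(45, 20), max_via_monus(4, 10))   # 45 10
print(min_via_monus(60, 2), min_via_monus(6, 14))    # 2 6
```

When x ≥ y the difference x ∸ y is exact, so adding y back gives x and subtracting it from x gives y; when x < y the difference is 0, which leaves y and x respectively.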
Experiment 5.3.11. We proceed to the sg function as defined in Chapter 4. We start with inp list [(0, )]
and out list [0], which unsurprisingly led the scientist to output the description of the unary projection function,
EP(1,1), as it also did when we provided inp list [(0, ), (1, )] and out list [0,1]. However, when the
scientist receives the lists [(0, ), (1, ), (2, )] and [0,1,1], it returns EBS(EBP(EP(1,1))) in virtually no time
(0.0 seconds). After testing it for other values, we believe that it is a valid description
for this function, since it output 1 for every provided input different from 0. We then tried to see whether there
was any computational penalty in providing lists with more elements to the scientist: in this case, only
when we added large numbers (for example, 500) to the inp list could we see differences in the computation
time, even if small: it increased from 0.0 seconds to only 0.235158 seconds. We then looked at what
happened for even bigger numbers, like 5000; it took 62.4937 seconds to terminate.
Experiment 5.3.12. The next function that we experimented with is the complement of sg, written s̅g̅ and defined by the expression
s̅g̅(x) = 1 ∸ sg(x). With inp list [(0, )] and out list [1], the scientist, as expected, returned the description of
the successor, ES(), which does not describe this function. When the inp and out lists were extended
to [(0, ), (1, )] and [1,0], respectively, the returned description was EBP(EP(1,1)), which when tested for
other inputs always returned 0, except for input (0, ), for which it returned 1, and so the scientist returned,
to the best of our knowledge, an adequate description for this function. This computation took 0.00997
seconds.
Experiment 5.3.13. We will now test the scientist on the distance function, which we were not able to
identify with the previous scientists (with one of which we did not even try). Our first attempt
used inp list [(3,2)] and out list [1], which returned the description EM(); obviously this
expression does not describe the distance function, since it is the one that defines the subtraction
function. We then increased both lists to [(3,2), (1,6)] and [1,5]. This time, the expression given by the
scientist was EBS(EBS(EBP(ET()))), which we then tested for a few input pairs: for (2,1) it output 0
instead of 1, for (2,5) the result was 4 instead of 3, and with input (85,23) it returned 22 when it should
have returned 62. Next, we appended one of these pairs, (2,5), to the inp list and its respective output
3 to the out list. This time, the search procedure did not get past the verification of the expression
EC(EBS(EBP(EBS(ES()))),[EBP(EA())]). The same happened for inp list [(3,2), (1,6), (2,1)] and out
list [1,5,1]. We suspected that changing the order of the pairs in the inp list could result in a different
outcome, since the verification is performed following the order of the pairs in the inp list, and so we tried
again with the same pairs, but this time our inp list was [(3,2), (2,1), (1,6)] while our out list was [1,1,5];
it made no difference to the outcome. At this point, we used the by now familiar strategy:
our inp list became [(0,1), (0,2), (1,0), (1,1), (1,2), (2,0), (2,1), (2,2)] and the out list became the respective
correct output values, [1,2,1,0,1,2,1,0]. This time, the scientist needed 3.51511 seconds to return the
description EC(EA(),[EM(),EC(EM(),[EP(2,2),EP(2,1)])]), on which we proceeded to perform some
tests: for input (85,23) we now have the correct output of 62, for pair (12,7) the result was 5, as it should be,
and when provided with (43,15) it returned the correct value of 28. In fact, by analyzing the expression
itself, we observe that it describes the addition of the subtraction of the second element of the
pair from the first to the subtraction of the first element of the pair from the second, which
is exactly the expression we used to define this function back in Section 3.2, in Example 3.2.6, and thus
we have every reason to conclude that this description is a correct one for this function.
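As a quick sanity check of this reading, the sketch below compares (x ∸ y) + (y ∸ x) with |x − y| on a small grid and on the test pairs above, and also shows why the first conjecture EM() survives the pair (3,2) but is refuted by (1,6). The function names are illustrative and not taken from the implementation.

```python
def monus(x, y):
    """Truncated subtraction on the naturals."""
    return x - y if x > y else 0

def dist_via_monus(x, y):
    """Reading of EC(EA(),[EM(),EC(EM(),[EP(2,2),EP(2,1)])]): (x monus y) + (y monus x)."""
    return monus(x, y) + monus(y, x)

# EM() alone explains the first data point but not the second one.
print(monus(3, 2), monus(1, 6))                                          # 1 0

# The final description agrees with |x - y| on the reported tests and on a grid.
print([dist_via_monus(x, y) for x, y in ((85, 23), (12, 7), (43, 15))])  # [62, 5, 28]
print(all(dist_via_monus(x, y) == abs(x - y)
          for x in range(50) for y in range(50)))                        # True
```

At most one of the two truncated differences is non-zero, so their sum is the absolute difference.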
Experiment 5.3.14. Our last experiment regarding the behaviour of the scientist with simple arithmetic
functions was performed with the function that performs the natural division of a number by 2, i.e. the
function defined by the expression f(x) = ⌊x/2⌋. We began with inp list [(1, )] and out list [0]. The result
here was the expression that describes the predecessor function, as expected. We then appended
values to the lists that would contradict this result, having now [(1, ), (3, )] as inp list and [0,1] as out list.
The scientist then returned the expression EBS(EBS(EP(1,1))), which when tested for input values (4, )
and (5, ) wrongly returned 4 and 10, respectively, instead of the correct result of 2 for both cases. This
made us provide the lists [(1, ), (3, ), (4, )] and [0,1,2], which made the scientist return the expression
EC(ED(),[ES(), EBS(ES())]) in a computation that took 0.009613 seconds to complete. The tests
performed were the following: for input (6, ) the result was 3, when we provided (8, ) the scientist returned
4, with input (15, ) the output was 7 and when given (253, ) the scientist returned 126; all these results
are correct and so we have a strong suspicion of having found an adequate description for this function.
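It is not obvious at first sight why EC(ED(),[ES(), EBS(ES())]) halves its input. The sketch below spells out one reading that reproduces every value reported above: EBS(ES()) is the triangular number 1 + 2 + ... + x = x(x + 1)/2, ES() is x + 1, and ED() is assumed to divide the second of these values by the first (this argument order is inferred from the results, not taken from Definition 4.2.1).

```python
def half_via_description(x):
    """Value of EC(ED(),[ES(), EBS(ES())]) under the reading assumed above."""
    divisor = x + 1                              # ES(): successor of the input
    dividend = sum(i + 1 for i in range(x))      # EBS(ES()): 1 + 2 + ... + x
    return dividend // divisor                   # ED(): assumed to compute dividend div divisor

print([half_via_description(x) for x in (6, 8, 15, 253)])                # [3, 4, 7, 126]
print([x for x in range(2000) if half_via_description(x) != x // 2])     # []
```

Since x(x + 1)/2 divided by x + 1 equals x/2, the integer division yields ⌊x/2⌋ for every natural x, which supports the suspicion stated above.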
Since this was the scientist with the most promising results, we then proceeded to try it on what we
set out to do: see whether it can find expressions that describe natural laws.
Experiment 5.3.15. Let us suppose we are trying to find the relation between measurements
of the electrical resistance, the current intensity and the potential difference in a section of an
electric circuit. We know that the resistance is 3 Ω and that the following pairs of values were measured
for the current and the voltage: (1,3), (3,9), (4,12) and (6,18). Suppose we want to write the voltage as
a function of the current (since the resistance is constant, for now we will not worry about giving its value to
the scientist); then we provide the scientist with the list [(1, ), (3, ), (4, ), (6, )] as the inp list and [3,9,12,18]
as the out list. The expression returned was EC(EC(EA(),[EA(),EP(2,1)]),[EP(1,1),EP(1,1)]), in a
computation that took 0.078901 seconds. On the other hand, if we try to write the current as a function of the
voltage, the inp list becomes [(3, ), (9, ), (12, ), (18, )] and the out list [1,3,4,6]; with these
lists, the scientist did not get beyond verifying the expression EC(EBS(EBP(EP(1,1))),[EBP(ES())]).
To try to overcome this problem, we added the resistance value to the pairs in the input list, in order
to facilitate the search. With this in mind, the inp list was [(3,3), (9,3), (12,3), (18,3)] and the out list
was [1,3,4,6]; this resulted in a fast computation of 0.004992 seconds that returned the description
EC(ED(),[EP(2,2), EP(2,1)]), as can be seen in Figure 5.11. So, to the best of our knowledge,
since there is no more information, the scientist was able to find an expression for the relation between
the current, the voltage and the resistance between two points in an electric circuit. In fact, the second
expression the scientist found is the one that describes Ohm’s Law in the way we are used to seeing it:
I(V, R) = V / R.
Figure 5.11: Results regarding Ohm’s Law searched by the third scientist. Panels: (a) lists of inp and out; (b) description found, with the time it took shown at the bottom; (c) code obtained.
Experiment 5.3.16. We then tried to see whether the scientist could find the relation between the distance
of a planet to the sun and the period of its orbit (i.e. whether it can find the expression that describes
Kepler’s Law). To do so, we took the values presented in [25]: our inp list is [(1, ), (4, ), (9, )] and our
out list is [1,8,27]. This resulted in the scientist not going beyond the verification of the expression
EC(EC(EBP(ES()),[EBP(ES())]),[EBS(ES())]); without being able to provide more data (for smaller
input numbers, preferably), there is no more we can do in this case.
Experiment 5.3.17. Our next experiment concerned the law of gravitation, which relates the masses
of two bodies, the distance between them and the gravitational force acting between them, and which is
expressed by the formula F = G m₁m₂ / d². Let us suppose that we are obtaining data from bodies with
masses of 2 × 10⁵ kg and 9 × 10⁵ kg, respectively. Since we know a priori that the value of the
gravitational constant G is 6.674 × 10⁻¹¹ m³ kg⁻¹ s⁻², for these masses the product G m₁m₂ has the
approximate value of 12, which gives pairs (d, F) such as (1,12) and (2,3), for example. Thus, we
provided the scientist with inp list [(1, ), (2, )] and out list [12,3], which resulted in the return of the expression
EC(EC(EBS(EA()),[EBS(ES()),ES()]),[EC(ED(),[EP(1,1),ES()])]) in 473.198 seconds. We know
that, in this case, if this description is correct for this function, then for input (3, ) the result should
be 1, since the integer division of 12 by 3² = 9 is 1. However, the test performed with input (3, ) showed that
this description is not adequate, since the respective output was 3. Next, we tried to understand whether, with
this new pair of (d, F) values, (3,1), the scientist would return a more accurate conjecture. In fact, this
time the scientist output EC(EBP(EBP(EA())),[ES(),EC(ED(),[EP(1,1),EC(ES(),[ES()])])]) in a
computation that took 1827.86 seconds. However, when testing input (4, ) the output returned was 1
when it should have been 0 (once again due to the integer division, this time of 12 by 16). We then
added these values to our lists, which resulted in the scientist not going beyond the verification of the
description EC(EC(EBS(ED()),[EP(1,1), EBP(EBP(ES()))]),[EC(ET(),[ES(), ES()])]). After this
result, we decided not to perform any more attempts at discovering an expression that describes this
function.
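The data points used in Experiment 5.3.17 follow from a short calculation, sketched below. The variable names are ours; the only assumptions are that the product G m₁m₂ is rounded to the nearest natural number and that the force is obtained by integer division, as described above.

```python
G = 6.674e-11          # gravitational constant, in m^3 kg^-1 s^-2
m1, m2 = 2e5, 9e5      # masses of the two bodies, in kg

k = round(G * m1 * m2)  # G * m1 * m2 is about 12.01, rounded to 12

def force(d):
    """Integer-valued force at distance d: k div d^2."""
    return k // d ** 2

print(k)                                  # 12
print([force(d) for d in (1, 2, 3, 4)])   # [12, 3, 1, 0]
```

These are exactly the expected outputs used above: 12 and 3 in the first attempt, 1 for the test input (3, ) and 0 for the test input (4, ).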
We present a summary of all the experiments in Table A.3.
5.4 Analysis
By observing the results obtained, we can see that the time efficiency of the algorithms improves
with the upgrades we performed, be they the changes made in the listing of the primitive recursive
functions or the change in paradigm from searching among the primitive recursive functions to restricting
that search to the set of elementary functions. Furthermore, these modifications also allowed each
scientist to find more, and more complex, functions than the previous ones. However, this
came with some setbacks, since the improvements that allowed the scientists to find more
complex descriptions also sometimes prevented them from proceeding with the search. This happens due
to the existence of expressions that describe hard-to-compute functions, which make the search
algorithm get stuck computing the application of those functions to some input values, not allowing
the scientist to compare the actual result with the expected one. This happens because, for example, the functions concerned
are computed with nested for-loops that have to be executed a tremendous number of times.
The first algorithm did not suffer from this problem because the descriptions we
could reach with it were not of this hard-to-compute nature.
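The effect of nesting can be illustrated with a small experiment. The sketch below uses the textbook scheme of primitive recursion on the last argument, run as a for-loop, and counts loop iterations; it is not the R(...) constructor of our descriptions, whose argument conventions differ, and the names rec, cost and counter are hypothetical helpers introduced only for this illustration.

```python
counter = {"steps": 0}   # total number of loop iterations executed

def rec(base, step):
    """Primitive recursion on the last argument, executed as a for-loop."""
    def f(*args):
        *params, n = args
        acc = base(*params)
        for i in range(n):
            counter["steps"] += 1
            acc = step(*params, i, acc)
        return acc
    return f

add = rec(lambda x: x, lambda x, i, acc: acc + 1)           # one loop
mul = rec(lambda x: 0, lambda x, i, acc: add(acc, x))       # two nested loops
power = rec(lambda x: 1, lambda x, i, acc: mul(acc, x))     # three nested loops
tower = rec(lambda x: 1, lambda x, i, acc: power(x, acc))   # four nested loops

def cost(f, *args):
    """Return the value of f(*args) together with the loop iterations it needed."""
    counter["steps"] = 0
    value = f(*args)
    return value, counter["steps"]

for args in [(2, 3), (2, 4)]:
    print("power", args, cost(power, *args))
    print("tower", args, cost(tower, *args))
```

With three nested loops the iteration count stays small for these inputs, while the fourth level already needs more than a hundred thousand iterations for the input (2,4); calling tower(2, 5) would not finish in any reasonable time. This is the behaviour that leaves a scientist stuck in the verification of a single description.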
Looking at the big picture of the several experiments performed, we observe that the computation
times are generally small and the number of points needed to find a description is low, i.e. we
have small locking sequences for the majority of our experiments. We also understood that the size of
these locking sequences depends not only on the function concerned but also on the values given as input:
if the function that explains the relation between the values in the inp and out lists can be described by
an expression present in the early stages of the listing of descriptions, then a very small locking sequence
is obtained; if a function can only be described by an expression that appears at a later
position in the listing of descriptions, a small locking sequence can only be achieved if the values
provided are not also explained by a description that appears earlier in the enumeration and if
these values are not big enough to cause the scientist to get stuck in the verification of some hard-to-compute
descriptions. These values sometimes were not that big: for example, with the second algorithm, the
inp list composed of the pairs (1,3) and (2,3) was enough for the procedure not to go
beyond the verification of the description R(S(),R(P(2,1),R(P(3,1),C(R(Z(),P(2,2)),[P(5,4)])))).
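The dependence of the locking sequence on the position of the adequate description in the listing can be seen in a toy version of identification by enumeration. In the sketch below the candidate list and its order are hypothetical (they are not the enumeration of Algorithm 13), and inputs are plain numbers rather than tuples; the data are those of Experiment 5.3.6.

```python
# Hypothetical enumeration of unary candidates, from "earlier" to "later".
CANDIDATES = [
    ("identity", lambda x: x),
    ("successor", lambda x: x + 1),
    ("successor of successor", lambda x: x + 2),
    ("double", lambda x: x + x),
    ("square", lambda x: x * x),
]

def first_consistent(inp, out):
    """Return the name of the first candidate that explains every (input, output) pair."""
    for name, f in CANDIDATES:
        if all(f(x) == y for x, y in zip(inp, out)):
            return name
    return None

print(first_consistent([2], [4]))         # successor of successor
print(first_consistent([2, 3], [4, 9]))   # square
```

A single point is explained by an earlier candidate, so the search stops too soon; the second point rules out every candidate listed before the square function, and from then on adding further correct points no longer changes the answer.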
Another pivotal point in understanding the efficiency of the scientists is that the order of the
elements in the inp list also influenced the outcome of the search; if the tuples that are big enough
to cause the scientist to get stuck in the verification of a description are preceded by tuples that are easy
to compute and whose results differ from the respective ones in the out list, then the scientist will
get past the verification of that description and move on with the search. Forcing these input
tuples to appear first in the inp list can only be done by trial and error, since predicting the outcome of a
situation like this is close to impossible, given that it heavily depends
on the function we are trying to find.
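The role of the order can be sketched as follows, assuming (as stated in Experiment 5.3.13) that the verification follows the order of the inp list and stops at the first mismatch. The class CountingCandidate is a made-up stand-in for a wrong but costly description, not one produced by the scientists, and the two data points are values of the distance function chosen for illustration.

```python
def verify(candidate, inp, out):
    """Check the pairs in list order and stop at the first mismatch."""
    for args, expected in zip(inp, out):
        if candidate(*args) != expected:
            return False
    return True

class CountingCandidate:
    """A deliberately costly wrong guess: counts up to x ** y in unit steps."""
    def __init__(self):
        self.iterations = 0

    def __call__(self, x, y):
        total = 0
        for _ in range(x ** y):
            self.iterations += 1
            total += 1
        return total

def work_needed(inp, out):
    """Return whether the candidate was accepted and how many iterations it ran."""
    candidate = CountingCandidate()
    accepted = verify(candidate, inp, out)
    return accepted, candidate.iterations

print(work_needed([(6, 7), (0, 2)], [1, 2]))   # (False, 279936)
print(work_needed([(0, 2), (6, 7)], [2, 1]))   # (False, 0)
```

Both orders reject the candidate, but only the second one does so without paying for the expensive tuple, which is why moving cheap refuting tuples to the front of the inp list can let the search proceed.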
On the other hand, the length of the lists provided is not as important to the efficiency of the
scientists as one might assume at first, at least compared to the other factors discussed
previously. Bigger inp and out lists did in fact slow down the search, but mostly because the values
73
provided were big enough to slow down the computation of the verification of some descriptions. We see
that when the provided values’ outputs are easily computed by the several possible descriptions listed
or if the initial tuples fail the verification right ahead, preventing the same computation and verification of
the following values, the increase in the time of computation, although existing, is residual.
This makes the efficiency of each scientist depend mainly on three factors: the position of the
adequate description in the enumeration, the magnitude of the values provided and the order in
which these values are given. The size of the lists (i.e. the number of points given) proved to
be less relevant: as long as the values are small enough to allow the computation to
proceed, the time needed to find the description is not going to be significantly greater than for smaller
but suitable lists.
Regarding the code of the generated programs, we observed that they were sometimes constructed
with several nested for-loops. Recall the statement made at the beginning of Chapter 3:
the elementary functions are those computed by a program that needs at most
two nested for-loops in its sequence of instructions. However, for some of the elementary functions
found, the resulting code had more than two nested for-loops, especially when performing the search
with the scientists built upon the primitive recursive functions. This does not mean that the
statement is wrong; it just means that the first description found that explained these functions was one
that generated a program of this sort. It is possible that, had we kept searching, we would have found a description
located later in the enumeration that also explains the function in question and
whose constructed program has at most two nested for-loops.
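To make the notion of nested for-loops concrete, the following hand-written Python sketch computes addition with a single for-loop and the product with two nested for-loops. It is not output of the program generator described in this work; the function names, including the helper nesting_depth that measures the loop nesting of a piece of Python source, are illustrative assumptions.

```python
import ast


def add_loop(x, y):
    """x + y with one for-loop: apply the successor y times."""
    result = x
    for _ in range(y):
        result += 1
    return result


def prod_loop(x, y):
    """x * y with two nested for-loops: add x to the result y times."""
    result = 0
    for _ in range(y):
        for _ in range(x):
            result += 1
    return result


def nesting_depth(source):
    """Maximum depth of nested for-loops in a piece of Python source text."""
    def depth(node):
        inner = max((depth(child) for child in ast.iter_child_nodes(node)),
                    default=0)
        return inner + (1 if isinstance(node, ast.For) else 0)
    return depth(ast.parse(source))


# A program text with two nested loops, as it could appear in a generated file.
example_program = (
    "result = 0\n"
    "for i in range(3):\n"
    "    for j in range(4):\n"
    "        result += 1\n"
)
example_depth = nesting_depth(example_program)
```

A check like nesting_depth could be applied to the text of a generated program to see whether the description found stays within two nested for-loops.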
Chapter 6
Conclusions
6.1 Achievements
The major achievement of this work was the computational development of scientists that were able to
identify simple primitive recursive functions and/or elementary functions in a short amount of time. These
scientists not only identified these functions but did so with very little information provided, resulting
in the discovery of very small locking sequences. For example, compared with the results obtained
in a similar work for finite automata in [29], the difference is striking, since there a great number
of points (sometimes more than 30) had to be provided for relatively simple automata to be identified.¹ A
possible explanation for this phenomenon is that, unlike in the automata case, the
primitive recursive functions have an underlying structure that, even for a small set of input tuples, causes
the respective outputs of different primitive recursive functions to be very distinct from one another,
which makes it easier to find the correct functions. To better understand the behaviour of the scientists,
see the analysis of the experiments' results in Section 5.4.
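Why so few points can suffice is easy to see in a toy version of identification by enumeration. The sketch below is only illustrative: the candidate list and its order are hypothetical stand-ins for the enumeration of descriptions, and first_consistent is not the search procedure implemented in this work.

```python
# Hypothetical, hand-ordered stand-in for an enumeration of descriptions.
CANDIDATES = [
    ("y + 2", lambda x, y: y + 2),   # agrees with (2,3) -> 5 only by accident
    ("x + y", lambda x, y: x + y),
    ("x * y", lambda x, y: x * y),
]


def first_consistent(inp, out, candidates=CANDIDATES):
    """Return the name of the first candidate that agrees with every point."""
    for name, func in candidates:
        if all(func(*args) == value for args, value in zip(inp, out)):
            return name
    return None


# One point is explained by an earlier, inadequate candidate ...
one_point = first_consistent([(2, 3)], [5])
# ... while a second point rules it out and the search settles on addition.
two_points = first_consistent([(2, 3), (1, 5)], [5, 6])
```

With the single point (2,3) the search stops at an inadequate candidate, while the two points (2,3) and (1,5) already force the answer x + y; any further point consistent with addition leaves that answer unchanged.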
Regarding the relation between these results and the search for empirical laws, there are some
assumptions we need to make before we can draw any conclusions. First, we assume that the measured
observations are natural numbers, which is not true in practice. However, this allows us to focus on the identification of
these laws through their form, i.e. their algebraic expressions, translated into descriptions of recursive
functions of the type N → N. This comes at a cost, since we are ignoring the existence of experimental
errors, especially in the verification of a hypothesis. We also assume that the expressions
that explain these laws have a certain underlying homogeneity, i.e. they are mainly
given, for example, by continuous functions that are not piecewise defined. This is why we did not
test the scientists with piecewise functions: the fact that such a function needs more than one mathematical
expression to be defined would make its identification much more complex and lengthy, since it would be
explained by a much longer description.² It is this belief about the nature of empirical laws that allows us to
assume that we are able to explain every natural law not only with functions in the set of the primitive
recursive functions but, even more narrowly, with functions in the class of elementary functions. If
this assumption is true, then the developed scientists can be considered "embryos" of scientists
that are actually able to identify relations in natural phenomena on their own, facilitating a scientific
discovery process that often stalls because those relations are not found.

¹ Note that this was a much simpler work, without the dimension and complexity of a dissertation. The full source code
of this project can be found at https://github.com/gamatos/gold.
² To be precise, we did experiment with the functions sg and \overline{sg}. However, they can also be easily expressed by a single,
unified algebraic expression.
6.2 Future Work
In order for our "embryo" scientists to evolve into ones that can actually discover expressions that explain
empirical laws, some improvements can be made, such as the following:
• Improve the interface of the scientist, making it more sophisticated and user friendly.
• Run these experiments on a computer with greater processing capacity. This would not
only reduce computation times but also let the scientists go further in their search, since
a better processor would make each scientist less likely to get stuck
in the verification of a description, as it would execute the so-called hard-to-compute for-loops
much more efficiently. This way, we could even provide more points to the scientist without fear of
the search being blocked in the verification of a description, thus increasing the chances of finding
a description that explains the relation between the input and output values.
• Further reduce the redundancies in the enumeration of the descriptions. For example, the
descriptions C(P(2,1),[P(2,1),D]) and C(P(2,1),[P(2,1),F]), where D and F are binary
descriptions, describe the same relation between input tuples and output
values, so they do not both need to be considered and tested. Furthermore, in this case
neither of them needs to be tested at all,
since both are redundant versions of a much simpler description,
P(2,1). If descriptions like these two were left out of the enumeration, the search would
become much more efficient, especially when such descriptions involve a great
number of for-loop executions.
• Add more functions to the basis of the rules that inductively construct the descriptions, as
was done with the natural quotient function, denoted by the symbol ED(), in the definition of
descriptions for elementary functions. A great improvement in this respect would be the addition of
constants to this basis; this way, we could write natural numbers with descriptions of size one instead
of writing them as nested compositions of the successor operation applied
to the zero constant, thus obtaining smaller expressions for functions whose
expressions contain natural numbers.
• Change the verification step to take into account the existence of experimental errors. This can be
done by changing the notion of convergence to the one in Definition 2.2.8.
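The redundancy mentioned in the list above, where composing a projection with a list of descriptions simply selects one of them, can be detected syntactically. The sketch below is a minimal illustration under our own assumptions: descriptions are represented as nested tuples (not as the classes of the actual implementation), and simplify is a hypothetical helper applying the single rewrite rule C(P(n,i),[g1,...,gn]) → gi, which is sound because all functions involved are total.

```python
# Hypothetical tuple representation of descriptions (not the thesis classes).
def Z():
    return ("Z",)

def S():
    return ("S",)

def P(n, i):
    return ("P", n, i)

def C(f, gs):
    return ("C", f, tuple(gs))

def R(g, h):
    return ("R", g, h)


def simplify(desc):
    """Apply the rule C(P(n,i), [g1..gn]) -> gi bottom-up."""
    kind = desc[0]
    if kind == "C":
        f = simplify(desc[1])
        gs = tuple(simplify(g) for g in desc[2])
        if f[0] == "P" and f[1] == len(gs):
            return gs[f[2] - 1]      # the projection just selects g_i
        return ("C", f, gs)
    if kind == "R":
        return ("R", simplify(desc[1]), simplify(desc[2]))
    return desc                      # Z, S and P are already minimal


# Two arbitrary binary descriptions playing the roles of D and F.
D = R(P(1, 1), C(S(), [P(3, 3)]))
F = R(R(Z(), P(2, 2)), C(D, [P(3, 1), P(3, 3)]))

left = simplify(C(P(2, 1), [P(2, 1), D]))
right = simplify(C(P(2, 1), [P(2, 1), F]))
```

Both descriptions reduce to P(2,1), so an enumeration that filters out every description changed by simplify would skip them without evaluating D or F.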
Bibliography
[1] J. Avigad. Notes on Recursive Functions. Unpublished. Revised and expanded by Zach, R.
[2] J. L. Bell and M. Machover. A Course in Mathematical Logic. Elsevier, 1977.
[3] E. Bilsland, L. Van Vliet, K. Williams, J. Feltham, M. P. Carrasco, W. L. Fotoran, E. F. Cubillos,
G. Wunderlich, M. Grøtli, F. Hollfelder, et al. Plasmodium dihydrofolate reductase is a second
enzyme target for the antimalarial action of triclosan. Scientific reports, 8(1):1038, 2018.
[4] L. Blum and M. Blum. Toward a mathematical theory of inductive inference. Information and Control,
28:125–155, 1975.
[5] J. Case. Infinitary self-reference in learning theory. Journal of Experimental & Theoretical Artificial
Intelligence, 6(1):3–16, 1994.
[6] J. Case. Algorithmic scientific inference. International Journal of Unconventional Computing, 8(3),
2012.
[7] J. Case and C. Smith. Anomaly hierarchies of mechanized inductive inference. In R. J. Lipton,
W. A. Burkhard, W. J. Savitch, E. P. Friedman, and A. V. Aho, editors, Proceedings of the 10th
Annual ACM Symposium on Theory of Computing, May 1-3, 1978, pages 314–319. ACM, San
Diego, California, USA, 1978.
[8] J. Case and C. Smith. Comparison of identification criteria for machine inductive inference.
Theoretical Computer Science, 25(2):193–220, 1983.
[9] J. F. Costa. Unity of science as seen through the universal computer. IJUC, 13(1):59–81, 2017.
[10] J. F. Costa. On Discovering Scientific Laws. IJUC, 14(3–4):285–318, 2019.
[11] J. F. Costa and P. Gouveia. Computabilidade, Inferência Indutiva, Complexidade. Draft of a book
to be submitted.
[12] N. Cutland. Computability: An introduction to recursive function theory. Cambridge university press,
1980.
[13] W. Ewert. (https://.stackexchange.com/users/1343/winston-ewert). Enumerating
the primitive recursive functions. Software Engineering Stack Exchange. URL:
https://softwareengineering.stackexchange.com/a/310061 (version: 2016-02-13).
[14] M. Gladstone. A reduction of the recursion scheme. The Journal of Symbolic Logic, 32(4):505–508,
1968.
[15] M. Gladstone. Simplifications of the recursion scheme. The Journal of Symbolic Logic, 36(4):653–
665, 1971.
[16] E. M. Gold. Language identification in the limit. Information and control, 10(5):447–474, 1967.
[17] K. Gurney. An Introduction to Neural Networks. Taylor & Francis, Inc., Bristol, PA, USA, 1997.
[18] W. G. Handley and S. S. Wainer. Complexity of primitive recursion. In U. Berger and
H. Schwichtenberg, editors, Computational Logic, pages 273–300, Berlin, Heidelberg, 1999.
Springer Berlin Heidelberg.
[19] S. Jain, D. N. Osherson, J. S. Royer, and A. Sharma. Systems That Learn. An Introduction to
Learning Theory. The MIT Press, second edition, 1999.
[20] S. Kahrs. The primitive recursive functions are recursively enumerable. University of Kent at
Canterbury, Department of Computer Science X, 200, 01 2008.
[21] K. T. Kelly. The Logic of Reliable Inquiry. OUP USA, 1996.
[22] J. G. Kemeny. A Philosopher Looks at Science. Van Nostrand, 1959.
[23] R. D. King, J. Rowland, S. G. Oliver, M. Young, W. Aubrey, E. Byrne, M. Liakata, M. Markham, P. Pir,
L. N. Soldatova, et al. The automation of science. Science, 324(5923):85–89, 2009.
[24] H. Kitano. Artificial intelligence to win the nobel prize and beyond: Creating the engine for scientific
discovery. AI magazine, 37(1):39–49, 2016.
[25] P. Langley, H. A. Simon, G. L. Bradshaw, and J. M. Zytkow. Scientific Discovery: Computational
Explorations of the Creative Process. MIT Press, Cambridge, MA, USA, 1987.
[26] S. Liu. An enumeration of the primitive recursive functions without repetition. Tohoku Math. J. (2),
12(3):400–402, 1960.
[27] M. Lobão. Identifying Empirical Laws. Master’s thesis, Instituto Superior Técnico, 2016.
[28] E. Martin and D. N. Osherson. Elements of Scientific Inquiry. MIT Press, 1998.
[29] G. Matos. An Exhaustive Algorithm for Minimum State Automaton Identification. Graduation Project,
Instituto Superior Técnico, 2019.
[30] A. R. Meyer and D. M. Ritchie. The complexity of loop programs. In Proceedings of the 1967 22nd
national conference, pages 465–469. ACM, 1967.
[31] P. G. Odifreddi. Classical Recursion Theory: Volume II, volume 143 of Studies in Logic and The
Foundations of Mathematics. Elsevier Science B.V., 1999.
[32] D. N. Osherson, M. Stob, and S. Weinstein. Systems That Learn: An Introduction to Learning
Theory for Cognitive and Computer Scientists. The MIT Press, 2nd edition, 1986.
[33] R. Reis. Autómatos finitos: manipulação, geração e contagem. PhD thesis, Faculdade de Ciências
da Universidade do Porto, 2007.
[34] H. E. Rose. Subrecursion: Functions and Hierarchies. Clarendon Press, Oxford, 1984.
[35] A. Sernadas, M. C. S. Sernadas, and J. Ramos. Computability and Complexity: A Mathematical
Primer. College Publications, 2018.
[36] M. C. S. Sernadas. Introdução à Teoria da Computação. Editorial Presença, 1993.
[37] A. Sparkes, W. Aubrey, E. Byrne, A. Clare, M. Khan, M. Liakata, M. Magdalena, J. Rowland,
L. Soldatova, K. Whelan, M. Young, and R. King. Toward robot scientists for autonomous scientific
discovery. volume 2, 01 2010.
[38] M. P. Szudzik. The computable universe hypothesis. In A Computable Universe: Understanding
and Exploring Nature as Computation, pages 479–523. World Scientific, 2013.
Appendix A
Functions tested and summarized results
A.1 List of functions tested with the scientists
• id(x) = x
• s(x) = x + 1
• pred(x) = x .− 1
• zero(x) = 0
• sx(x, y) = x + 1
• add(x, y) = x + y
• sub(x, y) = x .− y
• prod(x, y) = x × y
• f(x) = 2x
• f(x, y) = (x + y) .− 1
• f(x, y) = (x + y)x
• dist(x, y) = |x − y|
• exp(x, y) = x^y
• sq(x) = x^2
• fact(x) = x!
• max(x, y) = (x .− y) + y
• min(x, y) = x .− (x .− y)
• sg(x) = x .− (x .− 1)
• \overline{sg}(x) = 1 .− sg(x)
• half(x) = ⌊x/2⌋
• Ohm’s Law: I = V/R
• Kepler’s Law: k = D^3/P^2
• Gravitational Law: F = G m_1 m_2 / d^2
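Several of the functions listed above are defined through truncated subtraction (written .− in the list). As a quick sanity check of those definitions, the following Python sketch implements them directly over the natural numbers; the Python names (monus, maximum, minimum, cosg for the complement of sg) are our own and do not come from the thesis code.

```python
def monus(x, y):
    """Truncated subtraction on the naturals: x - y if x >= y, otherwise 0."""
    return x - y if x >= y else 0


def pred(x):
    return monus(x, 1)


def maximum(x, y):
    return monus(x, y) + y


def minimum(x, y):
    return monus(x, monus(x, y))


def sg(x):
    """Sign function: 0 at 0 and 1 everywhere else."""
    return monus(x, monus(x, 1))


def cosg(x):
    """Complement of the sign function: 1 at 0 and 0 everywhere else."""
    return monus(1, sg(x))


def dist(x, y):
    return abs(x - y)


def half(x):
    return x // 2


# Values of sg and its complement on the first few naturals.
sg_values = [sg(n) for n in range(4)]
cosg_values = [cosg(n) for n in range(4)]
```

On the naturals these definitions agree with the usual maximum, minimum and sign functions, which is why a single algebraic expression suffices for each of them.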
A.2 Summarized results
We present three tables, one per scientist, with the results of the tests performed.
Function | inp List | out List | Description | Adequate Description | Time (sec)
id(x) = x | [(1,)] | [1] | P(1,1) | Yes | 0.003996
s(x) = x + 1 | [(1,)] | [2] | S() | Yes | 0.000996
pred(x) = x .− 1 | [(1,)] | [0] | R(Z(),P(2,1)) | Yes | 0.004997
zero(x) = 0 | [(2,)] | [0] | R(Z(),P(2,2)) | Yes | 0.003999
sx(x,y) = x + 1 | [(2,4)] | [3] | C(S(),[P(2,1)]) | Yes | 0.001956
add(x,y) = x + y | [(2,3)] | [5] | C(S(),[C(S(),[P(2,2)])]) | No | 0.004942
add(x,y) = x + y | [(2,3),(1,5)] | [5,6] | R(P(1,1),C(S(),[P(3,3)])) | Yes | 0.012939
add(x,y) = x + y | [(1,4),(2,1),(3,2),(0,6)] | [5,3,5,6] | R(P(1,1),C(S(),[P(3,3)])) | Yes | 0.015591
add(x,y) = x + y | [(13,24),(35,41),(133,256),(420,513)] | [37,76,389,933] | R(P(1,1),C(S(),[P(3,3)])) | Yes | 0.067953
sub(x,y) = x .− y | [(5,2)] | [3] | C(S(),[P(2,2)]) | No | 0.001001
sub(x,y) = x .− y | [(5,2),(2,0)] | [3,2] | R(P(1,1),C(S(),[C(S(),[P(3,2)])])) | No | 0.441712
sub(x,y) = x .− y | [(5,2),(2,0),(4,1)] | [3,2,3] | R(P(1,1),C(R(Z(),P(2,1)),[P(3,3)])) | Yes | 0.432695
sub(x,y) = x .− y | [(5,2),(4,1)] | [3,3] | R(P(1,1),C(R(Z(),P(2,1)),[P(3,3)])) | Yes | 0.495662
sub(x,y) = x .− y | [(34,12),(25,40),(151,72),(627,728)] | [22,0,79,0] | R(P(1,1),C(R(Z(),P(2,1)),[P(3,3)])) | Yes | 4.57001
prod(x,y) = x × y | [(2,3)] | [6] | R(S(),C(S(),[P(3,3)])) | No | 0.014799
prod(x,y) = x × y | [(2,3),(5,2)] | [6,10] | R(R(Z(),P(2,2)),C(R(P(1,1),C(S(),[P(3,3)])),[P(3,1),P(3,3)])) | Yes | 8495.14
f(x) = 2x | [(0,)] | [0] | P(1,1) | No | 0.007997
f(x) = 2x | [(0,),(1,)] | [0,2] | R(Z(),C(S(),[C(S(),[P(2,1)])])) | No | 0.612423
f(x) = 2x | [(0,),(1,),(2,)] | [0,2,4] | R(Z(),C(S(),[C(S(),[P(2,2)])])) | Yes | 0.658629
f(x) = 2x | [(3,)] | [6] | C(S(),[C(S(),[S()])]) | No | 0.014991
f(x) = 2x | [(6,)] | [12] | R(Z(),C(S(),[C(S(),[P(2,2)])])) | Yes | 0.730134
f(x,y) = (x + y) .− 1 | [(1,2)] | [2] | P(2,2) | No | 0.001000
f(x,y) = (x + y) .− 1 | [(1,2),(2,1)] | [2,2] | R(R(Z(),P(2,1)),C(S(),[P(3,3)])) | Yes | 0.595186
f(x,y) = (x + y) .− 1 | [(19,8)] | [26] | R(R(Z(),P(2,1)),C(S(),[P(3,3)])) | Yes | 0.825140
f(x,y) = (x + y) .− 1 | [(2,0),(5,4),(2,2),(1,3)] | [1,8,3,3] | R(R(Z(),P(2,1)),C(S(),[P(3,3)])) | Yes | 0.61178
f(x,y) = (x + y) .− 1 | [(25,31),(48,57),(237,192),(540,371)] | [55,104,428,910] | R(R(Z(),P(2,1)),C(S(),[P(3,3)])) | Yes | 2.78675

Table A.1: Summary of the experiments made by the scientist related to the first algorithm
Function | inp List | out List | Description Found | Adequate | Time (sec)
add(x,y) = x + y | [(2,3)] | [5] | C(C(S(),[S()]),[P(2,2)]) | No | 0.003998
add(x,y) = x + y | [(2,3),(1,5)] | [5,6] | R(P(1,1),C(S(),[P(3,3)])) | Yes | 0.002997
sub(x,y) = x .− y | [(5,2)] | [3] | C(S(),[P(2,2)]) | No | 0.005182
sub(x,y) = x .− y | [(5,2),(2,0)] | [3,2] | R(P(1,1),R(P(2,1),P(4,3))) | No | 0.031130
sub(x,y) = x .− y | [(5,2),(2,0),(4,7)] | [3,2,0] | R(P(1,1),C(R(Z(),P(2,1)),[P(3,3)])) | Yes | 0.041919
sub(x,y) = x .− y | [(5,2),(4,7)] | [3,0] | R(P(1,1),C(R(Z(),P(2,1)),[P(3,3)])) | Yes | 0.042063
sub(x,y) = x .− y | [(2,1),(3,6),(4,0),(5,2)] | [1,0,4,3] | R(P(1,1),C(R(Z(),P(2,1)),[P(3,3)])) | Yes | 0.031919
sub(x,y) = x .− y | [(20,10),(15,7),(34,57),(60,61)] | [10,8,0,0] | R(P(1,1),C(R(Z(),P(2,1)),[P(3,3)])) | Yes | 0.578816
prod(x,y) = x × y | [(2,3)] | [6] | R(P(1,1),C(R(Z(),P(2,1)),[P(3,3)])) | No | 0.005001
prod(x,y) = x × y | [(2,3),(5,2)] | [6,10] | R(R(Z(),P(2,2)),R(P(2,1),C(S(),[P(4,4)]))) | Yes | 1.12452
prod(x,y) = x × y | [(5,0),(2,3),(4,3),(6,3)] | [0,6,12,18] | R(R(Z(),P(2,2)),R(P(2,1),C(S(),[P(4,4)]))) | Yes | 3.42313
prod(x,y) = x × y | [(7,12),(30,14),(126,73),(256,421)] | [84,420,9198,107776] | stuck in R(S(),R(P(2,1),R(P(3,1),C(S(),[P(5,5)])))) | - | -
f(x) = 2x | [(2,)] | [4] | C(S(),[S()]) | No | 0.011992
f(x) = 2x | [(2,),(3,)] | [4,6] | R(Z(),C(S(),[C(S(),[P(2,2)])])) | Yes | 0.030945
f(x) = 2x | [(5,)] | [10] | R(Z(),C(S(),[C(S(),[P(2,2)])])) | Yes | 0.040974
f(x,y) = (x + y) .− 1 | [(1,2)] | [2] | P(2,2) | No | 0.003001
f(x,y) = (x + y) .− 1 | [(1,2),(2,1)] | [2,2] | C(S(),[R(P(1,1),R(P(2,1),P(4,3)))]) | No | 0.024993
f(x,y) = (x + y) .− 1 | [(1,2),(2,1),(2,4)] | [2,2,5] | R(R(Z(),P(2,1)),C(S(),[P(3,3)])) | Yes | 0.071959
f(x,y) = (x + y) .− 1 | [(13,6)] | [18] | R(R(Z(),P(2,1)),C(S(),[P(3,3)])) | Yes | 0.141918
f(x,y) = (x + y) .− 1 | [(2,6),(3,0),(4,2),(1,5)] | [7,2,5,5] | R(R(Z(),P(2,1)),C(S(),[P(3,3)])) | Yes | 0.215976
f(x,y) = (x + y) .− 1 | [(16,24),(73,51),(127,245),(318,182)] | [39,123,371,499] | R(R(Z(),P(2,1)),C(S(),[P(3,3)])) | Yes | 2.17263
f(x,y) = (x + y)x | [(3,1)] | [12] | R(S(),R(P(2,2),R(P(3,1),C(S(),[P(5,5)])))) | No | 1.26955
f(x,y) = (x + y)x | [(3,1),(2,4)] | [12,12] | stuck in C(S(),[R(S(),R(P(2,2),R(P(3,3),C(C(S(),[S()]),[P(5,5)]))))]) | - | -
f(x,y) = (x + y)x | [(0,1),(1,0),(1,2),(2,1),(0,2),(2,0),(2,2)] | [0,1,3,6,0,4,8] | stuck in R(P(1,1),R(P(2,1),R(P(3,1),R(P(4,1),R(P(5,1),C(S(),[P(7,7)])))))) | - | -
f(x,y) = (x + y)x | [(0,1),(0,2),(1,0),(1,1),(1,2),(2,0),(2,1),(2,2)] | [0,0,1,2,3,4,6,8] | - | - | ~24 hours
dist(x,y) = |x − y| | [(4,3)] | [1] | R(P(1,1),R(P(2,1),P(4,3))) | No | 0.003997
dist(x,y) = |x − y| | [(4,3),(2,6)] | [1,4] | C(R(Z(),P(2,1)),[R(S(),P(3,2))]) | No | 0.021986
dist(x,y) = |x − y| | [(4,3),(2,6),(1,3)] | [1,4,2] | C(S(),[R(S(),R(R(P(1,1),P(3,2)),R(P(3,3),P(5,4))))]) | No | 17.2491
dist(x,y) = |x − y| | [(4,3),(2,6),(1,3),(3,7)] | [1,4,2,4] | C(S(),[R(S(),R(R(P(1,1),P(3,2)),R(P(3,3),P(5,4))))]) | No | 17.8328
dist(x,y) = |x − y| | [(4,3),(2,6),(1,3),(3,7),(10,2)] | [1,4,2,4,8] | stuck in R(S(),R(P(2,1),R(P(3,1),C(C(S(),[S()]),[P(5,5)])))) | - | -
dist(x,y) = |x − y| | [(0,1),(0,2),(1,0),(1,1),(1,2),(2,0),(2,1),(2,2)] | [1,2,1,0,1,2,1,0] | C(R(S(),P(3,1)),[R(P(1,1),R(P(2,2),P(4,3))),P(2,1)]) | No | 2.81838
dist(x,y) = |x − y| | [(0,1),(0,2),(1,0),(1,1),(1,2),(2,0),(2,1),(2,2),(2,6)] | [1,2,1,0,1,2,1,0,4] | R(P(1,1),C(R(S(),R(P(2,2),P(4,3))),[P(3,2),P(3,1)])) | No | 7.88448
dist(x,y) = |x − y| | [(0,1),(0,2),(1,0),(1,1),(1,2),(2,0),(2,1),(2,2),(2,6),(3,4)] | [1,2,1,0,1,2,1,0,4,1] | R(P(1,1),C(R(S(),R(P(2,2),P(4,3))),[P(3,2),P(3,1)])) | No | 30.5075
dist(x,y) = |x − y| | [(0,1),(0,2),(1,0),(1,1),(1,2),(2,0),(2,1),(2,2),(2,6),(3,4),(6,2)] | [1,2,1,0,1,2,1,0,4,1,4] | - | - | ~24 hours
exp(x,y) = x^y | [(2,3)] | [8] | R(P(1,1),C(C(S(),[S()]),[P(3,3)])) | No | 7.88448
exp(x,y) = x^y | [(1,3),(2,3)] | [1,8] | stuck in R(S(),R(P(2,1),R(P(3,1),C(R(Z(),P(2,2)),[P(5,4)])))) | - | -
exp(x,y) = x^y | [(0,1),(0,2),(1,0),(1,1),(1,2),(2,0),(2,1),(2,2)] | [0,0,1,1,1,1,2,4] | C(R(Z(),R(S(),P(3,1))),[R(P(1,1),R(P(2,1),R(P(3,3),C(S(),[P(5,5)]))))]) | No | 13770.8
exp(x,y) = x^y | [(0,1),(0,2),(1,0),(1,1),(1,2),(2,0),(2,1),(2,2),(4,1)] | [0,0,1,1,1,1,2,4,4] | - | - | ~24 hours

Table A.2: Summary of the experiments made by the scientist related to the second algorithm
Function | inp List | out List | Description Found | Adequate | Time (sec)
pred(x) = x .− 1 | [(0,)] | [0] | EP(1,1) | No | 0.031247
pred(x) = x .− 1 | [(0,),(1,)] | [0,0] | EBS(EP(1,1)) | No | 0.015645
pred(x) = x .− 1 | [(0,),(1,),(2,)] | [0,0,1] | EBS(EP(1,1)) | No | 0.015626
pred(x) = x .− 1 | [(0,),(1,),(2,),(3,)] | [0,0,1,2] | EBS(EBS(EBP(EP(1,1)))) | Yes | 0.015631
pred(x) = x .− 1 | [(1,),(3,),(5,),(15,)] | [0,2,4,14] | EBS(EBS(EBP(EP(1,1)))) | Yes | 0.011996
pred(x) = x .− 1 | [(1,),(3,),(5,),(15,),(260,)] | [0,2,4,14,259] | EBS(EBS(EBP(EP(1,1)))) | Yes | 2.74342
zero(x) = 0 | [(3,)] | [0] | EBP(EP(1,1)) | No | 0.062500
zero(x) = 0 | [(0,),(3,)] | [0,0] | EC(EBP(EP(1,1)),[ES()]) | Yes | 0.031248
f(x) = 2x | [(1,)] | [2] | ES() | No | 0.0
f(x) = 2x | [(1,),(2,)] | [2,4] | EC(ES(),[EBS(ES())]) | No | 0.0
f(x) = 2x | [(1,),(2,),(3,)] | [2,4,6] | EC(EA(),[EP(1,1),EP(1,1)]) | Yes | 0.015624
f(x,y) = (x + y) .− 1 | [(2,3)] | [4] | EC(ES(),[EP(2,2)]) | No | 0.0
f(x,y) = (x + y) .− 1 | [(2,3),(1,4)] | [4,4] | EBS(EBP(EBP(EP(2,1)))) | No | 0.0
f(x,y) = (x + y) .− 1 | [(2,3),(1,4),(3,1)] | [4,4,3] | EC(EBS(EBS(EBP(EP(1,1)))),[EA()]) | Yes | 0.093745
f(x,y) = (x + y)x | [(1,3)] | [4] | EA() | No | 0.0
f(x,y) = (x + y)x | [(1,3),(2,5)] | [4,14] | EC(ET(),[EA(),EP(2,2)]) | Yes | 0.031219
f(x,y) = (x + y)x | [(3,1),(4,1),(0,6),(2,4)] | [12,20,0,12] | EC(ET(),[EA(),EP(2,2)]) | Yes | 0.007997
f(x,y) = (x + y)x | [(2,4),(10,2),(3,15),(20,98),(50,120)] | [12,120,54,2360,8500] | EC(ET(),[EA(),EP(2,2)]) | Yes | 0.016994
sq(x) = x^2 | [(2,)] | [4] | EC(ES(),[ES()]) | No | 0.031249
sq(x) = x^2 | [(2,),(3,)] | [4,9] | EC(ET(),[EP(1,1),EP(1,1)]) | Yes | 0.046871
exp(x,y) = x^y | [(2,3)] | [8] | EBP(EP(2,1)) | Yes | 0.005502
exp(x,y) = x^y | [(3,5),(9,3),(15,2),(20,6)] | [243,729,225,64000000] | EBP(EP(2,1)) | Yes | 0.006994
fact(x) = x! | [(2,)] | [2] | EP(1,1) | No | 0.0
fact(x) = x! | [(2,),(3,)] | [2,6] | EBP(ES()) | Yes | 0.015625
max(x,y) | [(3,4)] | [4] | EP(2,2) | No | 0.0
max(x,y) | [(3,4),(5,2)] | [4,5] | EC(EA(),[EM(),EP(2,2)]) | Yes | 0.015625
max(x,y) | [(15,20),(136,59),(420,767),(520,10)] | [20,136,767,520] | EC(EA(),[EM(),EP(2,2)]) | Yes | 0.021094
min(x,y) | [(3,4)] | [3] | EP(2,1) | No | 0.004002
min(x,y) | [(3,4),(5,2)] | [3,2] | EC(EM(),[EP(2,1),EM()]) | Yes | 0.007996
sg(x) | [(0,)] | [0] | EP(1,1) | No | 0.030000
sg(x) | [(0,),(2,)] | [0,1] | EBS(EP(1,1)) | No | 0.0
sg(x) | [(0,),(2,),(3,)] | [0,1,1] | EBS(EBP(EP(1,1))) | Yes | 0.0
sg(x) | [(0,),(2,),(3,),(5,),(500,)] | [0,1,1,1,1] | EBS(EBP(EP(1,1))) | Yes | 0.235158
sg(x) | [(0,),(2,),(3,),(5,),(5000,)] | [0,1,1,1,1] | EBS(EBP(EP(1,1))) | Yes | 62.4937
\overline{sg}(x) | [(0,)] | [1] | ES() | No | 0.000999
\overline{sg}(x) | [(0,),(1,)] | [1,0] | EBP(EP(1,1)) | Yes | 0.000997
dist(x,y) = |x − y| | [(3,2)] | [1] | EM() | No | 0.003982
dist(x,y) = |x − y| | [(3,2),(1,6)] | [1,5] | EBS(EBS(EBP(ET()))) | No | 0.015627
dist(x,y) = |x − y| | [(3,2),(1,6),(2,5)] | [1,5,3] | stuck in EC(EBS(EBP(EBS(ES()))),[EBP(EA())]) | No | -
dist(x,y) = |x − y| | [(3,2),(1,6),(2,1)] | [1,5,1] | stuck in EC(EBS(EBP(EBS(ES()))),[EBP(EA())]) | No | -
dist(x,y) = |x − y| | [(3,2),(2,1),(1,6)] | [1,1,5] | stuck in EC(EBS(EBP(EBS(ES()))),[EBP(EA())]) | No | -
dist(x,y) = |x − y| | [(0,1),(0,2),(1,0),(1,1),(1,2),(2,0),(2,1),(2,2)] | [1,2,1,0,1,2,1,0] | EC(EA(),[EM(),EC(EM(),[EP(2,2),EP(2,1)])]) | Yes | 3.51511
half(x) = ⌊x/2⌋ | [(1,)] | [0] | EBS(EP(1,1)) | No | 0.005000
half(x) = ⌊x/2⌋ | [(1,),(3,)] | [0,1] | EBS(EBS(EP(1,1))) | No | 0.005836
half(x) = ⌊x/2⌋ | [(1,),(2,),(3,)] | [0,1,1] | EC(ED(),[ES(),EBS(ES())]) | Yes | 0.009613
Ohm’s Law | [(1,),(3,),(4,),(6,)] | [3,9,12,18] | EC(EC(EA(),[EA(),EP(2,1)]),[EP(1,1),EP(1,1)]) | Yes | 0.078901
Ohm’s Law | [(3,),(9,),(12,),(18,)] | [1,3,4,6] | stuck in EC(EBS(EBP(EP(1,1))),[EBP(ES())]) | - | -
Ohm’s Law | [(3,3),(9,3),(12,3),(18,3)] | [1,3,4,6] | stuck in EC(ED(),[EP(2,2),EP(2,1)]) | Yes | 0.012039
Kepler’s Law | [(1,),(4,),(9,)] | [1,8,27] | stuck in EC(EC(EBP(ES()),[EBP(ES())]),[EBS(ES())]) | - | -
Gravitational Law | [(1,),(2,)] | [12,3] | EC(EC(EBS(EA()),[EBS(ES()),ES()]),[EC(ED(),[EP(1,1),ES()])]) | No | 473.198
Gravitational Law | [(1,),(2,),(3,)] | [12,3,1] | EC(EBP(EBP(EA())),[ES(),EC(ED(),[EP(1,1),EC(ES(),[ES()])])]) | No | 1827.86
Gravitational Law | [(1,),(2,),(3,),(4,)] | [12,3,1,0] | stuck in EC(EC(EBS(ED()),[EP(1,1),EBP(EBP(ES()))]),[EC(ET(),[ES(),ES()])]) | - | -

Table A.3: Summary of the experiments made by the scientist related to the third algorithm
Appendix B
Implementation of the Algorithms
We now explain the implementation of the algorithms related to the developed scientists and the
respective Python files in which that implementation is made, available at
https://github.com/brunomcpatricio/AutomatedScientists.
In folder PRF, the symbols Z, S, P, C and R (i.e. the symbols used in
Definition 3.2.1 to define the inductive construction of a description for the primitive recursive functions)
are implemented as Python classes in file classesprf.py. This file is used to implement the enumeration
procedure described in Algorithm 7, in file myenum1.py, which is in turn used by the search
procedure of Algorithm 6, implemented in file search1.py. The latter file also implements
the procedure that writes the code of a Python program computing the function described by a
given description and returns it in a .txt file. The file classesprf.py is also needed for
the second enumeration procedure for primitive recursive functions (Algorithm 9), implemented in file
myenum2.py, which is in turn needed for the search procedure of Algorithm 8, implemented in
file search2.py. This file also implements the procedure that writes the code of a
Python program from a given description and returns it in a .txt file.
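To give an idea of how descriptions built from these symbols can be evaluated, we include a minimal, self-contained sketch. It is not the content of classesprf.py: the class bodies below are our own assumptions, and the recursion convention (recursion on the last argument, with the previous value passed as the last argument of the step function) is inferred from the descriptions reported in Table A.1.

```python
class Z:
    """Zero: returns 0 for whatever arguments it receives (arity left open)."""
    def __call__(self, *args):
        return 0


class S:
    """Successor."""
    def __call__(self, x):
        return x + 1


class P:
    """Projection P(n, i): the i-th of n arguments (1-indexed)."""
    def __init__(self, n, i):
        self.n, self.i = n, i

    def __call__(self, *args):
        assert len(args) == self.n, "wrong number of arguments"
        return args[self.i - 1]


class C:
    """Composition C(f, [g1, ..., gk]): f applied to the results of the gs."""
    def __init__(self, f, gs):
        self.f, self.gs = f, list(gs)

    def __call__(self, *args):
        return self.f(*(g(*args) for g in self.gs))


class R:
    """Primitive recursion on the last argument (assumed convention):
    f(xs, 0) = g(xs) and f(xs, n + 1) = h(xs, n, f(xs, n))."""
    def __init__(self, g, h):
        self.g, self.h = g, h

    def __call__(self, *args):
        *xs, n = args
        value = self.g(*xs)
        for k in range(n):          # one for-loop per use of recursion
            value = self.h(*xs, k, value)
        return value


# Descriptions reported as adequate in Table A.1.
pred = R(Z(), P(2, 1))
add = R(P(1, 1), C(S(), [P(3, 3)]))
sub = R(P(1, 1), C(pred, [P(3, 3)]))
prod = R(R(Z(), P(2, 2)), C(add, [P(3, 1), P(3, 3)]))
double = R(Z(), C(S(), [C(S(), [P(2, 2)])]))
```

Under these assumptions the descriptions reproduce the input and output pairs of Table A.1, for example add on (2,3) and (1,5), or sub on (5,2) and (4,1).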
Folder El is reserved for the files needed to implement the scientist for the set of elementary
functions. The classes for the symbols that make up the inductive rules used in
Definition 4.2.1 to construct a description for the elementary functions are implemented in file classesel.py.
The enumeration of the set of functions E described in Algorithm 13 is implemented in file myenumel.py,
while the search procedure of Algorithm 12 is implemented in searchel.py, which also implements
the procedure that receives a description, writes the code of a Python program that
computes the function it describes and returns it in a .txt file.
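A similar sketch can be written for part of the elementary basis. Again, this is not the content of classesel.py: the meaning given below to each symbol (ES successor, EA addition, ET product, EM truncated subtraction, EP projection, EC composition, EBS and EBP bounded sum and bounded product over the last argument) is an assumption inferred from the adequate descriptions in Table A.3, and the quotient symbol ED is left out of the sketch.

```python
import math


def ES():
    return lambda x: x + 1


def EA():
    return lambda x, y: x + y


def ET():
    return lambda x, y: x * y


def EM():
    return lambda x, y: max(x - y, 0)      # truncated subtraction


def EP(n, i):
    return lambda *args: args[i - 1]       # i-th of n arguments (1-indexed)


def EC(f, gs):
    return lambda *args: f(*(g(*args) for g in gs))


def EBS(f):
    """Bounded sum (assumed form): sum of f(xs, i) for i < y."""
    def bounded_sum(*args):
        *xs, y = args
        return sum(f(*xs, i) for i in range(y))
    return bounded_sum


def EBP(f):
    """Bounded product (assumed form): product of f(xs, i) for i < y."""
    def bounded_product(*args):
        *xs, y = args
        return math.prod(f(*xs, i) for i in range(y))
    return bounded_product


# Descriptions reported as adequate in Table A.3.
fact = EBP(ES())
power = EBP(EP(2, 1))
cosg = EBP(EP(1, 1))
sg = EBS(EBP(EP(1, 1)))
pred = EBS(EBS(EBP(EP(1, 1))))
double = EC(EA(), [EP(1, 1), EP(1, 1)])
square = EC(ET(), [EP(1, 1), EP(1, 1)])
maximum = EC(EA(), [EM(), EP(2, 2)])
minimum = EC(EM(), [EP(2, 1), EM()])
```

With these assumed semantics each description above agrees with the corresponding rows of Table A.3, e.g. power on (2,3) and fact on 2 and 3. Note that evaluating nested bounded sums and products requires nested loops over the argument, so the cost of a check grows with the magnitude of the values provided.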
Lastly, there is also a README.txt file that explains how to work with the software.