arXiv:1411.0440v8 [cs.AI] 19 Apr 2020

Noname manuscript No. (will be inserted by the editor)

Modelling Serendipity in a Computational Context

Joseph Corneli · Anna Jordanous · Christian Guckelsberger · Alison Pease · Simon Colton

Received: date / Accepted: date

Abstract The term serendipity describes a creative process that develops, in context, with the active participation of a creative agent, but not entirely within that agent’s control. While a system cannot be made to perform serendipitously on demand, we argue that its serendipity potential can be increased by means of a suitable system architecture and other design choices. We distil a unified description of serendipitous occurrences from historical theorisations of serendipity and creativity. This takes the form of a framework with six phases: perception, attention, interest, explanation, bridge, and valuation. We then use this framework to organise a survey of literature in cognitive science, philosophy, and computing, which yields practical definitions of

This research was supported by the Engineering and Physical Sciences Research Council through grants EP/L00206X, EP/J004049, EP/L015846/1, and EP/K040251/2, with additional workshop support through EP/N014758/1; by the Future and Emerging Technologies (FET) programme within the Seventh Framework Programme for Research of the European Commission, under FET-Open Grant numbers: 611553 (COINVENT) and 611560 (WHIM); and by workshop and symposium support from the Society for the Study of Artificial Intelligence and Simulation of Behaviour (AISB).

J. Corneli
Hyperreal Enterprises, Ltd. E-mail: [email protected]
ORCID: 0000-0003-1330-4698

A. Jordanous
School of Computing, University of Kent, Chatham Maritime ME4 4AG, UK. E-mail: [email protected] (Corresponding author)
ORCID: 0000-0003-2076-8642

C. Guckelsberger · S. Colton
School of Electronic Engineering and Computer Science, Queen Mary University of London, Mile End Road, London E1 4NS, UK.
E-mail: [email protected]; [email protected]
ORCID: 0000-0003-1977-1887 (CG); 0000-0003-3377-1680 (SC)

A. Pease
School of Science & Engineering, University of Dundee, Dundee DD1 4HN, UK. E-mail: [email protected]
ORCID: 0000-0003-1856-9599

http://arxiv.org/abs/1411.0440v8



the six phases, along with heuristics for implementation. We use the resulting model to evaluate the serendipity potential of four existing systems developed by others, and two systems previously developed by two of the authors. Most existing research that considers serendipity in a computing context deals with serendipity as a service; here we relate theories of serendipity to the development of autonomous systems and computational creativity practice. We argue that serendipity is not teleologically blind, and outline representative directions for future applications of our model. We conclude that it is feasible to equip computational systems with the potential for serendipity, and that this could be beneficial in varied computational creativity/AI applications, particularly those designed to operate responsively in real-world contexts.

Keywords Serendipity · Discovery Systems · Automated Programming · Recommender Systems · Computational Creativity · Autonomous Systems

    1 Introduction

Serendipity has played a role in many human discoveries: often-cited examples range from vulcanized rubber, the Velcro™ strip, and 3M’s ubiquitous Post-it® Notes, through to penicillin, LSD, and Viagra®. An improved understanding of serendipity could help bring about (computationally) creative breakthroughs in these areas.

Given its crucial role in human discovery and invention, it is not surprising that the concept of serendipity has been adopted for users’ benefit by many research areas such as computational creativity (Pease et al., 2013), information retrieval (Toms, 2000; André et al., 2009b), recommender systems (Kotkov et al., 2016; Zhang et al., 2011), creativity support tools (Maxwell et al., 2012) and planning (Muscettola et al., 1997; Chakraborti et al., 2015). Crucially, all of these examples use the concept of serendipity to denote and design systems which stimulate the experience of serendipity in their users: what we term serendipity as a service. Here, we propose to switch perspectives from “serendipity as a service” to “serendipity in the system,” where artificial systems can catalyse, evaluate and leverage serendipitous occurrences themselves.

This perspective shift requires a more nuanced understanding of serendipity: for example, consider a reversal of roles in which a person contributes to a system’s experience of serendipity, in some suitable sense. Here our central goal is to theorise, and indicate in broad terms how to engineer, systems which do not depend on such support by people, but which have the capacity to detect, evaluate and use serendipitous events without user intervention. Why might such features be useful? de la Maza (1994) raised the point: “How disastrous it would be if a discovery system’s greatest discovery was ‘not noticed’ because a human did not have the ability to recognise it!”

Contrary to de la Maza’s hopes, van Andel has suggested that an artificial system could never be independent of a person in leveraging serendipity.

“Like all intuitive operating, pure serendipity is not amenable to generation by a computer. The very moment I can plan or programme ‘serendipity’ it cannot be called serendipity anymore. All I can programme is, that, if the unforeseen happens, the system alerts the user and incites him to observe and act by himself by trying to make a correct abduction of the surprising fact or relation.” (van Andel, 1994, p. 646)

We fully agree that an artificial system cannot be guaranteed to engage in serendipitous findings, just as a person cannot deliberately force serendipity to happen “on demand.” However, we argue that serendipity can happen independently of human intervention within an artificial system, and that the “serendipity potential” of such a system can be increased by means of a suitable system architecture. In a comparable human context, Louis Pasteur (known for serendipitous discoveries in chemistry and biology (Roberts, 1989; Gaughan, 2010)) famously remarked: “Dans les champs de l’observation le hasard ne favorise que les esprits préparés” (“In the fields of observation chance favours only prepared minds”) (Pasteur, 1939, p. 131).[1] “Preparedness” encompasses various ways in which the serendipity potential of a system can be enhanced.

The framework that we advance was inspired by earlier work of Pease et al. (2013), who explored ways to encourage processes of discovery “in which chance plays a crucial role” within computational models of creativity. Simonton (2010) had previously drawn relationships between serendipity, creativity, and evolutionary processes. Of particular interest for his analysis were generative processes which are “independent of the environmental conditions of the occasion of their occurrence” (Campbell, 1960), including combinatorial as well as random processes—a condition understood to imply teleological “blindness.” In a creativity setting, this condition means that one cannot accurately predict the underlying “fitness” of different ideational variants (Simonton, 2010, p. 159). After introducing our model and illustrating it with examples, we argue that the blindness criterion should be relaxed in line with contemporary thinking in cognitive science.

Corneli and Jordanous (2015) took preliminary steps towards the system orientation that we will develop here, and also considered how social infrastructures might implement several of the serendipity patterns noted by van Andel (1994). We are aware of recent frameworks designed to help build systems that support the experience of serendipity in their users (Niu and Abbas, 2017; Melo and Carvalhais, 2018): that work testifies to the broader interest that modelling serendipity holds within current computing research, but is different from our present aim.

We see this work as a contribution to machine discovery, a topic that has been of interest throughout the history of AI research, was highlighted in recent computational creativity research events (ICCC’2017 panel on computational discovery), and is increasingly relevant in contemporary applications.[2] We situate our research primarily within the field of computational creativity. In practical terms, this allows us to engage at a level of abstraction above specific implementation architectures. After developing a model, we examine several historical systems that illustrate the salience of the model’s features and the viability of their integration and progressive development.

[1] Van Andel pointed out (p.c.) that Pasteur’s manuscript actually says “Dans les champs de l’observation, le hasard ne favorise que des esprits préparés” (Bourcier and Van Andel, 2011)—“In the fields of observation chance favours only some prepared minds.”

[2] Herbert Simon contended that “a large part of the research effort in the domain of ‘machine learning’ is really directed at ‘machine discovery’” (Simon, 1983, p. 29).

The current work makes concrete contributions towards the future development and rigorous analysis of creative systems with serendipity potential:

– In this introduction, we identify the bias in the existing technical literature towards supporting serendipity in the user’s experience, and propose a perspective shift from serendipity as a service to serendipity in the system. We have embraced the concept of serendipity potential in response to a classic objection to the generation of serendipity by computational means.

– In Section 2, we draw on a review of prior literature on the concept of serendipity to juxtapose existing theories and models of serendipity, in order to summarise the logical structure of serendipitous occurrences. We understand serendipity in terms of discovery, invention and creativity, and draw connections to the associated literature to create a unified framework.

– In Section 3, we synthesise a process-oriented model of systems equipped with serendipity potential which can be used to understand and qualitatively evaluate the serendipity potential of a system. We provide indicative definitions of each of six constituent phases, perception, attention, interest, explanation, bridge, and valuation, based on the existing treatment of these topics in theoretical literature. We look as well at how people have previously approached implementation of the framework’s individual components, drawing on both classic and contemporary implementations to find heuristics that can support each of the framework’s dimensions.

– In Section 4, we provide a demonstration of our model by evaluating the serendipity potential of several documented systems developed by others.

– In Section 5 we evaluate related systems developed by two of the authors, reflecting on how features of our model emerged over time.

– In Section 6 we discuss in turn: related work; potential directions for further use, development, and formalisation of the model; and the ways in which the model may inform future applications.

– In Section 7 we put forth our conclusion that equipping computational systems with serendipity potential would be widely applicable across different artificial intelligence applications. We emphasise that our focus is on open discovery, and that the model has particular relevance for future autonomous systems.

2 The structure of serendipitous occurrences: a unified framework derived from a literature review

To capture the intricate concept of serendipity in a model that is amenable to computational implementation, we first need a thorough understanding of the concept. Our objective in this section is therefore to identify the factors common to existing theories of serendipity in one unified interpretation.


We will draw on related conceptualisations of creativity, a concept that has received considerable attention in artificial intelligence research (cf. Boden (1998); Colton et al. (2009); McCormack and d’Inverno (2012)).

At the outset it may be remarked that there are diverse perspectives on serendipity both in the theoretical literature as well as in applied work. Usage of the term is particularly ambiguous when viewed across different computational sub-fields. In the recommender systems context, the dominant, though not exclusive, view is that serendipitous recommendations characterise items that are both surprising and valuable for the user (Lu et al., 2012; Herlocker et al., 2004). In planning, serendipity is “supposed to be driven by unexpected plan successes, expected but uncertain opportunities, and unexpected plan failure” (Nelson, 2017), as for example in the onboard planner for NASA’s Deep Space One mission (Muscettola et al., 1997). In their human-robot interaction scenario, Chakraborti et al. (2015) consider serendipity to be “the occurrence or resolution of facts in the world such that the future plan of an agent is rendered easier in some measurable sense.” Here, the robot engages in planning in order to help achieve a human-sought goal. However, this understanding appears to conflict with the typical understanding of the concept of serendipity in a scientific context, as a strictly unplanned discovery (Roberts, 1989). This diversity further motivates a return to the foundational literature.
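The recommender-systems reading of serendipity as “surprising and valuable” lends itself to a simple scoring sketch. Everything below — the token-overlap surprise proxy, the function names, and the example items — is an illustrative assumption of ours, not drawn from the systems cited above:

```python
# Illustrative sketch: serendipity as "surprising AND valuable" for a
# recommended item, following the recommender-systems usage discussed above.
# The surprise proxy, names, and example data are hypothetical placeholders.

def surprise(item: str, user_history: set[str]) -> float:
    """Crude surprise proxy: 1.0 if the item shares no tokens with the
    user's history, scaled down by token overlap otherwise."""
    tokens = set(item.split())
    seen = set().union(*(set(h.split()) for h in user_history)) if user_history else set()
    if not tokens:
        return 0.0
    return 1.0 - len(tokens & seen) / len(tokens)

def serendipity_score(item: str, user_history: set[str], value: float) -> float:
    """Combine surprise and an externally estimated value; both in [0, 1]."""
    return surprise(item, user_history) * value

history = {"space opera novel", "hard sf anthology"}
print(round(serendipity_score("gardening handbook", history, value=0.9), 2))  # 0.9
print(round(serendipity_score("space opera sequel", history, value=0.9), 2))  # 0.3
```

Multiplying the two terms encodes the conjunctive reading: an item that is valuable but unsurprising (or vice versa) scores low.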

    2.1 Etymology and selected definitions

The English term “serendipity” derives from Horace Walpole’s interpretation of the first chapter of the 1302 poem Eight Paradises—in a French translation of an intermediate Italian version of the Persian original—written by the Sufi poet Amīr Khusrow (van Andel, 1994; Remer, 1965). Related folktales tell similar stories (Mazur, 2016, p. 225). The term “serendipity” first appears in a 1757 letter from Walpole to Horace Mann:

“This discovery is almost of that kind which I call serendipity, a very expressive word . . . You will understand it better by the derivation than by the definition. I once read a silly fairy tale, called The Three Princes of Serendip: as their Highness travelled, they were always making discoveries, by accidents & sagacity, of things which they were not in quest of [.]” (Walpole, 1937, pp. 407–408)

Silver (2015) convincingly argues that Walpole appropriated the underlying concepts from Francis Bacon, who in turn leaned on classical Greek mythology. Following Walpole’s coinage, “serendipity” was mentioned in print only 135 times over the next 200 years, according to a survey carried out by Robert Merton and Elinor Barber, collected in The Travels and Adventures of Serendipity (Merton and Barber, 2004). Merton described his own understanding of a generalised “serendipity pattern” and its constituent parts:

“The serendipity pattern refers to the fairly common experience of observing an unanticipated, anomalous and strategic datum which becomes the occasion for developing a new theory or for extending an existing theory.” (Merton, 1948, p. 506) [emphasis in original]

In Merton’s account, the unanticipated datum is observed while investigating some unrelated hypothesis; it is a “fortuitous by-product” (ibid.). It is anomalous because it is inconsistent with existing theory or established facts, prompting the investigator to try to unravel the inconsistency. The datum becomes strategic when the implications of such investigations are seen to suggest new theories, or extensions of existing theories.
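Merton’s notion of an anomalous datum — one the current theory fails to account for — can be loosely sketched in computational terms. Here the “theory” is a crude Gaussian summary of past observations, and the z-score threshold is an assumption for illustration only:

```python
# Illustrative sketch of Merton's "unanticipated, anomalous" datum: flag an
# observation that the current theory (here, a crude Gaussian summary of past
# data) fails to account for. Names and thresholds are assumptions, not
# anything specified by Merton (1948).
import statistics

def anomalous(datum: float, past: list[float], z_threshold: float = 3.0) -> bool:
    """True if the datum lies far outside what past observations predict."""
    mean = statistics.mean(past)
    sd = statistics.pstdev(past) or 1e-9  # guard against zero spread
    return abs(datum - mean) / sd > z_threshold

past_results = [9.8, 10.1, 9.9, 10.0, 10.2]
print(anomalous(10.05, past_results))  # consistent with theory -> False
print(anomalous(14.0, past_results))   # anomalous, candidate "strategic" datum -> True
```

Whether a flagged anomaly becomes strategic is, on Merton’s account, a further interpretive step that no threshold can capture on its own.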

Roberts (1989, pp. 246–249) records 30 entries for the term “serendipity” from English language dictionaries dating from 1909 to 1989. While classic definitions required an accidental discovery, as per Walpole, this criterion was modified or omitted later on. Roberts gives the name pseudoserendipity to “sought findings” in which a desired discovery nevertheless follows from an accident. Makri and Blandford (2012a,b) point to a continuum between sought and unsought findings, and highlight the role of subjectivity both in bringing about a serendipitous outcome, and in perceiving a particular sequence of events to be “serendipitous.” Many of Roberts’ collected definitions treat serendipity as a psychological attribute: a “gift” or “faculty.” Along these lines, Jonathan Zilberg asserts:

“Chance is an event while serendipity is a capability dependent on bringing separate events, causal and non-causal, together through an interpretive experience put to strategic use.” (Zilberg, 2015, p. 79)

Numerous historical examples exhibit features of serendipity and involve interpretive frameworks that are deployed on a social rather than on an individual scale. For instance, between Spencer Silver’s creation of high-tack, low-adhesion glue in 1968, Arthur Fry’s invention of a sticky bookmark in 1973, and the eventual launch of the distinctive canary yellow re-stickable notes in 1980, there were many opportunities for Post-its® to not have come to be (Flavell-While, 2012). Merton and Barber argue for integrating the psychological and sociological perspectives on serendipity:

“For if chance favours prepared minds, it particularly favours those at work in microenvironments that make for unanticipated sociocognitive interactions between those prepared minds. These may be described as serendipitous sociocognitive microenvironments.” (Merton and Barber, 2004, pp. 259–260)

Large-scale scientific and technical projects generally rely on the convergence of interests of key actors and various other cultural factors. For example, Eco (2013) describes the historical role of serendipitous mistakes, falsehoods, and rumours in the production of knowledge.

    2.2 Theories of serendipity and creativity

Serendipity is typically discussed in the context of discovery. In everyday parlance, this concept is often linked with invention or creativity (Jordanous and Keller, 2016). However, Henri Bergson drew the following distinction:


“Discovery, or uncovering, has to do with what already exists, actually or virtually; it was therefore certain to happen sooner or later. Invention gives being to what did not exist; it might never have happened.” (Bergson, 1946, p. 58)

We suggest that serendipity should be understood in terms of both discovery and invention: that is, the discovery of something unexpected in the world and the invention of an application for the same. Indeed, these terms provide convenient labels for the two-part model introduced by André et al. (2009a), encompassing the “chance encountering of information” followed by “the sagacity to derive insight from the encounter.” McKay (2012) draws on the same Bergsonian distinction to frame her argument about the role of serendipity in artistic practice, where discovery and invention can be seen as ongoing and diverse. This underscores the relationship between serendipity and creativity. At the same time, looking at Bergson helps to sharpen the challenge faced in any programmatic approach to the subject matter:

“[A] city can be constructed by photographs taken from every possible angle, yet this can never provide the experiential, intuitive value of walking in the city itself. . . . Within this durational context the free and intuitive action ‘drops from [the self] like an overripe fruit’. This drop may be seen as the moment of recognition within serendipity, involving a coincidence of prepared interior capacity with exterior conditions, in other words, collaboration between oneself and la durée.” (McKay, 2012, p. 10)

The tension between programmatic preparedness and in-the-world action is frequently engaged with in the computational creativity literature; “mere generation” is typically not deemed to be creative. Whilst the underlying definitions of creativity vary, two standard criteria are variously given as “novelty and utility,” or “originality and effectiveness” (Newell et al. (1963); Boden (2004); Runco and Jaeger (2012)). With a somewhat different emphasis, Cropley (2006) draws on Austin (1978) to infuse his concept of creativity with features of chance, and understands a creative individual to be someone who “stumbles upon something novel and effective when not looking for it.” However, Cropley questions “whether it is a matter of luck,” because of the work and knowledge involved in the process of forming an assessment of one’s findings. Campbell (1960) argues that “all processes leading to expansions of knowledge involve a blind variation-and-selective-retention process.” However, Austin (1978, p. 49) remarks that: “Nothing [suggests that] you can blunder along to a fruitful conclusion, pushed there solely by external events.”

Csíkszentmihályi describes creativity similarly to Merton’s unanticipated, anomalous and strategic datum, as it arises and develops in a social context.

“[C]reativity results from the interaction of a system composed of three elements: a culture that contains symbolic rules, a person who brings novelty into the symbolic domain, and a field of experts who recognize and validate the innovation.” (Csíkszentmihályi, 1997, p. 6) [emphasis added]


In this case, novelty is attributed to “a person”: even so, it is reasonable to assume that this person’s novel insights rely at least in part on the observation of data. Csíkszentmihályi’s three-part model of the creative process can be compared with his five-part phased model, comprising preparation, incubation, insight, evaluation, and elaboration (Csíkszentmihályi (1997, pp. 79–80), adapting Wallas (1926)). Campos and Figueiredo (2002) use this later model to describe instances of serendipitous creativity.

The more elaborate model is also a near match to the process-based model of serendipity from Lawley and Tompkins (2008), centred on a sequence of component-steps: prepared mind, unexpected event, recognise potential, seize the moment, amplify effects, and evaluate effects. However, Lawley and Tompkins’s model includes a feedback loop between “recognising potential” and “evaluating effects” that has no parallel in the Wallas/Csíkszentmihályi model. Moreover, they remark:

“[S]ometimes the process involves further potentially serendipitous events [a]nd sometimes it further prepares the mind (at which time learning can [be] said to have taken place)” (Lawley and Tompkins, 2008)

Makri and Blandford (2012a) propose a model that adapts Lawley and Tompkins, notably by combining the “prepared mind” and “unexpected event” into one first step, a new connection, which involves a “mix of unexpected circumstances and insight.” Expanding on the notion of a feedback loop, they suggest that a parallel process of reflection into the “unexpectedness of circumstances that led to the connection and/or the role of insight in making the connection” is important for the subjective identification of serendipity. Projections of value can be updated when the new connection is exploited—for example, when it is discussed with others.

Allen et al. (2013) studied how the term serendipity and its various synonyms and related terms have been used to describe opportunistic discovery in the biomedical literature. Three categories of usage were particularly salient: inspiration, mentioned findings, and research focus. These categories of usage roughly parallel Merton’s serendipity pattern and Csíkszentmihályi’s three-part creativity framework. A fourth category, systematic review, highlighted scholarly interest in the topic of serendipity itself. On this note, Björneborn (2017) surveys several theoretical treatments beyond those mentioned above, and extracts diverse personal and environmental factors that can promote serendipity. We will engage with his work later on, but for now, we have enough material to assemble themes in line with our objective.

    2.3 Distilling the literature into a framework

The different treatments of serendipity in many cases appear to build on one another, and in all cases appear to be roughly aligned. Accordingly we can distil the foregoing survey into a framework that describes serendipitous phenomena in terms of six phases: perception, attention, focus shift, explanation, bridge, and valuation. Table 1 shows graphically how we have drawn out these concepts.

Table 1: Aligning ideas from several theories of serendipity and creativity. Rows 1–7 show increasing detail, moving from two to six phases; row 8 bundles two of the steps together; row 9 summarises our analysis and provides the framework for Section 3. Sources: (1) Bergson (1946); (2) André et al. (2009a); (3) Csíkszentmihályi (1997); (4) Allen et al. (2013); (5) Merton (1948); (6) Wallas (1926) (as adapted by Csíkszentmihályi); (7) Lawley and Tompkins (2008); (8) Makri and Blandford (2012a).

Serendipity is . . .
(1) discovery | invention
(2) chance encountering of information | sagacity to derive insight
(3) symbolic rules (that do not directly account for newly-encountered data) | novelty | validation
(4) mentioned findings | inspiration | research focus
(5) unanticipated datum | anomalous datum | strategic datum | new or modified theory
(6) preparation (including observations) | incubation | insight | evaluation | elaboration
(7) prepared mind | unexpected event | recognise potential | seize the moment | amplify effects | evaluate effects
(8) new connection | project value | exploit connection | valuable outcome | reflect on value
(9) perception of a chance event | attention to salient detail | focus shift achieved by interest | explanation of the event | bridge to a problem | valuation of the result
All of which are operations of a prepared mind subject to chance.

In the following paragraphs, we trace through the rows of Table 1 line by line, resummarising earlier perspectives on serendipity and drawing connections between these earlier theories and our framework. Here we use boldface to distinguish elements of earlier theories, and italics to distinguish elements of our framework.

(1) From Bergson (1946): we take the notion of discovery to entail perception and attention, which can potentially lead to a focus shift. In cases of serendipity, we understand invention to build on a discovery, through the generation of a novel explanation and a bridge to a newly identified problem that the explanation solves. The solution is then evaluated positively.

(2) From André et al. (2009a): chance encountering of information explicitly indicates perception of a chance event. We take attention to be implicit. We understand the phrase sagacity to derive insight to encapsulate what we mean by focus shift, explanation, bridge, and valuation.


(3) From Csíkszentmihályi (1997): the three-part model of creativity concerns interactions between a Domain, a Field, and an Individual (often collectively abbreviated as “DFI”). In cases of serendipitous creativity, the following occurs. A chance event is perceived that cannot be fully explained when attended to through the rubric of known symbolic rules which comprise a specific cultural Domain. A creative Individual is then inspired by the event’s novelty to achieve a focus shift, namely, to examine the unexplained details and generate an—a fortiori also-novel—explanation of the event. Finally, their finding is validated by a Field of experts when the explanation can be bridged to some (new or existing) problem that it solves, in which case the process is deemed creative, and given a positive evaluation.

(4) From Allen et al. (2013): the category of mentioned findings suggests perception of a chance event and attention to salient detail; their category inspiration suggests a potential focus shift leading to an effort to explain the event with a research design that explores the serendipitous inspiration; their category research focus focuses on better understanding a “fortuitous discovery” or “unanticipated finding” to establish a bridge to a problem that the discovery solves, towards evaluating the result.

(5) From Merton (1948): the observation of an unanticipated datum aligns with perception of a chance event that captures our attention: it is a “fortuitous” discovery (p. 506). Subsequent interest in the anomalous nature of the datum causes a focus shift towards a strategic explanation of the anomaly, leading to the bridge from the anomalous detail to new theoretical insights. The new (or extended) theory initiated by these investigations receives an at least preliminarily positive valuation.

(6) From Wallas (1926): preparations (among which we include observations) afford the perception of a chance event. Note that such preparations are relevant both to observing the event, and to recognising it as unexpected. During a period of incubation, the perceiver’s attention may be turned towards salient details that can lead to an insight which then leads to an explanation of the event. Here we run into some terminological collisions. What we call the bridge to a problem could be linked to the insight stage, but we may also think of it as rather close to what Wallas calls evaluation, insofar as the problem that is identified at this stage is what makes the insight useful. In the phase of elaboration (introduced by Csíkszentmihályi) the finding undergoes further evaluation in new contexts.

(7) From Lawley and Tompkins (2008): the prepared mind is relied upon at several stages in the process; indeed, as we described above, we see the prepared mind as vitally active throughout. In the first instance, we can connect it with these authors’ usage of the term “perception.” As we noted earlier with reference to Clark’s theory of predictive processing, the mind’s previous preparations are what make the unexpected event unexpected. Previous preparations can either prevent or allow recognising potential in a given observation, in part because these preparations constrain how and whether the individual pays attention to the event, and whether or not they achieve a focus shift. Only when the aforementioned steps have occurred might the person seize the moment to form a contextual explanation of the event; and amplify effects by finding a bridge to a problem that the explanation can solve. Once all of this is done, then the agent may evaluate effects. Note the role for a prepared mind in our sense—as active throughout the process—in supporting the “iterative circularity” that Lawley and Tompkins say may motivate several passes of recursion over the steps between evaluating effects and recognising potential, as well as the role of chance in producing opportunities to learn.

(8) Makri and Blandford (2012a) follow Lawley and Tompkins in including feedback loops explicitly in their model. Their model posits a new connection to be formed by the perception of a chance event and attention to salient detail, which then leads the potential experiencer of serendipity to project value. This subsequently leads to a focus shift when the individual in question exploits the new connection. We assume this is done in a somewhat explicable or predictable way. Makri and Blandford assert that this itself is already a valuable outcome, i.e., it solves some problem directly; by reflecting on its value the agent may bridge to a (further) problem. An interesting aspect of the Makri and Blandford model is that valuation is somewhat ongoing, and reflecting on value may feed back into the earlier part of the process that projected value, leading to renewed interest. As the process iterates, additional bridges to new problems are created, or some particular problem is understood in more detail.

    2.4 Summary

Our review of significant literature on serendipity leads us to key features of system operation that can be described as serendipitous. Underpinning our analysis are foundations based on the roles of chance and the prepared mind. Highlights are summarised in Table 1; terms in the table are explained in the above sections. Building on the literature surveyed above, we describe serendipity as a form of creativity that happens in context, on the fly, with the active participation of a creative agent, but not entirely within that agent’s control.

While the various theories we have examined differ from one another as to where “insight” takes place in the process—and some do not mention this term—none of them seems to endorse a theory of uninsightful serendipity. Nevertheless, Copeland (2017) has argued that “the insight of the individual is insufficient for bringing about a serendipitous, scientific discovery,” and makes a case for an understanding of serendipity that “goes beyond the cognitive.” We agree with Copeland that a contextual perspective is necessary, and we will return to this theme in what follows: nevertheless an agent (or agency, per Minsky (1988)) that experiences serendipity is also necessary, and a natural place to begin modelling work.


3 A computational model and evaluation framework for assessing the potential for serendipity in computational systems

This section develops cognitively and computationally realistic definitions for each of the six concepts from our synthesis of theories in Section 2. We begin in Section 3.1 with a high-level schematic diagram that shows how the six phases might in principle be manifested together in a computational system. To demonstrate that the schematic realistically captures common understandings of serendipity, we use it to redescribe a famous historical case of serendipity: the invention of Post-it® Notes at 3M. This prepares the ground for Section 3.2, where we present informally-stated but practically-inspired definitions of each of the six terms. We support the definitions with existing foundational theories from philosophy and cognitive science, and, for each, outline a set of heuristics to inform future implementation work, inspired by existing implementations.

It is important to note that each of the six phases in the model has a wide horizon, often encompassing both good-old-fashioned AI and contemporary approaches.3 We must therefore be selective rather than comprehensive in our approach to the literature, towards our overall aim: to show how computation might be employed to produce serendipitous results. Section 4 will then use this model to comprehensively assess the potential for serendipity in discrete implemented systems, particularly for computational creativity.

    3.1 A process model and rational reconstruction of a historical case study

Figure 1 places the six phases discussed above into a diagram outlining the idealised implementation of a (potentially) serendipitous system. Some steps are expanded in more detail than others. Other architectures might foreground different kinds of feedback between the main steps, but to keep things simple we have not shown all possible ways in which the process might revisit earlier steps as it runs. We illustrate how the diagram works in a rational reconstruction of the invention of Post-Its® at 3M (quotes below are from Fry and Silver (2010)). The level of detail and specificity is intermediate between the abstract overview from the previous section and the definitions and heuristics that will be advanced in Section 3.2. Before developing definitions of the individual components, it is useful to have an example that puts the whole process together, i.e., making the interconnections between the phases explicit. One immediate challenge arises in building a rational reconstruction of the Post-Its® example: the story includes several steps that could informally be called “serendipitous” in light of the success that follows. Our reconstruction is focused by this aim: to illustrate how a modular architecture like the one illustrated can create serendipitous results—in this case, using a social rather than computational infrastructure.

3 For example, “Machine Perception and Artificial Intelligence” is the title of a book series published by World Scientific that began in 1992 and currently contains 83 volumes: https://www.worldscientific.com/series/smpai.


[Figure 1 appears here. Its “Discovery” half shows a generative process emitting an event E (perception), then a reflective process with functions p1 and p2 and a feedback loop (attention) producing a trigger T, then a selection (focus shift) yielding the interesting trigger T⋆ (interest). Its “Invention: verification” half shows an experimental process with functions p′1 and p′2 (explanation) producing a model M, a creative process (bridge) connecting M to a problem P, and an evaluation process (valuation).]

Fig. 1: A boxes-and-arrows diagram, showing one possible process model capable of producing serendipitous results.
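To make the flow through Figure 1 concrete, the six phases can be sketched as a chain of functions. This is a toy sketch under our own assumptions: the random event source, the “extremeness” test for interest, and the dictionary-based problem store are illustrative stand-ins, not part of the model itself.

```python
import random

def perceive(rng):
    # Perception: a chance event E, arising partly outside the system's control.
    return rng.random()

def attend(E):
    # Attention: p1 notices aspects of E, p2 enriches them into the trigger T.
    salient = {"value": E, "is_extreme": E > 0.9 or E < 0.1}
    return {"event": E, "salient": salient}

def focus_shift(T, threshold=0.5):
    # Interest: T becomes the interesting trigger T* only past a threshold.
    interest = 1.0 if T["salient"]["is_extreme"] else 0.0
    return T if interest > threshold else None

def explain(T_star):
    # Explanation: an experimental process (p'1, p'2) yields a model M.
    return {"model": "extreme-value generator", "evidence": T_star["event"]}

def bridge(M, problems):
    # Bridge: connect the model M to some problem P that it solves.
    return next((P for P in problems if P["needs"] == M["model"]), None)

def evaluate(M, P):
    # Valuation: positive if the bridged problem is actually solved.
    return 1.0 if P is not None else 0.0

def run_once(rng, problems):
    T = attend(perceive(rng))
    T_star = focus_shift(T)
    if T_star is None:
        return 0.0  # no focus shift this round: nothing to explain or bridge
    M = explain(T_star)
    return evaluate(M, bridge(M, problems))
```

Run repeatedly, most rounds produce nothing; occasionally a chance event survives all six phases and is positively valued, which mirrors the point that serendipitous results cannot be produced on demand.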

Perception of a chance event The first module is a generative process. In an implementation, this may be based on direct observations of the world and/or system-internal sources of chance such as a random number generator. The output of the generative module is understood as a chance event, E, that has been perceived by the system. It is passed to the next stage.

Example In the 3M case study the event of interest was generated by Spencer Silver’s work in a team carrying out research on “pressure-sensitive adhesives.”

Spencer Silver: “As part of an experiment, I added more than the recommended amount of the chemical reactant that causes the molecules to polymerise. The result was quite astonishing. Instead of dissolving, the small particles that were produced dispersed in solvents. That was really novel and I began experimenting further. Eventually, I developed an adhesive that had high ‘tack’ but low ‘peel’ and was reusable.”

Here we take E to include not only the bare fact of the adhesive’s creation, but also Silver’s preliminary assessment. Simply put, the new high-tack, low-peel adhesive would not have been created had the reaction not captured Silver’s attention and interest. However, we certainly cannot explain the serendipitous invention of Post-Its® with reference to these acts alone. With regard to social infrastructures, as Minsky (1988, p. 23) remarked, “It is not enough to explain only what each separate agent does. We must also understand how those parts are interrelated—that is, how groups of agents can accomplish things.”


Attention to salient detail In this stage certain aspects of E will be marked up as being of potential interest, leading to T in the figure. This designation does not in general arise all at once. T is considered to be the result of feedback, an abstraction over a more complex reflective process. In Figure 1, the reflective process makes use of two primary functions: p1 notices particular aspects of E, and another, p2, applies processing power and background knowledge to enrich E with additional information. We could call p1 awareness, and p2 concentration. There may be several rounds of feedback applied (recursively) in order to construct T. Looking ahead to the next phase, T will serve to trigger subsequent interest: but notice that the system is explicitly involved in creating T, which does not simply arrive wholly formed. Nevertheless, at this stage there is little direct evidence of how it will be used later.

Example In the 3M case study the key aspects of the reflective process were implemented by Silver (who spread awareness of the new adhesive) together with other employees (who developed a prototype product and gave the topic further concentration).

Spencer Silver: “[T]he company developed a bulletin board that remained permanently tacky so that notes could be stuck and removed. But I was frustrated. I felt my adhesive was so obviously unique that I began to give seminars throughout 3M in the hope I would spark an idea among its product developers.”

Art Fry: “I was at the second hole on the golf course, talking to the fellow next to me from the research department when he told me about Spencer Silver, a chemist who had developed an interesting adhesive. I decided to go to one of Spencer’s seminars to learn more. I worked in the Tape Division Lab, where my job was to identify new products and build those ideas into businesses. I listened to the seminar and filed it away in my head.”

Focus shift achieved through interest The trigger T thus consists of the original event, E, together with a range of newly-added metadata and markup. A focus shift selects («) some elements from this complex object, potentially using them to retrieve additional data. The result is “of interest,” denoted above by T⋆.

Example In the 3M case study, the information that Fry had filed away before (T) became interesting when he realised that he “had a [related] practical problem” (T⋆).

Art Fry: “I used to sing in a church choir and my bookmark would always fall out, making me lose my place. I needed one that would stick but not so hard that it would damage the book. The next morning, I went to find Spencer and got a sample of his adhesive.”


In this case, the adhesive becomes interesting insofar as it could potentially be used to create a re-stickable bookmark. 3M allowed its employees to selectively allocate 15% of their time (Flavell-While, 2012), and Fry decides to initiate his own experiments.

Explanation of the event The now-interesting trigger T⋆ is submitted for verification, which Figure 1 depicts as an abstraction over an experimental process, whose operations here again consist of two primary functions: theory generation, p′1, and theory checking, p′2. The result of this process is a model, M. The dashed arrow in the diagram is meant to indicate that the focus shift stage may be revisited and new selections made as this process progresses, i.e., T may become interesting for new or different reasons as the experimental process progresses.

Example In the 3M case study, Fry already has in mind the theory (p′1) that re-stickable bookmarks can be made using the new adhesive. Fry creates and adjusts a working prototype (p′2) on the way to verifying his theory.

Art Fry: “I made a bookmark and tried it out at choir practice; it didn’t tear the pages but it left behind some adhesive. I needed to find a way to keep the particles of the adhesive anchored to the bookmark. After a few experiments, I made a bookmark that didn’t leave residue and tested it out on people in the company.”

Note that in this case the event E has not been explained in terms of “how” but rather, contextually, in terms of “so what?” The nature of the explanation will differ from case to case. The common feature is the creation of a causal model of some sort. In this case, the causal model M is a method for creating re-stickable bookmarks that don’t leave residue.

Bridge to a problem Here the system forms a connection (“bridge”) between the explanation in the form M and some as-yet-unspecified problem, P. The schematic represents this step in one block, a creative process. This is clearly underspecified, but we shall describe different possible implementation strategies shortly, in Section 3.2.

Example Let’s see how this process worked in the 3M case study. Fry now had a prototype, but so far it didn’t solve a very interesting problem. (“They liked the product, but they weren’t using them up very fast.”) But then:

Art Fry: “[O]ne day, I was writing a report and I cut out a bit of bookmark, wrote a question on it and stuck it on the front. My supervisor wrote his answer on the same paper, stuck it back on the front, and returned it to me. It was a eureka, head-flapping moment – I can still feel the excitement. I had my product: a sticky note.”

It would seem that no one, including Fry, had thought about this problem before: how can we easily make notes on a document, without marking up the document itself, and without introducing other separate sheets of paper that would need to be stapled or paper-clipped to the document, or that might get lost?

Indeed, without knowing the solution in advance, or having M in mind and re-stickable bookmarks to hand, the problem might even sound like a contradiction in terms. It would probably have been impossible to solve it very well using conventional methods (Altshuller, 2007, p. 90). But remember that Fry was part of the Tape Division. By cutting off a piece of the bookmark, and affixing it to the front of the report, he was using the bookmark like one might have used a piece of tape—which would have been another semi-conventional solution, different from staples and paperclips, for affixing a separate sheet of paper. However, the new “sticky note” had several advantages over tape: it could be written on directly and easily removed later. Thus, we may rationally reconstruct the bridge to P via an intermediate virtual solution of a note taped to the report’s cover.

Valuation of the result The new problem, P, which now conveniently has a solution in the form of M, is passed to an evaluation process, and, from there, to further applications. One possible class of applications would be a change to any of the modules that participated in the workflow, corresponding to the potential for learning from serendipitous events noted by Lawley and Tompkins (2008).

Example The 3M example shows that evaluation can itself be a complex process:

Art Fry: “We made samples to test out on the company and the results were dramatic. We had executives walking through knee-deep snow to get a replacement pad. It was going to be bigger than Magic Tape, my division’s biggest seller. In 1977, we launched Post-it Notes in four cities. The results were disappointing and we realised we needed samples. People had to see how useful they were. Our first samples were given out in Boise, Idaho and feedback was 95 per cent intent to re-purchase. The Post-it Note was born.”

    Notice that in this case the approach to valuation is itself updated on the fly.

    3.2 Definitions of the model’s component terms

We now present short definitions of each component, which we support with references to foundational literature from cognitive science and philosophy, as well as heuristics that relate to the current status of implementation work as evidenced by computing literature. Our thinking in this section is informed by the “predictive processing” framework advocated for example by Friston (2009), Clark (2013), and others. A central idea in such theories is that perceived events are only passed forward to higher cognitive layers if they do not conform with our prior expectations. This perspective highlights the fact that, going beyond Pasteur’s famous idiom, chance not only favours, but also shapes the prepared mind. Thus, for example, Boden (2004, p. 137) notes that “neural networks learn to associate (combine) patterns without being explicitly programmed in respect of those patterns.”

Multi-level architectures abound in AI; one example comes from Singh and Minsky (2005), where the first level beyond “innate reactions” is “learned reactions”; higher levels include “deliberative thinking”, “reflective thinking”, “self-reflective thinking” and “self-conscious thinking.” Sloman and Scheutz (2002) place somewhat similar concepts in a two-dimensional schema which they suggest can be used to compare different architectures. While sharing concepts of hierarchical control with such models, theories based on predictive processing “upend” the classical input/output paradigm, recentring on thermodynamic energy transfer: their models of control are continuous and “there are no disconnected moments of perception of the world, since the world wholly envelops the agent throughout its lifespan” (Linson et al., 2018, pp. 9–10). Kockelman (2011) develops a related line of thinking from a semiotic perspective, pointing out that processes of “sieving” and “selection” are not just properties of the mind but also of the environment. Upon considering these reflections, we cannot subscribe to the view that serendipity is “a process of discovering with a completely open mind” (Darbellay et al., 2014). The mind will in general have been shaped by previous interactions with the world.

Furthermore, while we necessarily must present the phases of our model in order, we hereby make explicit the assumption that phases encountered earlier can be returned to from temporally-later ones. Because the phases build on one another, we propose that they must be encountered in temporal order, backward-directed moves notwithstanding. In other words, we allow the process to jump backward, and only jump forward to steps that have been encountered already. This does not imply that future steps are always entirely impossible to anticipate, however. Thus, for example, Pasteur’s research has been described as “use-inspired” (Stokes, 1997). Some famous pseudoserendipitous discoveries, such as the treatment of disease with safe antibiotics, were pursued in broad outline long before the details became clear (Fleming, 1964).

In this respect we note that Friston’s model of predictive processing makes more specific and detailed assumptions about structure and interconnection than we will adhere to here, namely that “error-units receive messages from the states in the same level and the level above; whereas state-units are driven by error-units in the same level and the level below” (Friston, 2009, p. 297). In simpler biologically-inspired terms, “the brain generates top-down predictions that are matched bottom-up with sensory information” (Bruineberg et al., 2018, p. 2). The mismatch between sense data and existing generative models (which on this view are ubiquitous) is how prediction errors are said to arise, which the system then strives to correct. Here, the simpler account of interconnections between the modules that we developed in Section 3.1 guides our work. Our model also has integral generative aspects, but they differ at the different phases.


To emphasise, our intention in this section is to give a plausible general account of the six phases of which our model is comprised: we offer a top-down analysis. Accordingly, we do not give exhaustive technical definitions, nor do we make detailed assumptions about the overall architecture. The heuristics are intended to present practical advice that could be used to increase a system’s serendipity potential with respect to each phase.

Primitive Notions We cast our definitions in terms of the primitive notions of “system”, “event”, “context”, and “object”, which for conciseness we describe by way of examples.

– By system we here mean a computational system. Extant examples include systems which generate music, art or poetry; make discoveries in drug design; generate scientific discoveries or prove mathematical theorems; carry out predictive modelling, such as classifying an email as spam or not; personal conversational assistants; systems which play games such as Go or Chess; and so on. The system has the ability to process data and make evaluative decisions.

– By event we mean some form of input or generated data. Examples could include partial fragments of music, art or poetry which have been input or generated; external events such as new data in a dynamic world, a new classification weighting, an email to classify, a conversational turn, a piece of information about the weather, a Chess move, and so on.

– By context we mean a specific set of events, data, algorithms, generative and evaluative mechanisms, search strategies, etc., that a system can access, and which may be related to a current goal or problem. An example of a context is a particular sequence of questions asked to a conversational agent, possible answers and their sources, a way of ranking possible answers, and any other information or techniques necessary to produce an answer.

    – An object is an element of a context.
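For concreteness, the four primitives might be rendered as lightweight types. This encoding, including all field and method names, is our own illustration, not something the definitions require.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List

@dataclass
class Event:
    # Some form of input or generated data, e.g. a chess move or an email.
    payload: Any

@dataclass
class Context:
    # Events, data, and evaluative mechanisms a system can access;
    # an "object" is simply an element of a context.
    objects: List[Any] = field(default_factory=list)
    evaluate: Callable[[Any], float] = lambda obj: 0.0  # evaluative mechanism
    goal: str = ""

@dataclass
class System:
    # A computational system able to process data and make evaluative decisions.
    contexts: List[Context] = field(default_factory=list)

    def process(self, event: Event, context: Context) -> float:
        context.objects.append(event.payload)
        return context.evaluate(event.payload)
```

A conversational agent, for instance, could be modelled as a System whose Context holds the questions asked so far plus a ranking function as its evaluative mechanism.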

Definition 1. Perception: The processing of events that arise at least partially as the result of factors outside of the system’s control.

Foundations
System-environment relationships differ widely, and develop differently. The environment may be more or less observable; events may appear to be more deterministic or more stochastic in nature (Russell and Norvig, 2003, pp. 42–44). The system may be able to self-program using the environment, possibly via interaction with other systems (Clark, 1998, esp. p. 234). The system’s perceptual features and limitations can vary with time, location, the state of development of the system, and other factors.

Chance can play various roles in shaping perception. For Hume (1904, p. 99) chance denotes the absence of an explanation; for Peirce (1931) it is one of several fundamental aspects of reality; for Bergson (1911, p. 234), it “objectifies the state of mind” of one whose expectations are confounded. Unexpected events constitute novel perceptions, and can motivate action that leads to further perceptions.

The system has limited control. The world is not entirely under the control of the system; furthermore, perceptions necessarily constitute an incomplete picture of reality (Hoffman et al., 2015). As in Figure 1 and its accompanying discussion, our model allows events to arise through generative methods, but this again implies a circumscribed locus of control, namely, over the generative process, but not necessarily over the results.4 Taking a view grounded in predictive processing, Linson et al. (2018, pp. 2, 17–18) emphasise the epistemic and existential salience of generative models (and continuous action/perception loops, including proprio- and intero-ception) for both organisms and future robots. The basic view is that “we harvest sensory signals that we can predict” (Friston, 2009), though such predictions are fallible.

Heuristics
To create the possibility for varied patterns of inference to arise, support rich interfaces. Computer support for natural language interaction remains limited. Human-Computer Interaction researchers have experimented with a wide range of alternative interface designs (e.g., ranging from head tracking and gesture tracking (Turk, 2000) to interaction through dance (Jacob and Magerko, 2015) and with physical models (Stopher and Smith, 2017)).

To reduce constraints, allow features to be defined inductively. Rather than building systems that simply notice pre-conceived features of the environment, recent research has dealt with systems that independently discover perceptible features (Mordvintsev et al., 2015).

Organise and process perceptions differently depending on the tasks undertaken. Humans have head-direction and grid cells that help define our relationship to the environment, and that support spatial navigation tasks. Similar phenomena have been reproduced in machine learning programs (Banino et al., 2018; Cueva and Wei, 2018). However, AI systems often operate in environments that are structured very differently from their human analogues, e.g., when machine learning is applied to text corpora. Rather than adjusting the underlying source of perceptions, it may be preferable to build constraints on action that give an “explicit characterization of acceptable behavior” (Caliskan et al., 2017, p. 356) within the environment.

Definition 2. Attention: The system’s directed processing power applied to a perceived event, which is accompanied by an initial evaluation.

Foundations
Adaptive attention is related to surprise. According to Clark (2013), an event only draws attention when the perceiving agent did not anticipate it.

4 For example, there is a difference between generating elements in a sequence of 1’s, the elements of which are predictable a priori, and generating additional digits of the decimal expansion of π, which should be replicable a posteriori but which in practice involves nontrivial computation.


Learning, context, and meaning begin to arise together with attention. “Punctuating events” (Bateson, 1972, p. 301) from a stream of data is a basic form of attention. Identifying patterns that are stable over time, which then begin to give the data “context and interpretation” (Rowley, 2007), is another.

To some approximation, features of the environment will be attended to. This is a version of the hypothesis that hierarchical structures in the environment will be mirrored by adaptive agents (Simon, 1962, 1995). Outside intervention may be needed to optimise learning about tasks with complicated problem/subproblem structure (Goldenberg et al., 2004).

Heuristics
Attention can be understood as competition for scarce processing resources. For example, visual attention has been described this way (Helgason and Thórisson, 2012), and parallels can be seen in grammar-inducing processes (Wolff, 1988). Taken as a metaphor, this extends to “the mental grammar of the investigator” and the way they “parse their conceptual domain” (Dixon, 2004).

Attention can be time-delineated. In his design of the discovery system AM, Doug Lenat assigned “a small interestingness bonus” (Lenat and Brown, 1984, p. 281) to each new concept the system created. The bonus decayed rapidly with each new task undertaken, but in the meantime, it made the new concept more likely to be used. This was inspired by a similar but more complex “Focus of Attention” facility in the blackboard system Hearsay-II (Lesser and Erman, 1977).
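The mechanism can be sketched as a simple agenda; the data structure and decay constants below are our own guesses for illustration, and AM's actual "worth" machinery was considerably more elaborate.

```python
class Agenda:
    """A toy agenda in which newly created concepts get a decaying bonus."""

    def __init__(self, bonus=100.0, decay=0.5):
        self.bonus = bonus    # interestingness bonus for a brand-new concept
        self.decay = decay    # multiplicative decay applied after each task
        self.concepts = {}    # name -> [base worth, remaining bonus]

    def add_concept(self, name, worth):
        self.concepts[name] = [worth, self.bonus]

    def priority(self, name):
        worth, bonus = self.concepts[name]
        return worth + bonus

    def task_done(self):
        # Each completed task rapidly erodes all outstanding bonuses.
        for entry in self.concepts.values():
            entry[1] *= self.decay

    def next_concept(self):
        # The next concept worked on is the one with highest current priority.
        return max(self.concepts, key=self.priority)
```

With this schedule, a freshly added concept briefly outranks an established, higher-worth one, then falls back as tasks are completed.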

Competition may be less natural when we can take advantage of parallelism. Humans have the ability to process complex activities in parallel (Blackmore, 2005, pp. 40–42); as we saw in Section 3.1, social infrastructures can distribute features of attention, such as awareness and concentration, across a population. Joint attention is one such important social phenomenon. In related computational work, Zhuang et al. (2017) describe a system for parallel attention that recurrently identifies objects in images. It makes use of both image-level attention and text-based proposals (the latter directed to image regions), allowing image contents to be identified in a dialogue format. Xu et al. (2015) also worked on image captioning, this time using a long short-term memory (LSTM) network that independently selected image regions. LSTMs are recurrent neural network models that are capable of learning long-term dependencies. Xu et al. trained their networks using models of “soft” and “hard” attention: the latter did somewhat better for the metrics considered. For a navigation task, Vemula et al. (2017) had success using “soft attention over all humans in the crowd,” i.e., not simply those people who are nearest.
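The soft/hard distinction just mentioned can be illustrated in a few lines: soft attention takes a softmax-weighted average over all items, whereas hard attention would sample a single one. The function below is a generic sketch, not the specific architecture of any of the cited systems.

```python
import math

def soft_attention(scores, values):
    # Softmax over relevance scores: every item contributes to the attended
    # result, in proportion to the (exponentiated) score it received.
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    return sum(w * v for w, v in zip(weights, values))
```

With equal scores the result is a plain average; as one score grows, the output smoothly approaches that item's value, which is what makes soft attention differentiable and trainable end-to-end.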

Definition 3. Focus Shift: A reassessment in which an object that had been given a neutral, or even negative, value becomes more interesting. This may happen, for instance, if a change of context means that a previously encountered object is now considered to be relevant.

Given the central nature of the focus shift in our model, we expand its preconditions in more detail.


Definition 3A. Ability to Focus Shift: Let E(o, c) be the evaluation performed by the system, according to a set of evaluation criteria, of a given object o in a given context c. A focus shift occurs when, for object o and context c1, E(o, c1) ≤ θ for a given threshold θ, and the system either:

(i) retrieves an existing context c2 such that E(o, c2) > θ,
(ii) generates a new context c2 such that E(o, c2) > θ, or
(iii) changes its evaluation criteria to E′ such that E′(o, c1) > θ.

(Or some combination of (i)–(iii).) A system that can perform one or more of these operations has the ability to focus shift.
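Definition 3A translates almost directly into code. In the sketch below, contexts are dictionaries mapping objects to scores, E(o, c) is a simple lookup, and the threshold θ is arbitrary; these representations are illustrative stand-ins, and clause (iii), revising the evaluation criteria themselves, is noted but not modelled.

```python
THETA = 0.5  # an arbitrary illustrative threshold

def E(obj, context):
    # Evaluation of object `obj` in `context` under the system's criteria.
    return context.get(obj, 0.0)

def focus_shift(obj, c1, stored_contexts, theta=THETA):
    if E(obj, c1) > theta:
        return None  # already above threshold: no focus shift occurs
    # (i) retrieve an existing context c2 such that E(o, c2) > theta
    for c2 in stored_contexts:
        if E(obj, c2) > theta:
            return ("retrieved", c2)
    # (ii) generate a new context c2 such that E(o, c2) > theta.
    # Clause (iii), changing the criteria E themselves, is omitted here.
    return ("generated", {obj: 1.0})
```

Read against the case study: the adhesive scored low in the original strong-adhesive context, and high once a bookmark context was retrieved.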

Foundations
Assess the data’s potential for strategic usefulness. In evolutionary computing, fitness is typically an attribute of an agent, often modelled as a scalar value. Now, instead, we might understand the agent’s objective functions to give rise to a fitness landscape that can drive transformation of the data the system encounters, or cause it to be cast aside. Simonton makes use of a somewhat related concept of fitness, distinguishing between blind and sighted selection (Simonton, 2010, p. 159): he proposes a fitness measure for selected items which is understood as a measure of their utility for the agent (which is what the agent may or may not be blind to). Definition 3A makes no assumptions about the actual utility of selected items.

Interest is related to curiosity. Berlyne distinguished between perceptual and epistemic curiosity, while positing a relationship between them: one “leads to increased perception of stimuli” and the other to “knowledge” (Berlyne, 1954, p. 180). He posited that responses would be strongest in an “intermediate state of familiarity” which triggered conflict, whereas “too much familiarity will have removed conflict by making the particular combination an expected one” (p. 189). Accordingly, such curiosity depends on prior preparations. In some reinforcement learning models, a novelty bonus “acts like a surrogate reward” and “distorts the landscape of predictions and actions, as states predictive of future novelty come to be treated as if they are rewarding” (Kakade and Dayan, 2002, p. 554). Whether or not novelty is interesting in and of itself, the system’s initial assessment motivates it to look for further information or “new connections,” as per Makri and Blandford (2012a). This effort is expected to yield a future payoff, whether in terms of additional novelty, more efficient organisation of the system’s knowledge base, or in some other way. Crucially, interest is not related exclusively to curiosity, but to a whole set of intrinsic motivations.
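The Kakade and Dayan observation can be sketched as a count-based bonus added to the extrinsic reward; the 1/visits decay schedule is our own illustrative choice, not theirs.

```python
from collections import Counter

class NoveltyBonus:
    """Reward shaping: rarely visited states earn a surrogate reward."""

    def __init__(self, scale=1.0):
        self.visits = Counter()
        self.scale = scale

    def shaped_reward(self, state, extrinsic):
        # The bonus decays as the state becomes familiar, so novel states
        # are "treated as if they are rewarding" even before any payoff.
        self.visits[state] += 1
        return extrinsic + self.scale / self.visits[state]
```

Because the bonus shrinks with each visit, the shaped reward for a given state converges back towards its extrinsic value once the novelty is exhausted.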

Context change is a possible basis for belief revision. Logan et al. (1994) use the notion of belief revision to model situations of collaborative information-seeking. Ground assumptions are shared in the context of such dialogues, and can change as conversations progress. In our model, the focus shift similarly causes the context to change, so that the ground assumptions, including ways of evaluating data, are no longer the same. Harman (1986) treated the implications of changing circumstances for bringing about a “reasoned change of view” (p. 3); he described previous work by Doyle (1980) on the system SEAN, which incorporated defeasible reasoning, as one of only a few earlier efforts in this area. More recently, Clarke (2017) argues that belief is context-sensitive, depending for example on purpose, and on the stakes involved. Thus, in a dialogue, the sincerity of a given remark is linked to the context, not just to the remark’s propositional content.

Heuristics

Interest can be linked to novelty in order to inspire learning. In the case of Velcro™, the focus shift occurred in quite a literal fashion, when de Mestral examined burrs under a microscope. This example provides another useful mnemonic: burrs’ hooks allow them to “hitchhike” into new contexts (Jenkins, 2011, §1.1). Patalano et al. (1993) describe the related mental phenomenon of predictive encodings that record “blocked goals in memory in such a way that they will be recalled by conditions favorable for their solution.” The Curious Design Agents developed by Saunders (2007) evolve artworks with respect to a sophisticated measure of interestingness. These agents cluster artworks together, and assess the novelty of new inputs by means of classification error. They then determine a new artwork’s interestingness by mapping its novelty to an inverse-U-shaped curve, inspired by the Wundt curve (cf. Berlyne (2013, pp. 17–19)). This model is useful for “modelling autonomous creative behaviour” and can “promote life-long learning in novel environments” (Saunders et al., 2010). A similar conception of interest has been applied to “generate art with increased levels of arousal potential in a constrained way without activating the aversion system,” using a variant of Generative Adversarial Networks to motivate the creation of visual artworks that exhibit “stylistic ambiguity” (Elgammal et al., 2017, p. 97). Mathematicians, such as Birkhoff (1933), have proposed many mathematical theories of aesthetics, though philosophers have just as often refuted them (Hyman, 2006, p. 4). In Jürgen Schmidhuber’s work, interestingness is positioned as the “first derivative of subjective beauty” (Schmidhuber, 2009), where beauty is understood as compressibility. Here, phenomena that maximise prediction error drive curiosity. Javaheri Javid et al. (2016) apply related measures of information gain and Kolmogorov complexity to evaluate and drive the evolution of 2D patterns generated by cellular automata.
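The inverse-U mapping described above can be sketched as the difference of two sigmoids, a common way to approximate the Wundt curve. The function name and constants below are illustrative choices of ours, not taken from the cited systems.

```python
import math

def wundt_interest(novelty, reward_mid=0.3, punish_mid=0.7, slope=20.0):
    """Map a novelty score in [0, 1] to interestingness via an
    inverse-U ('Wundt') curve: a reward sigmoid that rises at moderate
    novelty, minus a punishment sigmoid that rises at high novelty."""
    reward = 1.0 / (1.0 + math.exp(-slope * (novelty - reward_mid)))
    punish = 1.0 / (1.0 + math.exp(-slope * (novelty - punish_mid)))
    return reward - punish

# Moderate novelty is most interesting; very familiar or very alien
# inputs score near zero.
scores = {n: round(wundt_interest(n), 3) for n in (0.0, 0.5, 1.0)}
```

The punishment sigmoid is what distinguishes this curve from a plain novelty bonus: past a threshold, additional novelty reduces interest rather than increasing it.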

Interest can be linked to aesthetics in order to capture varied notions of fitness. Dhar et al. (2011) describe an “aesthetics classifier” that can determine the potential interestingness of images in terms of high-level content and compositional attributes such as “people present”, “opposing colors”, and “follows rule of thirds.” Wang et al. (2018) applied machine learning to a corpus of digital photographs with ratings and reviews, and generated new textual descriptions and rating predictions based on the crowdsourced descriptors. DARCI (short for Digital ARtist Communicating Intention) is a generative program which similarly links crowdsourced image descriptions to extracted features (Norton et al., 2013). It evolves input images using a fitness function that optimises for a combination of appreciation, defined in terms of describability, and interest, which is, as above, an inverse-U-shaped measure of similarity to the input image.

Beauty is in the eye of the beholder. Corneli and Winterstein (2016) follow Waugh (1980) in describing complexity and coherence as two key aspects of poetic beauty. With regard to their implemented system that generates linked verse: “A reader may identify some fortuitous resonances [in the system-generated poems] but the system itself does not yet recognise these features.” Veale (2015) discusses a related placebo effect among readers of computer-generated tweets, and the broader role that “an active and receptive mind” plays in our interactions with the world.

Definition 4. Explanation: This is a model that predicts functional, operational, or statistical behaviours that relate the previously-unexpected event to its newly-retrieved context.

Foundations

A new model yields an improved ability to make a prediction. Our assumptions about chance, described earlier, insist that the perceiving agent has at best a limited ability to predict the events it perceives. The explanation stage now enables the agent to make predictions (Sowa, 2000, p. 389). Explanatory success depends on the system’s skills, and both prior and new knowledge. However, these explanations are again limited. Swirski (2000, p. 101) points out that to be effective, explanation needs “a stopping rule”: for example, “the standard causal pattern in the social sciences” requires only “a description of the actions and the motivations behind them that were sufficient to produce a change in the circumstances.”

There are different kinds of viable explanations. In the 3M example, the explanation focused on “so what,” i.e., on showing that the new adhesive could be used to make re-stickable bookmarks, and ultimately, a saleable product. However, explanations need not focus on outcomes. An explanation can be related purely to “how.” For instance, van Andel describes the example of Simcha Blass, who

“. . . happened to pass a row of trees. He noticed that one of the trees was much taller than the others. On investigation he found that, although the soil around the tree was dry, water was continually dripping from a nearby leaking connection in a water pipe.” (van Andel, 1994, p. 640)

This is a fine “how” explanation: the practical usefulness of Blass’s model arose only later. According to Aristotle, the fundamental question that must be addressed is “why?” (Falcon, 2015): answers are to be demonstrated in terms of “principles and causes” (Aristotle, 1998, Book Gamma, p. 81). But crucially, even an incorrect explanation could turn out to be useful later on: “reliable” explanations are not always correct, or may only be correct within circumscribed regimes.

The system creates an explanation of the event for itself. At this stage the system is not, in general, aiming to explain its behaviour to someone else, or otherwise make its behaviour transparent (in the sense of Explainable AI; Lane et al., 2005). Nevertheless, we may think of explanation as an expository device or “framing” (Pease and Colton, 2011) that relies on the system’s ability to retrieve a suitable context, and to establish relationships between elements of this wider context. Explanatory prowess is not simply a matter of paying attention, but depends in particular on having learned “what to pay attention to” (Levin, 1975, p. 4). Notice that requirements arising in this stage can push back on earlier stages: “[t]he methods and assumptions on which a systematic investigation is built selectively focus the researcher’s attention” (Barber and Fox, 1958, p. 131).

Heuristics

Experiments can have limited scope and still be useful. For example, de la Maza (1994) describes two implementations of a “Generate, Test, and Explain” architecture. The programs involved used decision trees to connect secondary contextual information (e.g., macroeconomic indicators) to more elementary data-driven predictions (e.g., of stock market behaviour). The aim of this work was solely to “connect the ‘correlations’ uncovered by the generate and test module to the causal model provided by the domain theory” (ibid., p. 50). A strategic use for these connections could in principle be found later. The system KEKADA by Kulkarni and Simon (1988) is cited by de la Maza as an example of a system that can directly refine the domain theory.

Given a sufficiently rich background, only a small amount of new data is needed. The term explanation-based learning (Ellman, 1989; Cohen, 1992) denotes a process in which an explanation of one event leads to a rule that can be applied to similar events in the future. This typically requires significant background knowledge. Imitation learning, learning from demonstrations, learning by example, and one-shot learning are related concepts (see, for example, Cypher and Halbert (1993)). Case-based reasoning formulates background knowledge as an extensive catalogue of somewhat-similar “cases”: here explanation may play a role in determining how two cases match (Aamodt and Plaza, 1994, p. 11).
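The rule-forming step can be sketched minimally: given an explanation of a single event and the constants specific to that event, variabilising those constants yields a rule applicable to structurally similar future events. Representing facts as tuples, and the example facts themselves (loosely echoing the Blass episode above), are our own illustrative simplifications.

```python
def generalise(explanation, constants):
    """Explanation-based generalisation sketch: replace the
    event-specific constants in an explained example with variables,
    producing a reusable rule."""
    mapping = {c: f"?x{i}" for i, c in enumerate(constants)}
    return [tuple(mapping.get(term, term) for term in fact)
            for fact in explanation]

# One explained event: a tree grows tall because a nearby pipe drips.
rule = generalise(
    [("taller", "tree1"), ("near", "tree1", "pipe1"), ("drips", "pipe1")],
    ["tree1", "pipe1"],
)
# rule: [('taller', '?x0'), ('near', '?x0', '?x1'), ('drips', '?x1')]
```

The heavy lifting in real explanation-based learning lies in constructing the explanation itself from background knowledge; the generalisation step shown here is comparatively mechanical.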

Learning is less efficient, but more widely applicable, than knowing. The system Hacker, created by Sussman (1973), was able to “diagnose five classes of mistake and adapt differentially to them, generalizing its adaptive insights so that they can be applied to many problems of the same structural form” (Boden, 1984). However,

“Hacker is not as good at solving blocks world problems as would be a much simpler program that just goes about it directly with some good heuristics and a minimum of exploration. Hacker’s justification is as an epistemological model, not as a real problem solver” (Levin, 1975, p. 17).

Sussman-style “critics”, which find, fix, and in future avoid bugs, have been widely used in the planning literature (Sacerdoti, 1975; Young et al., 1994; Erol et al., 1995; Singh, 2005; Kaelbling and Lozano-Pérez, 2011). For example, critics have helped create video game characters that make situationally-appropriate plans in complex, changing environments (Hawes, 2001).

Communication between agents can transfer causal information. Moore (1995) and Cawsey (1992) describe systems that provide explanations to the user in interactive dialogues. Subsequent research compared “mixed-initiative” and “non-mixed-initiative” dialogues using computer simulations (Ishizaki et al., 1999). There are other ways to share and integrate causal information when it has formal representations (Geiger et al., 2016). As is well known from research on social dilemmas, thin communication protocols constrain agents’ ability to cooperate; however, sufficiently complex agents can learn to cooperate even with limited communication bandwidth (Leibo et al., 2017).

Definition 5. Bridge: The path and set of mechanisms used to transform the triggering event and output from subsequent processing steps into a problem to solve. Mechanisms often include reasoning techniques, such as abductive inference or analogical reasoning, and may rely on new social arrangements or physical prototypes. The bridging process may have many steps, and may feature chance elements.

Foundations

It is sometimes necessary or desirable to go beyond explanation. The bridging process can be outlined by comparing a positive example with a corresponding counterexample. Nearly 60 years before Fleming, Eugene Semmer both discovered and also cursorily explained the curious effects of Penicillium notatum, but he did not find a bridge to the vital problem his discovery could have solved (Cropley and Cropley, 2013, p. 75). His “methods and assumptions” (Barber and Fox, 1958, p. 131) constrained his thinking.

Two cases: pseudoserendipity versus true serendipity. The “eureka” or “aha” moment has been modelled computationally by Thagard and Stewart (2011) using a form of concept blending. These authors assert that “human creativity requires the combination of previously unconnected mental representations constituted by patterns of neural activity” (p. 1). The notion of a bridge is suggested, but such a connection may be a sought finding. The Bergsonian distinction treated in Section 2 emphasises making a connection not simply between representations, but to a novel problem: “[originally] stating the problem is not simply uncovering, it is inventing” (Bergson, 1946, p. 58). Due to its novelty, an original problem cannot be fully known in advance, though the investigator may invent such a problem whilst in quest of something else. Figueiredo and Campos (2001, p. 3) made the distinction between serendipity and pseudoserendipity particularly crisp by introducing the “serendipity equations”:

pseudoserendipity:

    P1 ⊂ (KP1),  M ⊂ (KM)   ⇒   S1 ⊂ (KP1, KM, KN)

serendipity:

    P1 ⊂ (KP1),  M ⊂ (KM)   ⇒   P2 ⊂ (KP2),  S2 ⊂ (KP2, KM, KN)

In the pseudoserendipitous case, a given problem P1 in the knowledge domain KP1 becomes solvable (whence, S1) by the addition of further knowledge, supplied by M. In the serendipitous case, the initial set-up is similar, but the result is not a solution to the original problem: rather, it is a new problem, P2, together with its solution.
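To fix ideas, the two equations admit a toy operational reading. Representing problems and knowledge domains as Python sets of required and available facts is our own simplification, not part of Figueiredo and Campos’s formalism.

```python
def classify(p1, k_p1, k_m, candidate_problems):
    """Toy reading of the 'serendipity equations': a problem is a set
    of required facts, and counts as solved when the combined knowledge
    covers it. New knowledge M either solves the original problem P1
    (pseudoserendipity) or makes a different problem P2 solvable
    (serendipity)."""
    knowledge = k_p1 | k_m
    if p1 <= knowledge:
        return ("pseudoserendipity", p1)   # S1: the original problem is solved
    for p2 in candidate_problems:
        if p2 != p1 and p2 <= knowledge:
            return ("serendipity", p2)     # S2: a newly induced problem is solved
    return ("no outcome", None)
```

For instance, if P1 requires {a, b}, the agent holds {a}, and M supplies {b}, the outcome is pseudoserendipitous; if M instead supplies knowledge irrelevant to P1 but sufficient for some other problem, the outcome is serendipitous.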

The bridge is transformational. Although the notation above makes the distinction between the two cases clear, it somewhat disguises the principle that is common to both. Even in pseudoserendipity, there is more going on than just new information coming online which happens to make a problem solvable. Otherwise any online problem-solving system could be seen as pseudoserendipitous, which is inconsistent with that term’s usage. A model aeroplane, for example, is put together piece by piece, and even the order in which the pieces are put into place is more or less predictable. We would not say that the last piece added, or any of the other pieces added along the way, was the result of pseudoserendipitous creativity. By contrast, there would have been ample opportunity for pseudoserendipity to arise in the historical development of powered flight: Spenser (2008, p. 292) contends that “none of [the progress in aviation] would have happened if human interaction hadn’t evolved just as dramatically,” which suggests that the process could not have been planned in advance. To consider another example, assembling a jigsaw puzzle is not an entirely predictable process: it involves chance at the outset, but nevertheless, the overall structure of the solution process is well understood. Even if a previously missing piece were suddenly discovered, making the puzzle solvable, this would not be a bridge, because the problem itself is unchanged. In short, both pseudoserendipitous and serendipitous creativity involve “the transformation of some (one or more) dimension of the space so that new structures can be generated which could not have arisen before” (Boden, 1998, p. 348).

A good problem can be identified by working at a meta-level. The bridge might be thought of as a meta-problem, in other words, a fitness function or “aesthetic” (Pease and Colton, 2011), through which an entire class of problems may be surveyed, and the most suitable one selected (in pseudoserendipity) or induced (in true serendipity).

Heuristics

Similarity, analogy, and metaphor can be used to retrieve known problems. Instances of pseudoserendipity concern problems that are known to the system. These may be retrieved in a non-transformational way, e.g., via a search process that uses analogies between the recently-generated explanation and a catalogue of existing problems. Sowa and Majumdar (2003) describe three kinds of analogies that apply to graphical knowledge structures: matching types with a common supertype, matching isomorphic subgraphs, and identifying transformations that can change the subgraphs of one graph into another. They give as an example an analogy between a cat and a car, found using WordNet data. In one real-world example, designers at Speedo developed a new material to make swimmers faster by incorporating a tiny tooth-like network similar to the denticles found on the surface of a shark’s skin (Ingledew, 2016). The concept of “metaphor” emphasises the role of a representational system in expressing an analogy. Xiao et al. (2016) describe one way in which the relevant background that is needed to interpret (or create) metaphors might be acquired. Structure-based retrieval of source domains may give a significant boost to the creativity of the analogies that can be constructed (Donoghue and Crean, 2002).
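The first of the three analogy kinds, matching types with a common supertype, can be sketched against a toy taxonomy. The taxonomy entries below are illustrative stand-ins of ours, not WordNet data.

```python
# Toy hypernym taxonomy (child -> parent); entries are illustrative.
HYPERNYMS = {
    "cat": "mammal", "mammal": "animal", "animal": "entity",
    "car": "vehicle", "vehicle": "artifact", "artifact": "entity",
}

def ancestors(concept):
    """Chain of hypernyms from a concept up to the taxonomy root."""
    chain = [concept]
    while chain[-1] in HYPERNYMS:
        chain.append(HYPERNYMS[chain[-1]])
    return chain

def common_supertype(a, b):
    """Find the lowest supertype shared by two concepts, walking up
    b's hypernym chain until it meets a's."""
    seen = set(ancestors(a))
    for concept in ancestors(b):
        if concept in seen:
            return concept
    return None

# common_supertype("cat", "car") -> "entity"
```

With WordNet-scale data, the shared supertype of a cat and a car is similarly abstract, which is precisely what makes the resulting analogy surprising rather than routine.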

Concept blending may, but does not necessarily, help identify new problems. The bridge might be established by concept blending, otherwise known as conceptual integration (Fauconnier and Turner, 2008, 1998). This approach from cognitive science has recently received increased attention in computer science (Confalonieri et al., 2018; Besold et al., 2015; Eppe et al., 2018). The method forms new combinations of existing concepts; however, Fauconnier and Turner (1998) advise that “the most suitable analog for conceptual integration is not chemical composition but biological evolution.” Even so, blending can also be contrasted with simple models of genetic crossover, where the only commonalities that are guaranteed to be preserved are those at the level of individual matching alleles. In blending, commonalities are potentially more abstract. Finding analogies can be seen as the first step in the process of concept blending: for example, given the analogy identified by Sowa and Majumdar, multiple different cat-car hybrids could be devised, some suitable for nightmares, some for children’s toys. Like biological evolution, the blending process can involve the outside world in the specification and evaluation of blends, and it can do this in ways that combinatorial search does not. Eppe et al. (2018) have implemented several standard-use heuristics that can be used to give basic assessments to various blends, but in general blends are evaluated contextually. Thagard and Stewart evaluate blends using an abstract simulated model of “cognitive appraisal and physiological perception” which stands for an overall emotional reaction (Thagard and Stewart, 2011, p. 11). The emotions themselves represent circumstances which might be in some sense novel; however, they might just as well represent a known problem. Thagard and Stewart focus on “problem solving” rather than problem specification: for them, the “aha moment” occurs when there is a good match between the newly-generated combination and the background emotions. Returning to the 3M example, sticky notes appeared as a particularly satisfactory blend between re-stickable bookmarks and the known problem of affixing notes to documents. The existence of the bookmark prototype allowed a new problem to be specified: how to attach a note in a way that would not damage the document, and would not require a separate fastener. This problem likely would never have been considered if the only solutions to hand were the previously existing conventional technologies of staples, paperclips, and standard-formula glue. It was a eureka moment for Arthur Fry because he had in mind the problem of coming up with a new product: but the product itself appeared hand-in-hand with a new problem. The invention of Velcro™ can similarly be reconstructed as a blend, in which the biological problem of seed propagation, and its solution of tiny hooks, is blended with the domain of fashion to bridge to a new problem: could clothes be conveniently fastened using a hook-and-loop mechanism? We note that de Mestral had to expend considerable further effort before he was able to answer this question in the affirmative. This example serves to illustrate that a full solution does not always emerge at the same time as the problem.

Working across domains can give rise to intriguing ideas. Text mining has been used to generate hypotheses by first identifying bridging terms between different bodies of literature (Swanson and Smalheiser, 1997; Weeber et al., 2001; Juršič et al., 2012b,a). These methods may be employed in closed discovery models, where the “two domains of interest . . . are identified by the expert prior to starting the knowledge discovery process”, or in open discovery models, where the process works “from a given starting domain towards a yet unknown second domain” (Juršič et al., 2012a). These correspond, more or less, to the cases of pseudoserendipity and serendipity.
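A closed-discovery search for bridging terms can be sketched as follows, with toy data loosely echoing Swanson's classic fish-oil/Raynaud's example. Representing documents as sets of terms, and ranking shared terms by co-occurrence counts, are our simplifications of the cited literature-based discovery methods.

```python
from collections import Counter

def bridging_terms(docs_a, docs_c, seeds=()):
    """Closed-discovery sketch: candidate B-terms appear in documents
    from both literatures (excluding the seed terms that define the two
    domains), ranked by how widely each side mentions them."""
    count_a = Counter(t for doc in docs_a for t in doc)
    count_c = Counter(t for doc in docs_c for t in doc)
    shared = (set(count_a) & set(count_c)) - set(seeds)
    return sorted(shared, key=lambda t: count_a[t] * count_c[t], reverse=True)

# Toy corpora: each document is a set of index terms.
docs_a = [{"fish-oil", "blood-viscosity"}, {"fish-oil", "platelet-aggregation"}]
docs_c = [{"raynaud", "blood-viscosity"}, {"raynaud", "platelet-aggregation"}]
terms = bridging_terms(docs_a, docs_c, seeds=("fish-oil", "raynaud"))
# terms contains the two bridging terms shared by both literatures
```

In the open-discovery setting the second domain is not given, so the same intersection idea must instead be applied speculatively across many candidate literatures.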

Experiments can give surprising insights. Experiments have been designed using both classic expert system methods (Lorenzen et al., 1992) as well as modern reinforcement learning techniques (Melnikov et al., 2018). However, it is not clear whether any software systems are yet looking for bridges between experiments, which would allow them to make use of the fact that interesting things can be learned when a method is applied “in just a slightly different way” (Austin, 1978, p. 28), and specialisations of this observation, such as “the unexpected yield from a control experiment may be more fruitful than that from the main experiment” (p. 32).

Definition 6. Valuation: The process results in a product, artefact, process, theory, use for a material substance, support of a known hypothesis, a solution to a known problem, a new hypothesis or problem, or some other outcome. This result is evaluated positively by the system or some external party.

Foundations

Affection is based on reflection. Campbell (2005) highlights the idea of “rational exploitation” and the “discovery of something useful or beneficial” as key aspects of serendipity. But some processing may be required to get to that point. Here we may refer to the Bergsonian distinction between “perceptions” and “affections” (Deleuze, 1988, p. 23). Affection is the “feeling in the instant”, which is “‘alloyed’ to other subjectivities [. . . ] as we understand what we feel and act upon it” (Sutton and Martin-Jones, 2008, p. 141). In particular, Bergson (1991, p. 17) considers affections to be directly linked to the self-knowledge a being has of its body. A system’s evaluation of the new state of affairs brought about by the processing stages outlined in Definitions 1–5 might be described as “affective” when a new system configuration is brought about that is then assessed in some reflexive way. Raw somesthetic sense, e.g., an architecture inspired by the instrumentation of robotic joints with hardwired position sensors, might be alloyed with “reflective thinking” (Singh and Minsky, 2005) that considers global aspects of the configuration and course of action that led to this point.


Heuristics

Model a sense of taste. The system’s taste is explicitly modelled in the case of the artworks evolved by the Curious Design Agents described by Saunders (2007).

Allow the system to use the world. As an alternative route to working with affect, a system might outsource emotional processing to a human user, “recognise” the user’s affective expression (Picard, 1995, p. 15), and use that as the basis of an evaluation.

Allow the system to shape its own goals. Whether or not the user is given a role in the evaluation process, systems may be designed to shape their own goals (Kaplan and Oudeyer, 2007; Singh et al., 2010).

    3.3 Summary

We have proposed a phased model of serendipity consisting of several cognitive components. We began the section with a schematic diagram for a computational system that integrates all of these components (Figure 1). We then defined each component with reference to theoretical literature and existing software implementations, and, where they could add further clarity, illustrative historical examples. Table 2 summarises the model that results from this analysis, highlighting examples of earlier work that support our definitions and that show the feasibility of the overall proposal.

Table 2 Phases of the model, with supporting theory and example implementations.

Perception: interface to world.
• Russell and Norvig - different kinds of environments
• Hume/Peirce - chance is negative/fundamental
• Hoffman et al. - adaptivity of not seeing reality as it is
• Friston - we sense what we can predict
Implementations (HCI, automated feature finding, emergence of grid cells):
• Turk • Jacob and Magerko • Stopher and Smith • Mordvintsev et al. • Banino et al.

Attention: directed processing power.
• Clark - prediction error
• Singh and Minsky - layered architecture
• Bateson - changing behaviour
• Rowley - meaning making
Implementations (visual attention, competition for resources, temporal bonus, soft attention):
• Sun and Fisher • Tsotsos

Focus Shift: evaluation of data via existing objective functions.
• Wundt curve
• Berlyne - epistemic and perceptual curiosity
• Logan et al. - belief revision in information seeking
• Patalano et al. - predictive encodings

Explanation: a predictive model.
• Aristotle - principles and causes
• Pease and Colton - framing
• Bateson - change of pattern

Bridge: identifying or positing a problem (via a new objective function).
• Bergson - creativity of problem statement
• Thagard and Stewart - “aha moment”
• Boden - transform the space
• Pease and Colton - new aesthetic

Valuation: evaluation of solution via existing objective function.
• Bergson - affection
• Campbell - rational exploitation