Joost Broekens, Doug DeGroot, LIACS, University of Leiden, The Netherlands

Emergent Representations and Reasoning in Adaptive Agents

Joost Broekens, Doug DeGroot

Leiden University, LIACS, Leiden.

{broekens, degroot}@liacs.nl


Overview

• Introduction

• Interactivism

• Hypothesis

• Computational Model based on Interactivist Concepts

• Experiments

• Results

• Conclusion

• Questions?


Introduction

• Adaptive Agents:

– Flexible models of the world (continuous online learning).

– Efficient memory retrieval.

– Efficient selection of a relevant reasoning context (how to select relevant information from a large collection of beliefs).

– How to represent knowledge?

– What is reasoning?


Interactivism (1/3)

• Interactivism (Bickhard) proposes:

– Coupling of (properties of) situations and the actions possible in that situation: Interaction Potential (IP).

– The IP concept as the primitive for representations.

– Potential interactions are prepared by prior interactions: an IP is conditional on prior interactions.

• Example: brush

– IPs are organized in a hierarchical, web-like fashion.

– Parts of this web remain invariant under many other interactions.

• Example: brush

– IPs stabilize and destabilize based on correct prediction/preparation.


Interactivism (2/3)

[Figure: timeline of interactions for the brush example. Labels: shower, dry, work, got home, brush/desk, brush, put away brush. Horizontal axis: Time.]


Interactivism (3/3)

• Interactivism and Reasoning.

– Model-learning: (de)stabilization of IPs through continuous interaction with the world constructs representations of the world.

• Representations have implicit content (certain properties of a situation a allow for interactions x and y, making a different from a situation b that lacks these properties).

• Truth value (I tried an interaction x, but y happened, so it was not x).

– Task-learning: preference between at least two interactions based on bias.

• Reinforcement signal.

• So: an IP has (at least) two properties: stability and expected return.


Hypothesis

• Reasoning and decision making are emergent properties of interactivist representational systems.

– Create a computational model strictly based on interactivist assumptions.

– Create a task that needs a decision by the agent.

• Minimal reasoning:

– “any observable behavior that reflects a beneficial decision between at least two possibilities that is neither explicable due to chance, nor without representations”.


Computational Model (1/3)

• Basis: hierarchical directed graph.

• The agent’s actions and stimuli from the world are assumed to be the same kind of information.

• Nodes represent interactions.

– Nodes can be active (used) or prepared (hypothesized).

– Primary nodes: stimulus (action or stimulus from the world).

– Secondary nodes: interaction potentials.

– Hierarchy of secondary nodes: IP hierarchy.
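
The node types above could be held in a small data structure. The following is a minimal sketch and not the authors' implementation: the names `Node`, `NodeState`, `parts`, `is_primary`, and `depth`, and the extra `IDLE` state, are assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple


class NodeState(Enum):
    """The two states named on the slide, plus an assumed resting state."""
    IDLE = "idle"          # assumption: neither used nor hypothesized
    ACTIVE = "active"      # used in the current interaction
    PREPARED = "prepared"  # hypothesized as a possible next interaction


@dataclass
class Node:
    label: str
    # None for a primary node; (earlier, later) for a secondary node (an IP).
    parts: Optional[Tuple["Node", "Node"]] = None
    state: NodeState = NodeState.IDLE

    @property
    def is_primary(self) -> bool:
        return self.parts is None

    @property
    def depth(self) -> int:
        """Level in the IP hierarchy; primary nodes are at level 0."""
        if self.parts is None:
            return 0
        return 1 + max(part.depth for part in self.parts)


down = Node("D")                             # primary node: an action
loc1 = Node("1")                             # primary node: a stimulus from the world
d_then_1 = Node("D-1", (down, loc1))         # secondary node: an interaction potential
deeper = Node("(D-1)-D", (d_then_1, down))   # an IP built on top of another IP
```

Actions and world stimuli share one `Node` type here, which mirrors the slide's assumption that both are the same kind of information.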


Computational Model (2/3)

• Example (1, 2 = location in a maze, D = down):

– Model is empty at startup.

– a: agent goes down, and builds a node for “down”.

– b: agent arrives at location 1, and builds an interaction.

– c: agent goes down, and builds interactions.

– d: agent arrives at location 2, and builds interactions.

[Figure: nodes in the model after each step.
a: D.
b: D, 1, D-1.
c: D, 1, 1-D, D-1, (D-1)-D.
d: D, 1, 2, 1-D, D-1, (D-1)-D, ((D-1)-D)-2, (1-D)-2, D-2.]
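
One reading of the example above is that each new stimulus is chained onto every node that the previous stimulus activated. The sketch below follows that reading and reproduces the node names shown for steps a to d; the rule itself and the names `label` and `build_model` are assumptions, not taken from the slides.

```python
from typing import List, Set


def label(parent: str, stimulus: str) -> str:
    """Name an interaction node; parenthesize a parent that is itself an interaction."""
    left = f"({parent})" if "-" in parent else parent
    return f"{left}-{stimulus}"


def build_model(stimuli: List[str]) -> List[Set[str]]:
    """Return the set of all nodes in the model after each stimulus."""
    nodes: Set[str] = set()          # the model is empty at startup
    active: List[str] = []           # nodes activated by the previous stimulus
    snapshots: List[Set[str]] = []
    for s in stimuli:
        new_active = [s]             # the primary node for this stimulus
        for parent in active:        # chain every previously active node to s
            new_active.append(label(parent, s))
        nodes.update(new_active)
        active = new_active
        snapshots.append(set(nodes))
    return snapshots


# Steps a to d: go down, arrive at 1, go down, arrive at 2.
steps = build_model(["D", "1", "D", "2"])
for name, snapshot in zip("abcd", steps):
    print(name, sorted(snapshot))
```

Under this rule the number of active nodes grows by one with every stimulus, which suggests why some mechanism for discarding rarely used nodes is needed.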


Computational Model (3/3)

• Model learning and task-learning: exposure (continuous interaction) and reinforcement.

• Exposure (local):

– Build a conditional probabilistic model of the environment, but only adapt locally: count activations of IPs.

– If usage of an IP is lower than an arbitrary threshold, throw away the node.

• Reinforcement (local):

– Update active IPs with the current reinforcement signal.

– Propagate reinforcement through the IP hierarchy based on local probabilities of the environment; only use prepared IPs.

• Biased selection:

– Propose an action based on WTA (winner-take-all) selection of proposed interactions.
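
The three mechanisms above can be sketched together in a few lines. This is an illustration under stated assumptions, not the model from the talk: the class `IP`, the running-average update with learning rate `rate`, the count-based probabilities, and the function `select_action` are all hypothetical choices, since the slides do not give the update formulas.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class IP:
    action: str                      # action this interaction would propose
    count: int = 0                   # exposure: number of activations
    value: float = 0.0               # running estimate of direct reinforcement
    children: Dict[str, "IP"] = field(default_factory=dict)

    def activate(self, reinforcement: float, rate: float = 0.5) -> None:
        """Exposure and reinforcement, both local to this node (assumed rule)."""
        self.count += 1
        self.value += rate * (reinforcement - self.value)

    def prune(self, threshold: int) -> None:
        """Throw away child nodes that were used fewer than `threshold` times."""
        self.children = {key: child for key, child in self.children.items()
                         if child.count >= threshold}
        for child in self.children.values():
            child.prune(threshold)

    def expected_return(self) -> float:
        """Own value plus children's returns, weighted by local probabilities."""
        total = sum(child.count for child in self.children.values())
        if total == 0:
            return self.value
        return self.value + sum(
            (child.count / total) * child.expected_return()
            for child in self.children.values())


def select_action(prepared: List[IP]) -> str:
    """Winner-take-all: the prepared IP with the highest expected return wins."""
    return max(prepared, key=lambda ip: ip.expected_return()).action


# Two prepared IPs at a crossing: going left led to lava, going right to food.
left, right = IP("left"), IP("right")
left.children["lava"] = IP("none")
right.children["food"] = IP("none")
for _ in range(3):
    left.activate(0.0)
    left.children["lava"].activate(-1.0)
    right.activate(0.0)
    right.children["food"].activate(1.0)

print(select_action([left, right]))  # prints "right"
```

Every update touches only the node itself or its direct children, which is one way to read "local" on the slide.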


Experiments (1/3)

• Model learning: does the agent learn an adaptive model of the environment?

– Test for reuse of old information in a new situation (a, b, c, d, e).

– Test for quick adaptation to a new maze (a, b, e).

• Maze setup:

[Figure: mazes a, b, c, d, and e. Legend: black = agent, red = lava (Rf = -1), yellow = food (Rf = 1).]


Experiments (2/3)

• Selection task (simple reasoning): is the agent able to make a beneficial, informed decision?

– Choose between two options; the choice can be made only if there is knowledge (a representation) about the other option (informed choice) (d, b, f).

– Test for convergence in a randomly changing situation (g).

• Maze setup:

[Figure: mazes d, b, f, and g. Legend: black = agent, red = lava (Rf = -1), yellow = food (Rf = 1).]


Experiments (3/3)

• Ran experiments for different maze setups:

– 30 runs per setup.

– In every run the agent has 100 trials to find the food.

– Max 1000 steps per trial.

• Plotted average learning curves of the trials over the 30 runs.
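
The protocol above amounts to averaging, per trial index, the number of steps over all runs. A minimal sketch follows; the function `average_learning_curve`, the `run_trial` method, and the `StubAgent` class are hypothetical stand-ins, and starting each run with a fresh agent is an assumption.

```python
import random
from statistics import mean
from typing import Callable, List

RUNS, TRIALS, MAX_STEPS = 30, 100, 1000  # values from the slide


class StubAgent:
    """Stand-in whose step count shrinks with experience; not the talk's model."""

    def __init__(self) -> None:
        self.trials_done = 0

    def run_trial(self, max_steps: int) -> int:
        """Return the number of steps used in one trial, capped at max_steps."""
        self.trials_done += 1
        noise = random.randint(0, 5)
        return min(max_steps, 100 // self.trials_done + noise)


def average_learning_curve(
    new_agent: Callable[[], StubAgent],
    runs: int = RUNS,
    trials: int = TRIALS,
    max_steps: int = MAX_STEPS,
) -> List[float]:
    """Mean steps per trial index, averaged over all runs."""
    curves: List[List[int]] = []
    for _ in range(runs):
        agent = new_agent()  # assumption: every run starts with a fresh agent
        curves.append([agent.run_trial(max_steps) for _ in range(trials)])
    return [mean(run[i] for run in curves) for i in range(trials)]


random.seed(0)
curve = average_learning_curve(StubAgent)
print(len(curve))  # prints 100
```

Plotting `curve` against the trial index would give one averaged learning curve per maze setup.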


Results (1/3)

• Agent learns adaptive model of the environment and reuses information:

[Figure: three plots of average learning curves. Horizontal axis ticks run from 1 to 99; vertical axis ticks run from 0 to 140.]


Results (2/3)

• Agent learns to make a beneficial decision at the crossing.

[Figure: two plots of average learning curves. Horizontal axis ticks run from 1 to 97 in the first plot and from 1 to 193 in the second; vertical axis ticks run from 0 to 140.]


Results (3/3)

• A representation of a potential food location is learned:

– The agent is able to try one location and, if the food is not there, try a second one.

– This means the agent has a stable representation of “food is not here”.

– Representation: content (food), truth value (food not here).

• The ability to make an informed choice indeed emerges from an interactivism-based model:

– The agent learns what a crossing is and how to use it:

• The concept of a crossing is not introduced in the model.

• The agent chooses a different action the second time it arrives at the crossing only if food has not been found earlier (informed choice).


Conclusion

• Interactivist-based models are useful for the computational investigation of knowledge representation and reasoning in agents.

• Representations and reasoning can indeed emerge from a computational model based on interactivist assumptions when it is used in an agent that continuously interacts with the environment.

• Future work:

– Literature search into machine learning mechanisms.

– “Imagination”.

– Neuronal implementation.


Questions?