From bounded rationality to learning

Bernard WALLISER (Paris School of Economics)

Rationality, Heuristics and Motivation in Decision Making, Pisa, November 12-14, 2010

Introduction (1)

¤ Simon’s problem
  - decision considered as a reasoning process
  - + limited capacities of information gathering and treatment
  → bounded rationality procedures (satisficing, ...)
  But how precisely are these procedures related to the assumptions?

¤ Meta-optimization paradox
  - if the decision-maker optimizes
    - the gathering of information (Winter)
    - the choice of the best choice procedure (Mongin-Walliser)
  - he needs
    - to have prior information on the available information
    - to take the computing costs into account
  → meta-optimization is engaged in an infinite regress, which happens to be a vicious one (Mongin-Walliser)

Introduction (2)

¤ Bounded rationality was initially defined in a static way:
  - fixed environment
  - fixed beliefs and preferences
  - fixed choice rule (combining beliefs and preferences)

¤ Learning processes later introduced partial dynamics:
  - non-stationary environment (game)
  - revised beliefs when new information comes in
  - endogenous preferences when more experience is acquired
  - adaptive choice rule

→ comparison by distinguishing information gathering, information treatment and choice

→ treatment in various contexts (epistemic logics, decision theory)

Statics: limited information on context

¤ uncertainty on context
  - plainly probabilistic
  - hierarchical but always probabilistic: ambiguity
  - non-probabilistic: qualitative probabilities, belief functions
  Ex: Choquet utility maximization (Gilboa-Schmeidler)

¤ unawareness (not knowing, and not knowing that one does not know)
  - treated in epistemic logics
  Ex: precautionary principle

¤ limited crossed beliefs
  - p-accuracy reasoning
  - k-level reasoning
  - crossed awareness (Meier et al.)
  Ex: cognitive hierarchy model (Camerer)
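The k-level reasoning item can be illustrated with a minimal sketch. The setting (a "guess p times the average" game), the level-0 anchor of 50 and the name `level_k_guess` are illustrative assumptions, not taken from the talk: a level-k player best-responds to a population assumed to reason at level k-1.

```python
def level_k_guess(k: int, p: float = 2 / 3, level0_guess: float = 50.0) -> float:
    """Guess of a level-k player in a 'guess p times the average' game.

    Assumption: level 0 plays a fixed anchor; level k believes everyone
    else is level k-1 and therefore guesses p times the level-(k-1) guess.
    """
    guess = level0_guess
    for _ in range(k):
        guess = p * guess  # one extra step of "I think that you think..."
    return guess


# Depth of reasoning is bounded: guesses for levels 0 to 3.
guesses = [level_k_guess(k) for k in range(4)]
```

Unbounded crossed beliefs would correspond to letting k grow without limit, which drives the guess towards 0 in this hypothetical game.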

Statics: limited information on preferences

¤ multidimensional preferences
  - preferences are decomposed and simplified
  Ex: satisficing (Simon), elimination by aspects (Tversky)

¤ random preferences
  - preferences correspond to alternative ‘moods’
  Ex: discrete choice model, quantal model

¤ context-dependent preferences
  - situated preferences
    Ex: context (history)-dependent aspiration levels for satisficing
  - reference points (status quo, norm)
    Ex: reference point for gains vs losses in expected utility (EU)
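A minimal sketch of static satisficing over multidimensional preferences follows. The data layout (one score per dimension) and the name `satisfice` are assumptions made for illustration: options are examined in sequence and the first one reaching the aspiration level on every dimension is kept, with no attempt at optimization.

```python
from typing import Optional, Sequence, Tuple

Option = Tuple[str, Sequence[float]]  # (label, score on each dimension)


def satisfice(options: Sequence[Option], aspiration: Sequence[float]) -> Optional[str]:
    """Return the first option whose scores all reach the aspiration levels."""
    for label, scores in options:
        if all(s >= a for s, a in zip(scores, aspiration)):
            return label  # stop searching as soon as an option is good enough
    return None  # no option is satisfactory at these aspiration levels


options = [("A", (3, 9)), ("B", (6, 6)), ("C", (9, 9))]
choice = satisfice(options, aspiration=(5, 5))  # "B", although "C" dominates it
```

The result depends on the order of examination, which is one way context enters the choice.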

Statics: simplified choice rule

¤ limited logical omniscience
  - treated in epistemic logics
  Ex: satisficing (sequential examination of actions)?

¤ finite number of internal states
  - simple expression of ‘computational complexity’
  Ex: finite automata

¤ computation costs
  - approximate cost of mental calculation
  Ex: basic operations
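The finite-automata example can be sketched as follows. The repeated prisoner's dilemma, the tit-for-tat strategy and all names (`Automaton`, `TIT_FOR_TAT`, `run`) are illustrative assumptions: the point is only that a strategy with two internal states has a very low complexity in this sense.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Automaton:
    start: str
    output: Dict[str, str]                  # internal state -> action played
    transition: Dict[Tuple[str, str], str]  # (state, opponent action) -> next state


# Tit-for-tat: cooperate first, then repeat the opponent's last action.
TIT_FOR_TAT = Automaton(
    start="c",
    output={"c": "C", "d": "D"},
    transition={("c", "C"): "c", ("c", "D"): "d", ("d", "C"): "c", ("d", "D"): "d"},
)


def run(automaton: Automaton, opponent_actions: List[str]) -> List[str]:
    """Actions played by the automaton against a given sequence of opponent actions."""
    state = automaton.start
    played = []
    for opponent_action in opponent_actions:
        played.append(automaton.output[state])
        state = automaton.transition[(state, opponent_action)]
    return played


played = run(TIT_FOR_TAT, ["C", "D", "D", "C"])
```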

Dynamics: information search on context

¤ exogenous information
  - resulting from purchase from specialized institutes and characterized by its value (as opposed to its cost)
  Ex: signals about the actual state (correlated with it)
  → limited relevance

¤ endogenous free information
  - resulting from repeated observation
  Ex: observation of others’ actions in fictitious play
  → memory constraints
  → scope constraints (information neighbourhood)

¤ endogenously induced information
  - resulting from voluntary (suboptimal) action and characterized by its value (as opposed to the loss of utility)
  Ex: search procedures
  → ambiguous interpretation
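The value of a signal correlated with the actual state can be computed in a small worked sketch. The standard Bayesian definition is assumed here (expected utility when acting on the posterior, minus expected utility when acting on the prior); the function names and the data layout are hypothetical.

```python
from typing import List


def best_expected_utility(belief: List[float], utility: List[List[float]]) -> float:
    """Highest expected utility over actions; utility[a][s] for action a, state s."""
    return max(sum(p * u for p, u in zip(belief, row)) for row in utility)


def value_of_signal(prior: List[float],
                    likelihood: List[List[float]],
                    utility: List[List[float]]) -> float:
    """Value of a signal, with likelihood[s][m] = P(message m | state s)."""
    informed = 0.0
    for m in range(len(likelihood[0])):
        joint = [prior[s] * likelihood[s][m] for s in range(len(prior))]
        p_message = sum(joint)
        if p_message > 0:
            posterior = [j / p_message for j in joint]  # Bayes rule
            informed += p_message * best_expected_utility(posterior, utility)
    return informed - best_expected_utility(prior, utility)


prior = [0.5, 0.5]
utility = [[1.0, 0.0], [0.0, 1.0]]  # payoff 1 when the action matches the state
perfect = value_of_signal(prior, [[1.0, 0.0], [0.0, 1.0]], utility)
useless = value_of_signal(prior, [[0.5, 0.5], [0.5, 0.5]], utility)
```

Under these assumptions the signal is worth purchasing only if this value exceeds its cost; a signal uncorrelated with the state has zero value.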

Dynamics: information search on one’s own preferences

¤ observation of one’s own past utility of actions
  - assuming that choice utility = (expected) felt utility
  Ex: CPR model
  → partial preferences (incompleteness)

¤ observation of others’ utility of actions
  - assuming that others’ utility = one’s own utility (in the same situations)
  Ex: imitation of successful opponents
  → biased preferences

Dynamics: treatment of information about context

¤ expectation process
  - especially of others’ strategies
  - stationarity assumption
  → extrapolative expectation
  Ex: fictitious play (probability = frequency of past actions)

¤ belief revision procedure
  - 3 contexts: updating, revising, focusing
  - possibility of contradiction between initial belief and message
  → simplified or distorted Bayes rule (judgment biases)
  Ex: weight between initial belief and message

¤ reconstruction of structural information
  - 3 types of information: factual (past), structural (constant), strategic (future)
  - pattern recognition (trends, cycles)
  - revelation of others’ preferences (abductive process)
  Ex: reputation effect
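The fictitious play example (probability = frequency of past actions) can be sketched as below. The two-action coordination game, the uniform belief used before any observation and the function names are assumptions for illustration only.

```python
from collections import Counter
from typing import Dict, List


def empirical_belief(history: List[str], actions: List[str]) -> Dict[str, float]:
    """Belief on the other's next action = frequency of her past actions."""
    if not history:
        return {a: 1 / len(actions) for a in actions}  # assumed uniform prior
    counts = Counter(history)
    return {a: counts[a] / len(history) for a in actions}


def best_response(belief: Dict[str, float], payoff: Dict[str, Dict[str, float]]) -> str:
    """Own action maximizing expected payoff; payoff[own][other]."""
    def expected(own: str) -> float:
        return sum(belief[other] * payoff[own][other] for other in belief)
    return max(payoff, key=expected)


# Hypothetical coordination game: payoff 1 when both players choose the same side.
payoff = {"L": {"L": 1.0, "R": 0.0}, "R": {"L": 0.0, "R": 1.0}}
belief = empirical_belief(["L", "L", "R"], ["L", "R"])
action = best_response(belief, payoff)
```

The stationarity assumption is visible in `empirical_belief`: every past observation carries the same weight, whatever its date.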

Dynamics: treatment of information about one’s own preferences

¤ performance indices
  - (average or cumulative) index for each action
  - stationarity assumption
  → proxy for utility function
  Ex: CPR rule

¤ adaptation of aspiration levels
  - adaptive level for global index (for instance, best past utility)
  → proxy for utility level
  Ex: dynamic satisficing (Simon)

¤ reconstruction of structural information
  - design of relative vs absolute preferences
  Ex: regret matching (unconditional regret index: difference, over the past, between the utility a given strategy would have yielded against others’ implemented strategies and the utility really obtained)
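The unconditional regret index described above can be sketched as follows. The averaging over past periods, the rule of playing each action with probability proportional to its positive regret, and all names are assumptions in the spirit of regret matching, not a transcription of the talk.

```python
from typing import Dict, List


def unconditional_regrets(own_history: List[str],
                          others_history: List[str],
                          payoff: Dict[str, Dict[str, float]]) -> Dict[str, float]:
    """Average gain each fixed action would have brought over the actual play.

    payoff[own][other]; the others' past actions are held fixed.
    """
    periods = len(own_history)
    regrets = {}
    for action in payoff:
        total = sum(payoff[action][other] - payoff[own][other]
                    for own, other in zip(own_history, others_history))
        regrets[action] = total / periods if periods else 0.0
    return regrets


def regret_matching_probs(regrets: Dict[str, float]) -> Dict[str, float]:
    """Play each action with probability proportional to its positive regret."""
    positive = {a: max(r, 0.0) for a, r in regrets.items()}
    norm = sum(positive.values())
    if norm == 0:
        return {a: 1 / len(regrets) for a in regrets}  # assumed fallback: uniform
    return {a: r / norm for a, r in positive.items()}


payoff = {"L": {"L": 1.0, "R": 0.0}, "R": {"L": 0.0, "R": 1.0}}
regrets = unconditional_regrets(["L", "L"], ["R", "R"], payoff)
probs = regret_matching_probs(regrets)
```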

Dynamics: adaptive choice rule (1)

¤ inertial behaviour
  - repeated action if past payoff is sufficient
  Ex: reaction to aspiration levels (continue if levels are reached)

¤ exploration behaviour
  - random exploration, fixed or decreasing
  - directed exploration
  Ex: randomized fictitious play

¤ exploitation behaviour
  - quasi-optimizing behaviour
  Ex: fictitious play

¤ stochastic reinforcement behaviour
  - noisy best response
  - stochastic matching (probabilistic behaviour monotonic with utility)
  → implicit exploration-exploitation dilemma
  Ex: CPR (decreasing exploration)
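Two stochastic choice rules of this kind can be sketched side by side. The logit form of the noisy best response and the reading of CPR as a rule choosing each action with probability proportional to its cumulative (positive) payoff are assumptions; `beta` and the function names are illustrative.

```python
import math
from typing import Dict


def logit_probs(utilities: Dict[str, float], beta: float) -> Dict[str, float]:
    """Noisy best response: beta = 0 is pure exploration, large beta near-optimizing."""
    top = max(utilities.values())  # subtracted for numerical stability only
    weights = {a: math.exp(beta * (u - top)) for a, u in utilities.items()}
    total = sum(weights.values())
    return {a: w / total for a, w in weights.items()}


def proportional_probs(cumulative: Dict[str, float]) -> Dict[str, float]:
    """Stochastic matching: probability proportional to a positive cumulative index."""
    total = sum(cumulative.values())
    return {a: c / total for a, c in cumulative.items()}


uniform = logit_probs({"a": 1.0, "b": 3.0}, beta=0.0)
sharp = logit_probs({"a": 1.0, "b": 3.0}, beta=5.0)
matched = proportional_probs({"a": 3.0, "b": 1.0})
```

With the proportional rule, a new payoff moves the probabilities less and less as the cumulative indices grow, which is one way to read "decreasing exploration".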

Dynamics: adaptive choice rule (2)

¤ imitation
  - grounded on complementary preferences (preferential mimetism)
  - grounded on information differences (informational mimetism)
  - grounded on better experience (experienced mimetism)
  Ex: plain diffusion model, imitation of successful opponents

¤ analogy-based reasoning
  - previous contexts (case-based reasoning)
  - repetitive game structures
  Ex: case-based rule (Gilboa-Schmeidler), analogical equilibrium (Jehiel)
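A case-based rule can be sketched as a similarity-weighted sum of past results. The memory layout (past problem, action, utility), the numerical problems and the similarity function are hypothetical choices made for illustration.

```python
from typing import Callable, List, Sequence, Tuple

Case = Tuple[float, str, float]  # (past problem, action taken, utility obtained)


def case_based_choice(problem: float,
                      memory: List[Case],
                      actions: Sequence[str],
                      similarity: Callable[[float, float], float]) -> str:
    """Choose the action with the highest similarity-weighted past utility."""
    def score(action: str) -> float:
        return sum(similarity(problem, past) * utility
                   for past, taken, utility in memory if taken == action)
    return max(actions, key=score)


def closeness(p: float, q: float) -> float:
    """Assumed similarity: decreasing in the distance between problems."""
    return 1.0 / (1.0 + abs(p - q))


memory = [(1.0, "x", 2.0), (5.0, "y", 3.0)]
near_first = case_based_choice(1.5, memory, ["x", "y"], closeness)
near_second = case_based_choice(5.0, memory, ["x", "y"], closeness)
```

An action never tried in memory gets a score of 0 here, so the rule does not explore by itself.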

Dynamics: adaptive choice rule (3)

¤ restricted choice rules
  - specific action set, for instance unidimensional
    Ex: stubborn rule (Laslier-Walliser)
  - specific beliefs, for instance objective probabilities
    Ex: stopping rules in search
  - specific preferences, for instance multicriteria choice
    Ex: choice 2 by 2 + synthesis

¤ context-adaptive choice rules
  - parlour games: chess, cards, Cluedo
    Ex: keep pawns tight
  - sports
    Ex: throwing a ball with constant angle
  - labyrinths, puzzles
    Ex: keep-right rule

Asymptotic results

¤ system’s trajectory
  - transitory state
  - asymptotic state (speed of convergence)
  → different time scales (role of random shocks)

¤ convergence of expectations
  - towards locally rational ones

¤ convergence of actions (or strategies)
  - elimination of (strictly) dominated strategies
  - convergence notions towards equilibrium states
  - convergence in time-average or action by action
    Ex: fictitious play
  - convergence towards a unique or multiple (point) equilibria (selection when exploration vanishes)
  - cyclical and chaotic attractors

Conclusion (1)

¤ dispersed models, even if there are two main classes (grounded on cognitive capacities)
  - belief-based learning
    Ex: fictitious play
  - reinforcement learning
    Ex: CPR

¤ combination of models
  - models depending on choice context and results
  - hybrid models
  - models with heterogeneous agents

¤ need to consider the precise reasoning modes followed by agents:
  - counterfactual reasoning (simulation of opponents)
  - abductive reasoning (detection of structural or behavioral regularities)
  - analogical reasoning (situations treated as similar)
  - taxonomical reasoning (categorization)

Conclusion (2)

¤ possibility of meta-learning
  - belief revision rule
    Ex: parameter trading off initial belief and message in extended Bayes rule
  - preferences
    Ex: degree of altruism in individual preferences
  - choice rule
    Ex: parameter in logit rule
  → learning levels again give rise to an infinite regress (in order to solve it, the highest level has to be given)

¤ infinite regress stopped by an evolutionary process, but
  - mix of evolutionary process (capacities and constraints imposed by evolution) and cultural process (capacities and constraints conditioned by society)
  - very slow time scale (against fluctuating environment)
  - concrete mechanism not exhibited