Competition between adaptive agents: learning and collective efficiency
![Page 1: Competition between adaptive agents: learning and collective efficiency](https://reader035.fdocuments.in/reader035/viewer/2022062802/568145b1550346895db2b13f/html5/thumbnails/1.jpg)
Competition between adaptive agents: learning and collective efficiency
Damien Challet
Oxford University
Matteo Marsili
ICTP-Trieste (Italy)
● My definition of the Minority Game
● Simple worlds (M = 0)
  ● Markovian behavior
  ● Neural networks
  ● Reinforcement learning
● Multistate worlds (M > 0)
● Cause of large inefficiencies
● Remedies
● From El Farol to MG and back
![Page 2: Competition between adaptive agents: learning and collective efficiency](https://reader035.fdocuments.in/reader035/viewer/2022062802/568145b1550346895db2b13f/html5/thumbnails/2.jpg)
'Truth is always in the minority'
Kierkegaard
![Page 3: Competition between adaptive agents: learning and collective efficiency](https://reader035.fdocuments.in/reader035/viewer/2022062802/568145b1550346895db2b13f/html5/thumbnails/3.jpg)
Zig-Zag-Zoug
● Game played by Swiss children
● 3 players, 3 feet, 3 magic words
● “Ziiig” ... “Zaaag” ... “ZOUG!”
![Page 4: Competition between adaptive agents: learning and collective efficiency](https://reader035.fdocuments.in/reader035/viewer/2022062802/568145b1550346895db2b13f/html5/thumbnails/4.jpg)
Minority Game
● Zig-Zag-Zoug with N players
● Aim: to be in the minority
● Outcome = #UP - #DOWN = #A - #B
● Model of competition between adaptive players
Challet and Zhang (1997), from El Farol's bar problem (Arthur 1994)
![Page 5: Competition between adaptive agents: learning and collective efficiency](https://reader035.fdocuments.in/reader035/viewer/2022062802/568145b1550346895db2b13f/html5/thumbnails/5.jpg)
Initial goals of the MG
El Farol (1994): impossible to understand
Drastic simplification, keeping key ingredients
Bounded rationality
Reinforcement learning
Symmetrize the problem: 60/100 -> 50/50
Understand the symmetric problem
Generalize results to the asymmetric problem
![Page 6: Competition between adaptive agents: learning and collective efficiency](https://reader035.fdocuments.in/reader035/viewer/2022062802/568145b1550346895db2b13f/html5/thumbnails/6.jpg)
Repeated games
Why play again?
Frustration
Losers in majority
How to play?
Deduction
Rationality
Best answer
All lose!
Induction
Limited capabilities
Beliefs, strategies, personality
Trial and error
Learning
![Page 7: Competition between adaptive agents: learning and collective efficiency](https://reader035.fdocuments.in/reader035/viewer/2022062802/568145b1550346895db2b13f/html5/thumbnails/7.jpg)
Minority Game
N agents, $i = 1, \dots, N$
Choice $a_i(t) \in \{+1, -1\}$
Actions $a_1(t), a_2(t), \dots, a_N(t)$
$A(t) = \sum_i a_i(t)$
Payoff to player $i$: $-a_i(t) A(t)$
Total losses $= A^2$
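The payoff rule on this slide can be checked on a single round. The sketch below is only an illustration: the function name `play_round` and the five example actions are assumptions, not part of the talk. It shows that the individual payoffs $-a_i A$ add up to $-A^2$, which is why the total losses equal $A^2$.

```python
import numpy as np

def play_round(actions):
    """One round of the Minority Game (hypothetical helper).

    actions: sequence of +1/-1 choices, one per agent.
    Returns the aggregate outcome A and the payoff -a_i * A of each agent.
    """
    actions = np.asarray(actions)
    A = int(actions.sum())      # A(t) = sum_i a_i(t)
    payoffs = -actions * A      # agents on the minority side get a positive payoff
    return A, payoffs

# Five agents, three choose +1 and two choose -1: the -1 side is the minority.
A, payoffs = play_round([+1, +1, -1, +1, -1])
print(A, payoffs.tolist(), int(payoffs.sum()))  # prints: 1 [-1, -1, 1, -1, 1] -1
```

With an odd number of agents A is never zero, so there is always a strict minority and the total payoff is always negative.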
![Page 8: Competition between adaptive agents: learning and collective efficiency](https://reader035.fdocuments.in/reader035/viewer/2022062802/568145b1550346895db2b13f/html5/thumbnails/8.jpg)
Markovian learning
'If it ain't broken, don't fix it' (Reents et al., Physica A 2000):
If I won, I stick to my previous choice
If I lost, I change to the other choice with prob p
Results ($\sigma^2 = \langle A^2 \rangle$):
● $pN = x = \text{cst}$ (small p): $\sigma^2 = 1 + 2x(1 + x/6)$
● $p \sim N^{-1/2}$: $\sigma^2 \sim N$
● $p \sim 1$: $\sigma^2 \sim N^2$
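The win-stay, lose-shift rule of this slide is easy to simulate. The following is a minimal sketch under assumptions of our own: the function name `markov_mg`, the number of steps, the seed and the choice N = 101 are illustrative, and the first half of the run is discarded as a transient. It only contrasts the small-p and p = 1 regimes and does not attempt to reproduce the exact formula.

```python
import numpy as np

def markov_mg(n_agents, p, n_steps=4000, seed=0):
    """Estimate sigma^2 = <A^2> for Markovian learning (hypothetical helper).

    Winners keep their choice; losers switch sides with probability p.
    """
    rng = np.random.default_rng(seed)
    a = rng.choice([-1, 1], size=n_agents)   # random initial choices
    a2 = []
    for t in range(n_steps):
        A = int(a.sum())
        if t >= n_steps // 2:                # measure after the transient
            a2.append(A * A)
        losers = a * A > 0                   # agents on the majority side lost
        flip = losers & (rng.random(n_agents) < p)
        a[flip] *= -1
    return float(np.mean(a2))

N = 101                                      # odd, so A is never zero
sigma2_small_p = markov_mg(N, p=1.0 / N)     # p N = x = 1
sigma2_large_p = markov_mg(N, p=1.0)         # every loser switches
print(sigma2_small_p, sigma2_large_p)
```

With p = 1 the whole majority switches at once, so the population ends up oscillating between all +1 and all -1 and $\sigma^2$ is of order $N^2$, whereas for $pN$ of order one the fluctuations stay of order one.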
![Page 9: Competition between adaptive agents: learning and collective efficiency](https://reader035.fdocuments.in/reader035/viewer/2022062802/568145b1550346895db2b13f/html5/thumbnails/9.jpg)
Markovian learning II
Problem: if N is unknown, p = ?
Try: p = f(t), e.g. $p = t^{-k}$
Convergence for any N
Freezing
When to stop?
![Page 10: Competition between adaptive agents: learning and collective efficiency](https://reader035.fdocuments.in/reader035/viewer/2022062802/568145b1550346895db2b13f/html5/thumbnails/10.jpg)
Neural networks
Simple perceptrons, learning rate R (Metzler++ 1999)
$\sigma^2 = N + N(N-1) F(N,R)$
$\sigma^2_{\min} = N (1 - 2/\pi) = 0.363\ldots N$
![Page 11: Competition between adaptive agents: learning and collective efficiency](https://reader035.fdocuments.in/reader035/viewer/2022062802/568145b1550346895db2b13f/html5/thumbnails/11.jpg)
Reinforcement learning
● Each player has a register $D_i$
● $D_i > 0$: + is better
● $D_i < 0$: - is better
● $D_i(t+1) = D_i(t) - A(t)$
● Choice: prob(+ | $D_i$) = $f(D_i)$, with $f'(x) > 0$ (RL)
![Page 12: Competition between adaptive agents: learning and collective efficiency](https://reader035.fdocuments.in/reader035/viewer/2022062802/568145b1550346895db2b13f/html5/thumbnails/12.jpg)
Reinforcement learning II
● Central result:
agents minimize $\langle A \rangle^2$ (predictability) for all f
● Stationary state: $\langle A \rangle = 0$
● Fluctuations $\sigma^2$ = ?
● Ex: $f(x) = (1 + \tanh(K x))/2$, exponential learning, K = learning rate
  ● $K < K_c$: $\sigma^2 \sim N$
  ● $K > K_c$: $\sigma^2 \sim N^2$
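The two regimes of the learning rate K can be illustrated with a short simulation of the register dynamics $D_i(t+1) = D_i(t) - A(t)$ with $f(x) = (1 + \tanh(Kx))/2$. This is a sketch under stated assumptions: all registers start at zero, the function name `rl_minority_game`, the run length and the two values of K are illustrative choices, and no attempt is made to locate $K_c$.

```python
import numpy as np

def rl_minority_game(n_agents, K, n_steps=4000, seed=0):
    """Estimate sigma^2 = <A^2> for naive reinforcement learners (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    D = np.zeros(n_agents)                       # assumption: registers start at 0
    a2 = []
    for t in range(n_steps):
        prob_plus = 0.5 * (1.0 + np.tanh(K * D)) # prob(+ | D_i) = f(D_i)
        a = np.where(rng.random(n_agents) < prob_plus, 1, -1)
        A = int(a.sum())
        D -= A                                   # D_i(t+1) = D_i(t) - A(t)
        if t >= n_steps // 2:                    # measure after the transient
            a2.append(A * A)
    return float(np.mean(a2))

N = 101
sigma2_slow = rl_minority_game(N, K=0.1 / N)     # small learning rate
sigma2_fast = rl_minority_game(N, K=10.0)        # large learning rate
print(sigma2_slow / N, sigma2_fast / N**2)
```

In this sketch the slow learners behave almost like independent coin flips, so $\sigma^2$ stays of order N, while the fast learners all overreact to the last outcome together and $\sigma^2$ becomes of order $N^2$.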
![Page 13: Competition between adaptive agents: learning and collective efficiency](https://reader035.fdocuments.in/reader035/viewer/2022062802/568145b1550346895db2b13f/html5/thumbnails/13.jpg)
Reinforcement learning III
Market impact: each agent has an influence on the outcome
● Naive agents: payoff $-A = -A_{-i} - a_i$
● Non-naive agents: payoff $-A + c\, a_i$
● Smart agents: payoff $-A_{-i}$ (cf. WLU, AU)
● Central result 2:
non-naive agents minimize $\langle A^2 \rangle$ (fluctuations) for all f
-> Nash equilibrium, $\sigma^2 \sim 1$
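The effect of accounting for one's own impact can be sketched by adding the term $c\, a_i$ to the register update, so that c = 0 gives naive agents and c = 1 gives the smart payoff $-A_{-i}$ with $A_{-i} = A - a_i$. Everything else here is an assumption made for illustration: the function name `rl_with_impact`, the choice $f(x) = (1 + \tanh(Kx))/2$, zero initial registers, K = 0.01, N = 101 and the run length.

```python
import numpy as np

def rl_with_impact(n_agents, K, c, n_steps=20000, seed=0):
    """Estimate sigma^2 = <A^2> when registers are updated with -A + c * a_i."""
    rng = np.random.default_rng(seed)
    D = np.zeros(n_agents)                       # assumption: registers start at 0
    a2 = []
    for t in range(n_steps):
        prob_plus = 0.5 * (1.0 + np.tanh(K * D))
        a = np.where(rng.random(n_agents) < prob_plus, 1, -1)
        A = int(a.sum())
        D += -A + c * a                          # c = 0: naive, c = 1: payoff -A_{-i}
        if t >= 3 * n_steps // 4:                # measure over the last quarter
            a2.append(A * A)
    return float(np.mean(a2))

N = 101
sigma2_naive = rl_with_impact(N, K=0.01, c=0.0)
sigma2_smart = rl_with_impact(N, K=0.01, c=1.0)
print(sigma2_naive, sigma2_smart)
```

In this sketch the $c\, a_i$ term rewards an agent for sticking to what it just played, so the agents drift apart into two nearly equal frozen groups and the fluctuations drop far below the naive value.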
![Page 14: Competition between adaptive agents: learning and collective efficiency](https://reader035.fdocuments.in/reader035/viewer/2022062802/568145b1550346895db2b13f/html5/thumbnails/14.jpg)
Summary
| Rate | Markov | NN | RL naive | RL non-naive | NN non-naive |
|---|---|---|---|---|---|
| Small | 1 | N | N | 1 | 1? |
| Medium | N | | | 1 | 1? |
| Large | $N^2$ | $N^2$ | $N^2$ | 1 | 1? |
![Page 15: Competition between adaptive agents: learning and collective efficiency](https://reader035.fdocuments.in/reader035/viewer/2022062802/568145b1550346895db2b13f/html5/thumbnails/15.jpg)
Minority Games with memory
If an agent believes that the outcome depends on the past results, the outcome will depend on the past results.
Sun spot effect
Self-fulfilling prophecies
Fallacies of causal inference
Consequence:
The other agents will change their behavior accordingly
![Page 16: Competition between adaptive agents: learning and collective efficiency](https://reader035.fdocuments.in/reader035/viewer/2022062802/568145b1550346895db2b13f/html5/thumbnails/16.jpg)
Minority Games with memory: naïve agents
[Figure: $\sigma^2/N$ versus $\alpha = P/N$]
Fixed randomly drawn strategies = quenched disorder
Tools of statistical physics give the exact solution in principle
Agents minimize the predictability
Predictability = Hamiltonian
Optimization problem
Numeric:
Savit++ PRL99
Analytic:
Challet++ PRL99
Coolen+ J. Phys A 2002
?
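A compact simulation can illustrate how $\sigma^2/N$ depends on $\alpha = P/N$ for naive agents with fixed, randomly drawn strategies. This is a sketch under several assumptions that are not stated on the slide: two strategies per agent, histories drawn at random at each step instead of being built from past outcomes, score updates $U_{i,s} \to U_{i,s} - a_{i,s}^{\mu} A$, ties resolved in favour of the first strategy, and the function name `minority_game` together with all parameter values.

```python
import numpy as np

def minority_game(n_agents, n_patterns, n_steps, seed=0):
    """Estimate sigma^2 / N for the Minority Game with P = n_patterns states."""
    rng = np.random.default_rng(seed)
    # strategies[i, s, mu]: action of strategy s of agent i in state mu (quenched disorder)
    strategies = rng.choice([-1, 1], size=(n_agents, 2, n_patterns))
    scores = np.zeros((n_agents, 2))
    agents = np.arange(n_agents)
    a2 = []
    for t in range(n_steps):
        mu = rng.integers(n_patterns)                     # assumption: random history
        best = (scores[:, 1] > scores[:, 0]).astype(int)  # ties go to strategy 0
        actions = strategies[agents, best, mu]
        A = int(actions.sum())
        scores -= strategies[:, :, mu] * A                # naive update of every strategy
        if t >= n_steps // 2:                             # measure after the transient
            a2.append(A * A)
    return float(np.mean(a2)) / n_agents

N = 101
low_alpha = minority_game(N, n_patterns=4, n_steps=20000)   # alpha about 0.04
mid_alpha = minority_game(N, n_patterns=64, n_steps=20000)  # alpha about 0.63
print(low_alpha, mid_alpha)
```

The value $\sigma^2/N = 1$ corresponds to agents choosing at random, so a result below 1 at intermediate $\alpha$ indicates cooperation, while small $\alpha$ produces crowd effects and much larger fluctuations.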
![Page 17: Competition between adaptive agents: learning and collective efficiency](https://reader035.fdocuments.in/reader035/viewer/2022062802/568145b1550346895db2b13f/html5/thumbnails/17.jpg)
Minority Games with memory: low efficiency
[Figure; axis: $\alpha = P/N$]
![Page 18: Competition between adaptive agents: learning and collective efficiency](https://reader035.fdocuments.in/reader035/viewer/2022062802/568145b1550346895db2b13f/html5/thumbnails/18.jpg)
Minority Games with memory: low efficiency
P/N is not the right scaling for large fluctuations
![Page 19: Competition between adaptive agents: learning and collective efficiency](https://reader035.fdocuments.in/reader035/viewer/2022062802/568145b1550346895db2b13f/html5/thumbnails/19.jpg)
Minority Games with memory: origin of low efficiency
Stochastic dynamical equation for strategy score $U_i$:
slowly varying part (I) + correlated noise (II)
I: size independent; II $= K P^{-1/2}$
When I $\ll$ II: large fluctuations
Transition at I$/K = G / P^{1/2}$
Critical signal-to-noise ratio $= G / P^{1/2}$
![Page 20: Competition between adaptive agents: learning and collective efficiency](https://reader035.fdocuments.in/reader035/viewer/2022062802/568145b1550346895db2b13f/html5/thumbnails/20.jpg)
Minority Games with memory: origin of low efficiency
Check:
Determine G
Predict critical points
[Figure; axis labels: I$/K$ and $G / P^{1/2}$]
![Page 21: Competition between adaptive agents: learning and collective efficiency](https://reader035.fdocuments.in/reader035/viewer/2022062802/568145b1550346895db2b13f/html5/thumbnails/21.jpg)
Minority Games with memory: origin of low efficiency
[Figure panels: BEFORE, AFTER]
![Page 22: Competition between adaptive agents: learning and collective efficiency](https://reader035.fdocuments.in/reader035/viewer/2022062802/568145b1550346895db2b13f/html5/thumbnails/22.jpg)
Minority Games with memory: origin of low efficiency
![Page 23: Competition between adaptive agents: learning and collective efficiency](https://reader035.fdocuments.in/reader035/viewer/2022062802/568145b1550346895db2b13f/html5/thumbnails/23.jpg)
Minority Games with memory: sophisticated agents
Agents minimize fluctuations
Optimization problem again
![Page 24: Competition between adaptive agents: learning and collective efficiency](https://reader035.fdocuments.in/reader035/viewer/2022062802/568145b1550346895db2b13f/html5/thumbnails/24.jpg)
Reverse problem
Many variations, different global utility functions
● Grand canonical game (play or not play)
● Time window of scores (exponential moving average)
● Any payoff
Hence, given a task (global utility function), one knows how to design agents (local utility).
Example: optimal defects combinations (cf. Neil's talk)
![Page 25: Competition between adaptive agents: learning and collective efficiency](https://reader035.fdocuments.in/reader035/viewer/2022062802/568145b1550346895db2b13f/html5/thumbnails/25.jpg)
From El Farol to MG and back
El Farol: [axis from 0 to $N$ with $L$ marked]
MG: [axis from 0 to $N$ with $L = N/2$ marked]
Differences, similarities?
Which results from MG are valid for El Farol?
![Page 26: Competition between adaptive agents: learning and collective efficiency](https://reader035.fdocuments.in/reader035/viewer/2022062802/568145b1550346895db2b13f/html5/thumbnails/26.jpg)
From El Farol to MG and back
[Figure: axis from 0 to $N$ with $L$ and $N\langle a \rangle$ marked]
Theorem: all results from MG apply to El Farol
Everything scales like (L/N – < a>)/ = P ½
The El Farol problem with P states of the world is solved.
![Page 27: Competition between adaptive agents: learning and collective efficiency](https://reader035.fdocuments.in/reader035/viewer/2022062802/568145b1550346895db2b13f/html5/thumbnails/27.jpg)
From El Farol to MG and back: new results
If (L/N – < a>)/ = P ½ 0,
P>Pc = 2 / [(L/N-< a>)2]: no more phase transition.
![Page 28: Competition between adaptive agents: learning and collective efficiency](https://reader035.fdocuments.in/reader035/viewer/2022062802/568145b1550346895db2b13f/html5/thumbnails/28.jpg)
Summary
• AU/WLU suppresses large fluctuations -> Nash equilibrium
• Design: agents must know they have an impact.
• The knowledge of the exact impact is not crucial
• Reverse problem also possible
• MG: simple, rich, fun, and useful
www.unifr.ch/econophysics/minority
102 commented references