
Lecture about Agents that Learn

• 3rd April 2000

• INT4/2I1235

Agenda

• Introduction

• Centralized learning vs decentralized learning

• Credit Assignment Problem

• Learning and Activity Coordination

• Learning about and from other agents

• Learning and Communication

• Summary

Introduction

• Today's topic

• Who is the lecturer

• Why do we have this lecture

Today's topic

• How do agents learn?

• What are the benefits of learning agents?

• Learning in isolation, or in cooperation?

Who is the lecturer

• Johan Kummeneje

• Doctoral Student

• RoboCup, Social Decisions, and Java

Why do we have this lecture

• Beats me… You tell me.

• Take 2 minutes to think about why this is interesting, and then I will ask 2 or 3 of you what you think.

Agenda

• Introduction

• Centralized learning vs decentralized learning

• Credit Assignment Problem

• Learning and Activity Coordination

• Learning about and from other agents

• Learning and Communication

• Summary

Centralized vs Decentralized

• Introduction

• The Degree of Decentralization

• Interaction-specific features

• Involvement-specific features

• Goal-specific features

• The learning method

• The learning feedback

Introduction

• The learning process involves planning, inference, decision steps, etc.

• Centralized learning, or isolated learning

• Decentralized learning, or interactive learning

The Degree of Decentralization

• Distributedness

• Parallelism

Interaction-specific features

• Level of interaction (from "simple" observation to complex negotiations and dialogues)

• Persistence of interaction (short to long)

• Frequency (low to high)

• Pattern (unstructured to hierarchical)

• Variability (fixed to dynamic)

Involvement-specific features

• Relevance to the learning process

• Role in the learning process

• Generalist vs specialist

Goal-specific features

• Improvement (individual vs social)

• Conflicting vs compatible goals

The learning method

• Rote learning ("korvstoppning", Swedish for cramming)

• Learning from instruction and advice

• Examples and practice (Learning by Doing, Baden-Powell)

• Analogy

• Discovery

The effort required of the learner increases from top to bottom.

The learning feedback

• Supervised (a teacher specifies which action is best)

• Reinforcement (a reward signal rates the utility of the action, and the agent tries to maximize it)

• Unsupervised (no explicit feedback)

Agenda

• Introduction

• Centralized learning vs decentralized learning

• Credit Assignment Problem

• Learning and Activity Coordination

• Learning about and from other agents

• Learning and Communication

• Summary

Credit Assignment Problem

• Inter-agent CAP (how to divide credit among the different agents)

• Intra-agent CAP (how to divide credit among the different actions performed within an agent; a sketch of both levels follows)
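A minimal sketch of the two levels in Python, assuming an equal split among agents (inter-agent) and a recency-weighted split over an agent's own actions (intra-agent); both schemes are illustrative choices, not something the lecture prescribes:

```python
def inter_agent_credit(team_reward, agent_ids):
    """Inter-agent CAP: divide a global reward among agents (equal split)."""
    share = team_reward / len(agent_ids)
    return {agent: share for agent in agent_ids}

def intra_agent_credit(agent_reward, actions, decay=0.9):
    """Intra-agent CAP: spread an agent's share over its action
    sequence, weighting recent actions more heavily (assumed scheme)."""
    weights = [decay ** (len(actions) - 1 - i) for i in range(len(actions))]
    total = sum(weights)
    return {a: agent_reward * w / total for a, w in zip(actions, weights)}

shares = inter_agent_credit(10.0, ["a1", "a2"])
print(intra_agent_credit(shares["a1"], ["move", "pass", "shoot"]))
```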

Agenda

• Introduction

• Centralized learning vs decentralized learning

• Credit Assignment Problem

• Learning and Activity Coordination

• Learning about and from other agents

• Learning and Communication

• Summary

Learning and Activity Coordination

• Introduction

• Reinforcement Learning
– Q-Learning and Learning Classifier Systems

• Isolated, Concurrent Reinforcement Learners

• Interactive Reinforcement Learning of Coordination
– ACE and AGE

Introduction

• Activity coordination

• Adaptation to differences in the coordination process

• Effective use of opportunities and avoidance of pitfalls

Reinforcement Learning

• Optimize the feedback (reinforcement)

• Modeled by a Markov decision process

• ⟨S, A, P, r⟩, where P : S × S × A → [0, 1] is the state-transition function and r is the reward function

Q-Learning

• On receiving feedback, update the Q-value:

• Q(s, a) ← (1 − β) Q(s, a) + β (r + γ max_{a'} Q(s', a'))

• where β is a small constant called the learning rate and γ is the discount factor; a code sketch follows
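A minimal tabular Q-learning sketch of this update. The environment interface (reset/step/actions) and the epsilon-greedy exploration policy are assumptions for illustration; only the update line follows the slide:

```python
import random
from collections import defaultdict

def q_learning(env, episodes=500, beta=0.1, gamma=0.9, epsilon=0.1):
    q = defaultdict(float)  # Q(s, a), initialized to 0
    for _ in range(episodes):
        s = env.reset()
        done = False
        while not done:
            # Epsilon-greedy action selection (assumed policy).
            if random.random() < epsilon:
                a = random.choice(env.actions(s))
            else:
                a = max(env.actions(s), key=lambda act: q[(s, act)])
            s_next, r, done = env.step(a)
            # The update from the slide:
            # Q(s,a) <- (1-beta) Q(s,a) + beta (r + gamma max_a' Q(s',a'))
            best_next = max((q[(s_next, a2)] for a2 in env.actions(s_next)),
                            default=0.0)
            q[(s, a)] = (1 - beta) * q[(s, a)] + beta * (r + gamma * best_next)
            s = s_next
    return q
```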

Learning Classifier Systems

• A classifier is a (condition, action) pair

• Each classifier has a strength S(c, a) at any given time

• At each time step a classifier is chosen from a match set (the classifiers whose conditions match the environment)

• Feedback is received and the strength S is modified accordingly (see the sketch below)
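A minimal sketch of one time step. The matching predicate, the strength-proportional selection, and the simple strength update are illustrative assumptions:

```python
import random

class Classifier:
    def __init__(self, condition, action, strength=1.0):
        self.condition = condition  # predicate over observations
        self.action = action
        self.strength = strength

def lcs_step(classifiers, observation, execute, beta=0.1):
    # Match set: classifiers whose condition matches the environment.
    match_set = [c for c in classifiers if c.condition(observation)]
    if not match_set:
        return
    # Choose a classifier with probability proportional to its strength.
    weights = [max(c.strength, 1e-6) for c in match_set]
    chosen = random.choices(match_set, weights=weights)[0]
    reward = execute(chosen.action)
    # Modify the strength S according to the feedback.
    chosen.strength += beta * (reward - chosen.strength)
```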

Isolated, Concurrent Reinforcement Learners

• Agent coupling

• Agent relationships

• Feedback timing

• Optimal behaviour combinations

• CIRL (concurrent, isolated reinforcement learners):

• No modelling of other agents

• In cooperative situations, complementary policies can be developed

• Adapts to similar situations

Interactive Reinforcement Learning of Coordination

• Eliminates incompatible actions

• Agents can observe the set of actions considered by other agents

• Two alternatives are ACE and AGE

Action Estimate Algorithm (ACE)

• Each agent calculates the set of actions it can perform

• For each of these, the agent calculates the goal relevance

• For every action with a goal relevance above a threshold, the agent calculates and announces a bid that includes a risk factor and a noise term:

• B(S) = (α + β) E(S)

• Incompatible actions are removed; the agent then executes the action with the highest bid

• The feedback increases the probability that successful actions are performed in the future (a sketch follows)
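A minimal sketch of the bidding step for one agent, assuming hypothetical goal_relevance, estimate, and compatibility helpers; the constants and the noise distribution are likewise illustrative:

```python
import random

ALPHA = 0.1      # risk factor (assumed value)
THRESHOLD = 0.5  # goal-relevance threshold (assumed value)

def ace_step(agent, other_bids):
    # 1. The set of actions the agent can perform.
    actions = agent.performable_actions()
    # 2-3. Bid on every action whose goal relevance exceeds the
    # threshold; the bid adds a risk factor and a noise term to the
    # estimate E, as in B(S) = (alpha + beta) E(S).
    bids = {}
    for action in actions:
        if agent.goal_relevance(action) > THRESHOLD:
            noise = random.uniform(0.0, 0.05)
            bids[action] = (ALPHA + noise) * agent.estimate(action)
    # 4. Remove actions incompatible with what other agents announced.
    compatible = {a: b for a, b in bids.items()
                  if all(agent.compatible(a, other) for other in other_bids)}
    # 5. Execute the compatible action with the highest bid.
    if compatible:
        return max(compatible, key=compatible.get)
    return None
```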

Action Group Estimate Algorithm (AGE)

• The applicable actions of all agents are collected into activity contexts: sets in which all actions are mutually compatible.

• Using the same bidding strategy as ACE, the activity context with the highest sum of bids is chosen for execution.

• Credit assignment depends on the actions performed and their relevance.

• Requires more computational effort than ACE, as the sketch below suggests.
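A minimal sketch of AGE's context selection. Enumerating all subsets of announced actions is a deliberate simplification that also shows where the extra computational cost comes from; the bid values and the compatibility predicate are assumed inputs:

```python
from itertools import combinations

def age_select(all_bids, compatible):
    """all_bids: dict mapping (agent, action) -> bid value.
    compatible: predicate over two (agent, action) pairs."""
    items = list(all_bids)
    best_context, best_sum = (), 0.0
    # Enumerate candidate activity contexts (subsets of announced actions).
    for size in range(1, len(items) + 1):
        for context in combinations(items, size):
            # Keep only contexts whose actions are mutually compatible.
            if all(compatible(x, y) for x, y in combinations(context, 2)):
                total = sum(all_bids[x] for x in context)
                # Choose the context with the highest sum of bids.
                if total > best_sum:
                    best_context, best_sum = context, total
    return best_context
```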

Agenda

• Introduction

• Centralized learning vs decentralized learning

• Credit Assignment Problem

• Learning and Activity Coordination

• Learning about and from other agents

• Learning and Communication

• Summary

Learning about and from other agents

• Introduction

• Learning Organizational Roles

• Learning in Market Environments

Introduction

• Learning to improve individual performance

• Possibly at the expense of other agents

• Anticipatory agents, RMM (the Recursive Modeling Method)

Learning Organizational Roles

• Agents learn roles so as to better complement each other.

• Each agent can take on one of a set of roles (one at a time); the learning task is to choose the most appropriate role (minimizing cost).

• A role is rated by a function f(U, P, C, Potential); a sketch of such a rating follows.
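A minimal sketch, assuming U, P, and C stand for utility, probability (of success), and cost estimates, and that f combines them as a weighted sum; both the weights and the example numbers are illustrative:

```python
def f(u, p, c, potential, w=(1.0, 1.0, 1.0, 0.5)):
    # Higher utility, probability, and potential favour a role;
    # cost counts against it (assumed weighted-sum form of f).
    return w[0] * u + w[1] * p - w[2] * c + w[3] * potential

def choose_role(roles):
    """roles: dict mapping role name -> (U, P, C, Potential) estimates."""
    return max(roles, key=lambda r: f(*roles[r]))

roles = {"defender": (0.6, 0.9, 0.2, 0.3),
         "attacker": (0.9, 0.5, 0.4, 0.6)}
print(choose_role(roles))
```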

Learning in Market Environments

• Agents sell/buy information from each other.

• 0-level agents do not model other agents

• 1-level agents model other agents as 0-level agents

• 2-level agents model other agents as 1-level agents (see the sketch below)
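A minimal sketch of the nesting in a bidding setting. Only the 0/1/2-level structure follows the slide; the pricing rule and the best-response logic are illustrative assumptions:

```python
def level0_bid(value):
    # 0-level: no model of others; bid the information's own value.
    return value

def level1_bid(value, opponent_value):
    # 1-level: model the opponent as a 0-level agent, and beat the
    # predicted bid only where that is still profitable.
    predicted = level0_bid(opponent_value)
    return min(value, predicted + 0.01)

def level2_bid(value, opponent_value, own_value_as_seen):
    # 2-level: model the opponent as a 1-level agent, i.e. an agent
    # that in turn models this agent as 0-level.
    predicted = level1_bid(opponent_value, own_value_as_seen)
    return min(value, predicted + 0.01)

print(level1_bid(value=1.0, opponent_value=0.8))  # narrowly beats 0.8
```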

Agenda

• Introduction

• Centralized learning vs decentralized learning

• Credit Assignment Problem

• Learning and Activity Coordination

• Learning about and from other agents

• Learning and Communication

• Summary

Learning and Communication

• Introduction

• Reducing Communication by Learning

• Improving Learning by Communication

Introduction

• Learning to communicate

• Communicating as learning

• What to communicate?

• When to communicate?

• With whom to communicate?

• How to communicate?

Reducing Communication by Learning

• Learning about the abilities of other agents.

• Learning which agents to ask, instead of broadcasting

• Exploiting similarities between problems (a sketch of learning whom to ask follows)
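A minimal sketch of learning whom to ask: keep a success score per (task type, peer) and address the historically best peer instead of broadcasting. The scoring scheme is an illustrative assumption:

```python
from collections import defaultdict

class DirectoryLearner:
    def __init__(self, peers):
        self.peers = peers
        self.score = defaultdict(float)  # (task_type, agent) -> score

    def whom_to_ask(self, task_type):
        # Ask the peer with the best track record for this kind of task.
        return max(self.peers, key=lambda p: self.score[(task_type, p)])

    def record(self, task_type, agent, success):
        # Reinforce peers that answered usefully.
        self.score[(task_type, agent)] += 1.0 if success else -0.5
```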

Improving Learning by Communication

• Communicating beliefs and pieces of information

• Explanation

• Ontologies

• Finding out complex relationships between different agents and actions

Agenda

• Introduction

• Centralized learning vs decentralized learning

• Credit Assignment Problem

• Learning and Activity Coordination

• Learning about and from other agents

• Learning and Communication

• Summary

Summary

• We have seen the focus move from isolated (individual, centralized) learning to a more diverse flora of learning approaches.

• Besides standard, older ML methods, several new ML algorithms have been proposed.

• Agents learn to improve communication and cooperation.

Further reading

• Peter Stone, Ph.D. thesis

• Weiss (course material), chapter 6

• Russell and Norvig, Artificial Intelligence: A Modern Approach

• THE END