Riza Erdem, Jappie Klooster, Dirk Meulenbelt: EVOLVING MULTI-MODAL BEHAVIOR IN NPCS


Page 1:

Riza Erdem, Jappie Klooster, Dirk Meulenbelt

EVOLVING MULTI-MODAL BEHAVIOR IN NPCS

Page 2:

Authors

Jacob Schrum, Risto Miikkulainen

Youtube channel: https://www.youtube.com/channel/UCCKhH1p0tj1frvcD70tEyDg

Times cited: 18

PAPER

Page 3:

NPCs

Multi-modal Behavior

Downsides

So, they want to propose a method that encourages the development of this multi-modal behavior.

GOAL

Page 4:

Neuroevolution?

Why neuroevolution?

Improving the neuroevolution method

NEUROEVOLUTION

Page 5:

Player, NPC team, health points, and a bat

Fight game
Flight game
Example footage: http://nn.cs.utexas.edu/?multimodal09

What would be beneficial behavior (four objectives)?
Fight: Maximize damage dealt
Fight: Minimize damage received
Fight: Maximize time alive
Flight: Maximize damage dealt

What’s useful about this game?

MOEAs

FIGHT OR FLIGHT GAME

Page 6:

Solutions consist of neural networks

FS-NEAT (in this paper)
Copy existing solutions
Mutate the copies
Do competition
Apply selection
Rinse and repeat

APPLICATION OF NEUROEVOLUTION
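The loop on this slide can be sketched roughly as follows. This is a minimal sketch, not FS-NEAT itself: a solution is reduced to a flat list of weights, `mutate` and `evaluate` are made-up stand-ins for the real mutation operators and the game competition, and selection simply keeps the best scorers.

```python
import random

def mutate(weights, rng, sigma=0.1):
    """Return a copy with Gaussian noise on every weight (hypothetical mutation)."""
    return [w + rng.gauss(0.0, sigma) for w in weights]

def evaluate(weights):
    """Stand-in for the competition: higher is better, best at all-zero weights."""
    return -sum(w * w for w in weights)

def evolve(population, generations, rng):
    size = len(population)
    for _ in range(generations):                          # rinse and repeat
        children = [mutate(p, rng) for p in population]   # copy and mutate
        pool = population + children                      # parents and children compete
        pool.sort(key=evaluate, reverse=True)             # competition
        population = pool[:size]                          # selection
    return population

rng = random.Random(0)
start = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(10)]
final = evolve(start, generations=30, rng=rng)
```

Because parents stay in the pool, the best score in this sketch never gets worse from one generation to the next.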

Page 7:

Neural network

Input nodes

Output nodes

NEURAL NETWORKS

Page 8:

FS-NEAT

Input nodes

Output nodes

Page 9:

ADD NODE MUTATION

[Figure: two network diagrams, each labeled with input nodes and output nodes]

Page 10:

ADD LINK MUTATION

[Figure: two network diagrams, each labeled with input nodes and output nodes]

Page 11:

UNIQUE: MERGE MUTATION

[Figure: two network diagrams, each labeled with input nodes and output nodes]

Page 12:

UNIQUE: TWO TYPES OF OUTPUT NODES

Neural network

Input nodes

Output nodes

Policy nodes

Preference node
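One way to read the policy/preference split: each output mode has its own policy nodes plus one preference node, and the mode whose preference node has the highest output controls the NPC at that time step. The sketch below assumes exactly that. The function name and the flat output layout (two policy values followed by one preference value per mode) are made up for illustration.

```python
def select_action(outputs, policy_nodes=2):
    """Pick the policy outputs of the mode with the highest preference value.

    `outputs` is a flat list: for each mode, `policy_nodes` policy values
    followed by one preference value (assumed layout).
    """
    stride = policy_nodes + 1
    if len(outputs) == 0 or len(outputs) % stride != 0:
        raise ValueError("outputs must hold one or more whole modes")
    modes = [outputs[i:i + stride] for i in range(0, len(outputs), stride)]
    # The first mode wins when preference values are tied.
    best = max(range(len(modes)), key=lambda m: modes[m][-1])
    return best, modes[best][:policy_nodes]

# Two modes: the second has the higher preference (0.7 > 0.2).
mode, action = select_action([0.1, -0.4, 0.2, 0.9, 0.3, 0.7])
```

With a single mode the preference value has no effect, since there is nothing to choose between.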

Page 13:

UNIQUE: ADD OUTPUT NODES MUTATION!

[Figure: two network diagrams, each labeled with input nodes and output nodes]

Page 14:

MULTI-OBJECTIVE EVOLUTION (SELECTION)

Domination

Pareto front

NSGA-II
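Domination and the Pareto front can be illustrated with a few lines of code. This is a minimal sketch that assumes every objective is maximized and that scores are plain tuples; the example scores are invented.

```python
def dominates(a, b):
    """True if a is at least as good as b on every objective and strictly better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_front(scores):
    """Return the indices of the score vectors that no other vector dominates."""
    return [i for i, s in enumerate(scores)
            if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)]

scores = [(3, 1), (2, 2), (1, 3), (1, 1), (2, 1)]
front = pareto_front(scores)
```

Here `(1, 1)` and `(2, 1)` fall off the front because `(2, 2)` is at least as good on both objectives and better on one.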

Page 15:

NSGA-II

Next generation

Page 16:

NSGA-II

Next generation

PARETO FRONT DOESN’T FIT

Page 17:

DROP OBJECTIVE / PARETO DIMENSION

NEW PARETO FRONT

Page 18:

TRY AGAIN

Next generation

PARETO FRONT DOES FIT!

Page 19:

PRIORITIES FOR GOAL DROPPING

Fight-DamageDealt
Flight-DamageDealt
Fight-DamageReceived
Fight-TimeAlive

In ascending order

Goal disabling

Page 20:

SELECTION OVERVIEW

Non-dominated solutions are inserted into the next generation and removed from consideration.

Repeat until the next generation is full.

A cutoff is often reached where the next Pareto front does not fit. NSGA-II resolves this with “crowding distance”. This paper re-sorts while dropping objectives.

Goals are sometimes temporarily disabled when all members of the population have reached them, and re-enabled when the population starts performing poorly on them.

When the final dimension is under consideration and the scores are equal, members are selected at random.
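The selection steps above could look roughly like this. It is a sketch under stated assumptions, not the paper's code: all objectives are maximized, `select` and `fronts` are made-up names, and `drop_order` is a caller-supplied list of objective indices in the order they get dropped (it needs at least one entry fewer than there are objectives). Goal disabling is left out.

```python
import random

def dominates(a, b):
    """Maximization: a is never worse than b and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def fronts(ids, scores, dims):
    """Yield successive non-dominated fronts of `ids`, comparing only the objectives in `dims`."""
    remaining = list(ids)
    while remaining:
        proj = {i: tuple(scores[i][d] for d in dims) for i in remaining}
        front = [i for i in remaining
                 if not any(dominates(proj[j], proj[i]) for j in remaining)]
        yield front
        remaining = [i for i in remaining if i not in front]

def select(scores, size, drop_order, rng):
    """Return the indices of `size` members for the next generation."""
    if size > len(scores):
        raise ValueError("cannot select more members than exist")
    dims = list(range(len(scores[0])))
    to_drop = list(drop_order)
    chosen, candidates = [], list(range(len(scores)))
    while len(chosen) < size:
        front = next(fronts(candidates, scores, dims))
        room = size - len(chosen)
        if len(front) <= room:
            chosen += front                      # the front fits: take it whole
            candidates = [i for i in candidates if i not in front]
        elif len(dims) > 1:
            dims.remove(to_drop.pop(0))          # front does not fit: drop an objective
            candidates = front                   # and re-sort only that front
        else:
            chosen += rng.sample(front, room)    # last dimension, equal scores: pick at random
    return chosen

scores = [(3, 1), (2, 2), (1, 3), (0, 0)]
picked = select(scores, size=2, drop_order=[1], rng=random.Random(0))
```

In the example the first front has three members but only two slots are free, so the second objective is dropped and the three members are re-sorted on the first objective alone.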

Page 21:

LET’S SEE HOW IT ALL WORKS OUT

Experimental approach parameters:

Evolved in Homogeneous teams: Predictable team-mates

The fitness score for the group as a whole encourages teamwork, such as a suicide bomber tactic that kills one NPC but increases the team score.

Neuroevolution combined with NSGA-II as outlined before is used to evolve the NPCs on the goals.

A single parent population of 50 neural networks

Evaluation: 5 Fights and 5 Flights, because evaluation on just one is noisy. Take the average score. Five of each is a decent tradeoff between evaluation noise and computation time.
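Averaging over repeated games might be sketched like this. The function name, the result dictionaries, and the key names are all hypothetical. Storing damage received as a negative number, so that every objective is maximized, is also an assumption.

```python
from statistics import mean

def evaluate_team(run_fight, run_flight, repeats=5):
    """Play `repeats` Fight games and `repeats` Flight games and average the scores.

    `run_fight()` is assumed to return a dict with "damage_dealt",
    "damage_received" and "time_alive"; `run_flight()` one with "damage_dealt".
    """
    fights = [run_fight() for _ in range(repeats)]
    flights = [run_flight() for _ in range(repeats)]
    return {
        "fight_damage_dealt": mean(f["damage_dealt"] for f in fights),
        "fight_damage_taken": -mean(f["damage_received"] for f in fights),
        "fight_time_alive": mean(f["time_alive"] for f in fights),
        "flight_damage_dealt": mean(f["damage_dealt"] for f in flights),
    }

# Fake games with fixed outcomes, only to show the call.
fight_results = iter([{"damage_dealt": d, "damage_received": 10, "time_alive": 0.5}
                      for d in (10, 20, 30, 40, 50)])
flight_results = iter([{"damage_dealt": 100}] * 5)
team_scores = evaluate_team(lambda: next(fight_results), lambda: next(flight_results))
```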

Page 22:

BOTS

Experimental trials are battled out against another NPC, henceforth called a BOT, because this would be way too boring for a human player.

These bots are armed with two clever strategies for the tasks.

Fight: Go right out swinging hard against the closest enemy. This works.

Flight: Run away backwards, so the bot can easily keep moving away from its closest attacker when attacked, because it can still see that attacker.

These tactics are pretty tough for the evolving NPCs to beat, so the bot starts at 0% speed (only able to turn), increasing to 40/80/100%.

Page 23:

GOALS

Per objective, the NPCs are measured on the previously outlined goals. The targets are as follows:

Fight
1. Maximise DMG dealt: 50
2. Minimise DMG taken: -20
3. Maximise time alive: 80% of the trial

Flight
1. Maximise DMG dealt: 100

These goals are numeric targets that the average NPC in the team should attain, so teamwork can help.

Once a network achieves all goals, the bot speed increases.
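A goal check along these lines could drive the speed increase. The targets and speed levels come from the slides. The dictionary keys and function names are invented, and treating every objective as "higher is better" (damage taken stored as a negative number, as the -20 target suggests) is an assumption.

```python
GOALS = {
    "fight_damage_dealt": 50,
    "fight_damage_taken": -20,   # negative damage, so a higher value is better (assumed)
    "fight_time_alive": 0.8,     # fraction of the trial
    "flight_damage_dealt": 100,
}
SPEEDS = [0, 40, 80, 100]        # bot speed in percent

def goals_met(scores):
    """True if every score reaches its target."""
    return all(scores[name] >= target for name, target in GOALS.items())

def next_speed(speed, scores):
    """Move the bot to the next speed level once all goals are met."""
    if goals_met(scores) and speed != SPEEDS[-1]:
        return SPEEDS[SPEEDS.index(speed) + 1]
    return speed

good = {"fight_damage_dealt": 55, "fight_damage_taken": -15,
        "fight_time_alive": 0.9, "flight_damage_dealt": 120}
too_much_damage = dict(good, fight_damage_taken=-30)
```

`next_speed(0, good)` moves the bot from 0% to 40%, while `next_speed(40, too_much_damage)` leaves it at 40% because one goal is missed.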

Page 24:

TWO CONDITIONS

1MODE versus ModeMutation

1MODE: Neural networks with a single output mode containing two nodes and no preference node (because there is just one mode).

ModeMutation: Starts as a single output mode but can gain more through mutation. A single-mode network starts with three output nodes (two policy nodes and one preference node).

Each condition is evaluated in 10 separate trials for 300 generations or until all goals are achieved at 100% bot speed, whichever comes first. The bot speed starts at 0% and goes up from there.
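The stopping rule for one trial could be sketched as below. `run_trial` and `step` are hypothetical names: `step(speed)` stands for evolving one generation against a bot at that speed and reporting whether all goals were met.

```python
def run_trial(step, max_generations=300, speeds=(0, 40, 80, 100)):
    """Run one trial and return (generations used, whether 100% speed was beaten)."""
    level = 0
    for generation in range(1, max_generations + 1):
        if step(speeds[level]):                 # all goals met at the current speed
            if level == len(speeds) - 1:
                return generation, True         # goals met at 100% speed: stop early
            level += 1                          # otherwise make the bot faster
    return max_generations, False               # generation budget used up

always_wins = run_trial(lambda speed: True)
never_wins = run_trial(lambda speed: False)
```

A population that meets every goal immediately still needs four generations, one per speed level.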

Page 25:

RESULTS

ModeMutation performs twice as well as 1MODE.

ModeMutation beats the bot on all fronts at 100% speed in 4 out of 10 trials, whereas 1MODE only manages 2.

ModeMutation is also more successful at finding good strategies for each task (Fight or Flight). 1MODE is successful at finding very good strategies on some tasks, but very bad ones on others.

Example of a good Fight strategy: The closest NPC moves away from the bot, so the bot cannot catch up, and the others follow. Then the closest NPC turns slightly to the left, causing the bot to turn to hit it. That NPC takes some damage, but the others catch up and mess up the bot real bad.

Example for Flight: Keep knocking the bot towards the center so it stays surrounded while getting its ass kicked.

ModeMutation found both of these strategies!

Page 26:

DISCUSSION

ModeMutation develops good strategies across the board, on all goals. These strategies aren’t necessarily clear from evaluations in general, for they are mostly averages.

1MODE proved more likely to develop strong strategies on one front, but it suffers for it on others.

SO: we find that ModeMutation is better at finding multi-modal strategies (which we were looking for), whereas 1MODE is more of a one-trick specialist.

This has an interesting implication for games: we don’t want NPCs that are crazy good on some occasions but run around cluelessly on others. This makes multi-modal behaviour much more interesting.

Page 27:

CONCLUSION

A mutation operator that adds new modes by adding nodes to the output layer of a neural network in neuroevolution is a promising way to develop multi-modal behaviour in NPCs.

Thanks for watching. Questions?