Riza Erdem, Jappie Klooster, Dirk Meulenbelt: EVOLVING MULTI-MODAL BEHAVIOR IN NPCS

Authors

Jacob Schrum, Risto Miikkulainen

Youtube channel: https://www.youtube.com/channel/UCCKhH1p0tj1frvcD70tEyDg

Times cited: 18

PAPER

NPCs

Multi-modal Behavior

Downsides

So, they want to propose a method that encourages the development of this multi-modal behavior.

GOAL

Neuroevolution?

Why neuroevolution?

Improving the neuroevolution method

NEUROEVOLUTION

Player, NPC team, health points, and a bat

Fight game
Flight game
Example footage: http://nn.cs.utexas.edu/?multimodal09

What would be beneficial behavior (four objectives)?
Fight: Maximize damage dealt
Fight: Minimize damage received
Fight: Maximize time alive
Flight: Maximize damage dealt

What’s useful about this game?

MOEAs

FIGHT OR FLIGHT GAME

Solutions consist of neural networks

FS-NEAT (in this paper):
Copy existing solutions
Mutate the copies
Do competition
Apply selection
Rinse and repeat
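The loop on this slide can be sketched roughly as below. This is a generic evolutionary loop, not FS-NEAT itself: the genome is a plain list of weights, and `toy_mutate`, `toy_evaluate`, and `toy_select` are hypothetical stand-ins for network mutation, game evaluation, and the multi-objective selection described later.

```python
import random

def evolve(population, mutate, evaluate, select, generations, rng):
    """Generic loop: copy, mutate, compete, select, repeat."""
    for _ in range(generations):
        # Copy existing solutions and mutate the copies.
        children = [mutate(list(parent), rng) for parent in population]
        # Competition: score parents and children together.
        candidates = population + children
        scored = [(evaluate(c), c) for c in candidates]
        # Selection keeps the population size constant.
        population = select(scored, len(population))
    return population

# Toy stand-ins (hypothetical): a genome is a list of weights,
# and fitness is higher the closer the weights are to zero.
def toy_mutate(genome, rng):
    i = rng.randrange(len(genome))
    genome[i] += rng.gauss(0.0, 0.1)
    return genome

def toy_evaluate(genome):
    return -sum(w * w for w in genome)

def toy_select(scored, size):
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [genome for _, genome in ranked[:size]]

rng = random.Random(0)
start = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(10)]
final = evolve(start, toy_mutate, toy_evaluate, toy_select, 20, rng)
```

Because parents compete alongside their mutated copies, the best score in this toy version never gets worse from one generation to the next.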

APPLICATION OF NEUROEVOLUTION

Neural network

Input nodes

Output nodes

NEURAL NETWORKS

FS-NEAT

Input nodes

Output nodes

ADD NODE MUTATION

Input nodes

Output nodes

ADD LINK MUTATION

Input nodes

Output nodes

UNIQUE: MERGE MUTATION

Input nodes

Output nodes

UNIQUE: TWO TYPES OF OUTPUT NODES

Neural network

Input nodes

Output nodes

Policy nodes

Preference node

UNIQUE: ADD OUTPUT NODES MUTATION!

Input nodes

Output nodes

MULTI-OBJECTIVE EVOLUTION (SELECTION)

Domination

Pareto front

NSGA-II

Next generation

PARETO FRONT DOESN’T FIT

DROP OBJECTIVE / PARETO DIMENSION

NEW PARETO FRONT

TRY AGAIN

Next generation

PARETO FRONT DOES FIT!

PRIORITIES FOR GOAL DROPPING

Fight-DamageDealt
Flight-DamageDealt
Fight-DamageReceived
Fight-TimeAlive

In ascending order

Goal disabling

SELECTION OVERVIEW

Non-dominated solutions are inserted into the next generation and removed from consideration.

Repeat until the next generation is full.

Often a front does not fit into the remaining slots, so a cutoff is needed. NSGA-II uses “crowding distance” for this; this paper instead re-sorts the front while dropping objectives.

Goals are sometimes temporarily disabled when all members of the population have reached them, and re-enabled when the population starts performing poorly on them.

When the final dimension is under consideration and scores are equal, members are selected at random.
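The selection steps above can be sketched as follows. This is one possible reading of the slides, not the authors' code: all objectives are treated as maximized (damage received is stored as a negative number), a front that does not fit becomes the only candidate pool while one objective is dropped, and `drop_order` is a hypothetical parameter listing objective positions in the order they are dropped. Goal disabling is left out.

```python
import random

def dominates(a, b):
    """True if a is at least as good as b on every objective
    and strictly better on at least one (all maximized)."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))

def pareto_front(scores, indices, objectives):
    """Indices that are non-dominated on the given objectives."""
    def view(i):
        return [scores[i][k] for k in objectives]
    return [i for i in indices
            if not any(dominates(view(j), view(i))
                       for j in indices if j != i)]

def select(scores, size, drop_order, rng):
    """Pick `size` indices front by front, dropping objectives
    whenever a front does not fit into the remaining slots."""
    size = min(size, len(scores))
    chosen, remaining = [], list(range(len(scores)))
    objectives = list(range(len(scores[0])))
    to_drop = list(drop_order)
    while len(chosen) < size:
        front = pareto_front(scores, remaining, objectives)
        room = size - len(chosen)
        if len(front) <= room:
            # Front fits: insert it and remove it from consideration.
            chosen += front
            remaining = [i for i in remaining if i not in front]
        elif len(objectives) > 1:
            # Front does not fit: re-sort only this front
            # on fewer objectives and try again.
            remaining = front
            objectives.remove(to_drop.pop(0))
        else:
            # Final dimension with equal scores: pick at random.
            chosen += rng.sample(front, room)
    return chosen

scores = [(3, 1), (1, 3), (2, 2), (0, 0)]
picked = select(scores, 2, drop_order=[0], rng=random.Random(0))
# picked == [1, 2]: the first front has three members, so objective 0
# is dropped and the front is re-sorted on objective 1 alone.
```

With the four objectives of the game, `drop_order` would need three entries, since the last remaining objective is never dropped.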

LET’S SEE HOW IT ALL WORKS OUT

Experimental approach parameters:

Evolved in homogeneous teams: predictable teammates

A fitness score for the group as a whole encourages teamwork, such as a sacrifice tactic in which one NPC dies but the team score increases.

Neuroevolution combined with NSGA-II as outlined before is used to evolve the NPCs on the goals.

A single parent population of 50 neural networks

Evaluation: 5 Fights and 5 Flights, because evaluation on just one is noisy. The scores are averaged. This number (5/5) is a decent tradeoff between noise and computation time.
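A minimal sketch of the averaging step, assuming each Fight run yields a tuple (damage dealt, damage received, time alive) and each Flight run yields a single damage-dealt value. The function name and the data layout are assumptions for illustration; the example uses two runs per task for brevity, where the slide uses five.

```python
from statistics import mean

def average_scores(fight_runs, flight_runs):
    """Average each objective over repeated evaluations.

    fight_runs: list of (damage_dealt, damage_received, time_alive)
    flight_runs: list of damage_dealt values
    Returns the four objective scores in a fixed order.
    """
    dealt, received, alive = (mean(col) for col in zip(*fight_runs))
    return (dealt, received, alive, mean(flight_runs))

example = average_scores([(40, -30, 0.5), (60, -10, 1.0)], [90, 110])
# example == (50, -20, 0.75, 100)
```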

BOTS

Experimental trials are battled out against another NPC, henceforth called a BOT, because these trials would be way too boring for real people.

These bots are armed with two clever strategies for the tasks

Fight: Go right out swinging hard against the closest enemy. This works.

Flight: Run away backwards, so the bot keeps its closest attacker in view and can easily keep moving away from it when attacked.

These tactics are pretty tough for the computer to beat, so the bot starts at 0% speed (only able to turn), increasing to 40%, 80%, and 100%.

GOALS

Per objective, the NPCs are measured on the previously outlined goals. The targets are as follows:

Fight
1. Maximise DMG dealt: 50
2. Minimise DMG taken: -20
3. Maximise time alive: 80% of the trial

Flight
1. Maximise DMG dealt: 100

These goals are represented as numbers, and the values should be attained on average per NPC, so teamwork can help.

Once a network achieves all goals, the bot speed increases.
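The targets and the bot speed schedule can be sketched as a small lookup plus a check. The dictionary keys and function names are hypothetical; the sketch assumes every score is maximized, so damage taken is a negative number and a score of -20 or higher meets that goal.

```python
# Target values from the slide; key names are illustrative only.
TARGETS = {
    "fight_damage_dealt": 50,
    "fight_damage_received": -20,
    "fight_time_alive": 0.8,      # fraction of the trial
    "flight_damage_dealt": 100,
}

# Bot speed schedule: 0%, 40%, 80%, 100%.
BOT_SPEEDS = [0.0, 0.4, 0.8, 1.0]

def goals_met(scores):
    """True when every score reaches its target."""
    return all(scores[name] >= target for name, target in TARGETS.items())

def next_speed_index(index, scores):
    """Move to the next bot speed once all goals are met."""
    if goals_met(scores) and index < len(BOT_SPEEDS) - 1:
        return index + 1
    return index

good = {"fight_damage_dealt": 55, "fight_damage_received": -15,
        "fight_time_alive": 0.9, "flight_damage_dealt": 100}
weak = dict(good, fight_damage_received=-35)
```

Here `next_speed_index(0, good)` moves the bot from 0% to 40% speed, while `next_speed_index(0, weak)` leaves it at 0% because too much damage was taken.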

TWO CONDITIONS

1MODE versus ModeMutation

1MODE: Neural networks with a single output mode containing two nodes, and no preference node (because there is just one mode).

ModeMutation: Networks start with a single output mode but can gain more through mutation. Each starts with three output nodes for a single-mode network (two policy nodes and one preference node).

Each condition is evaluated in 10 separate trials for 300 generations or until all goals are achieved at 100% bot speed, whichever comes first. The bot speed starts at 0% and goes up from there.
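To make the role of the policy and preference nodes concrete, here is a small sketch of how a multi-mode output layer could be read. It assumes the outputs are laid out mode by mode as two policy values followed by one preference value, and that the mode with the highest preference value controls the NPC; both the layout and the function names are assumptions, not details given on the slides.

```python
def split_modes(outputs, policy_size=2):
    """Group a flat output vector into (policy, preference) pairs."""
    width = policy_size + 1
    if len(outputs) % width != 0:
        raise ValueError("output count must be a multiple of the mode width")
    return [(outputs[i:i + policy_size], outputs[i + policy_size])
            for i in range(0, len(outputs), width)]

def choose_action(outputs, policy_size=2):
    """Return the policy outputs of the mode with the highest preference."""
    modes = split_modes(outputs, policy_size)
    policy, _ = max(modes, key=lambda mode: mode[1])
    return policy

# Two modes: the second has the higher preference (0.7 > 0.2).
action = choose_action([0.1, 0.9, 0.2, -0.5, 0.3, 0.7])
# action == [-0.5, 0.3]
```

Under this layout, a mutation that adds a mode would append three more output values, and a single-mode ModeMutation network simply always uses its only pair of policy nodes.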

RESULTS

ModeMutation performs twice as well as 1MODE.

ModeMutation beats the bot on all fronts at 100% speed in 4 out of 10 trials, whereas 1MODE only manages 2.

ModeMutation is also more successful at finding good strategies for each task (Fight or Flight). 1MODE finds very good strategies on some tasks but very bad ones on others.

Example of a good Fight strategy: The closest NPC moves away from the bot, not letting it catch up, while the others follow. Then the closest NPC turns slightly to the left, causing the bot to turn to hit it. Some damage is taken, but the others catch up and hit the bot hard.

Example for Flight: Keep knocking the bot towards the center so it stays surrounded while getting beaten up.

ModeMutation found both of these strategies!

DISCUSSION

ModeMutation develops good strategies across the board, on all goals. These strategies aren’t necessarily clear from the evaluations in general, because those are mostly averages.

1MODE proved more likely to develop strong strategies on one front but will suffer for it on others.

SO: we find that ModeMutation is better at finding multi-modal strategies (which we were looking for), whereas 1MODE is more of a one-trick specialist.

This has an interesting implication for games: we don’t want NPCs that are extremely good on some occasions but act cluelessly on others. This makes multi-modal behaviour much more interesting.

CONCLUSION

A mutation operator that adds new modes by adding nodes to the output layer of a neural network in neuroevolution is a promising way to develop multi-modal behaviour in NPCs.

Thanks for watching. Questions?
