Riza Erdem, Jappie Klooster, Dirk Meulenbelt: EVOLVING MULTI-MODAL BEHAVIOR IN NPCS
Authors
Jacob Schrum, Risto Miikkulainen
Youtube channel: https://www.youtube.com/channel/UCCKhH1p0tj1frvcD70tEyDg
Times cited: 18
PAPER
NPCs
Multi-modal Behavior
Downsides
So they want to propose a method that encourages the development of this multi-modal behavior.
GOAL
Neuroevolution?
Why neuroevolution?
Improving the neuroevolution method
NEUROEVOLUTION
Player, NPC team, health points, and a bat
Fight game
Flight game
Example footage: http://nn.cs.utexas.edu/?multimodal09
What would be beneficial behavior (four objectives)?
Fight: Maximize damage dealt
Fight: Minimize damage received
Fight: Maximize time alive
Flight: Maximize damage dealt
What’s useful about this game?
MOEAs
FIGHT OR FLIGHT GAME
Solutions consist of neural networks
FS-NEAT (in this paper)
Copy existing solutions
Mutate the copies
Do competition
Apply selection
Rinse and repeat
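The copy / mutate / compete / select loop can be sketched roughly as below. This is only a toy illustration, not FS-NEAT itself: the genome is a plain list of weights, and `toy_mutate`, `toy_evaluate` and `toy_select` are hypothetical stand-ins for the real mutation operators, the Fight or Flight evaluation and the NSGA-II selection.

```python
import random


def evolve(population, mutate, evaluate, select, generations):
    """Copy, mutate, compete, select; rinse and repeat."""
    for _ in range(generations):
        # Copy existing solutions and mutate the copies.
        copies = [mutate(list(genome)) for genome in population]
        # Parents and mutated copies compete for a place in the next generation.
        candidates = population + copies
        population = select(candidates, evaluate, len(population))
    return population


_rng = random.Random(0)  # fixed seed so the toy run is repeatable


def toy_mutate(genome, scale=0.1):
    # Hypothetical mutation: add Gaussian noise to every weight.
    return [w + _rng.gauss(0.0, scale) for w in genome]


def toy_evaluate(genome):
    # Hypothetical single objective: weights close to zero score higher.
    return -sum(w * w for w in genome)


def toy_select(candidates, evaluate, size):
    # Keep the best `size` candidates (plain truncation, not NSGA-II).
    return sorted(candidates, key=evaluate, reverse=True)[:size]


start = [[1.0, -1.0] for _ in range(4)]
final = evolve(start, toy_mutate, toy_evaluate, toy_select, generations=20)
```

Because the parents stay in the candidate pool, the best score in this toy version never gets worse from one generation to the next.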
APPLICATION OF NEUROEVOLUTION
Neural network
Input nodes
Output nodes
NEURAL NETWORKS
FS-NEAT
ADD NODE MUTATION
ADD LINK MUTATION
UNIQUE: MERGE MUTATION
UNIQUE: TWO TYPES OF OUTPUT NODES
Neural network
Input nodes
Output nodes
Policy nodes
Preference node
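How the two kinds of output nodes could work together is sketched below. The slides only name them; the rule that the mode with the highest preference output decides the action is taken from our reading of the paper and should be treated as an assumption here, as is the flat layout of the output list. `choose_action` and `policy_size` are hypothetical names.

```python
def choose_action(outputs, policy_size=2):
    """Pick the policy outputs of the mode with the highest preference.

    `outputs` is assumed to be a flat list grouped per mode as
    [policy_1, ..., policy_n, preference].
    """
    mode_width = policy_size + 1
    if not outputs or len(outputs) % mode_width != 0:
        raise ValueError("outputs must hold whole modes")
    modes = [outputs[i:i + mode_width] for i in range(0, len(outputs), mode_width)]
    # The last value of each mode is its preference output.
    best = max(range(len(modes)), key=lambda m: modes[m][-1])
    return best, modes[best][:policy_size]


# Two modes: preferences 0.3 and 0.8, so the second mode controls the NPC.
mode, action = choose_action([0.1, 0.2, 0.3, 0.9, -0.5, 0.8])
```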
UNIQUE: ADD OUTPUT NODES MUTATION!
MULTI-OBJECTIVE EVOLUTION (SELECTION)
Domination
Pareto front
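Domination and the Pareto front can be illustrated with a small sketch. It assumes every objective is to be maximized and that a solution is just a tuple of scores; `dominates` and `pareto_front` are names chosen for this example.

```python
def dominates(a, b):
    """a dominates b: at least as good everywhere, strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(scores):
    """Solutions that no other solution dominates."""
    return [s for s in scores if not any(dominates(o, s) for o in scores)]


scores = [(3, 1), (1, 3), (2, 2), (1, 1), (0, 3)]
front = pareto_front(scores)
```

Here (1, 1) is dominated by (2, 2) and (0, 3) by (1, 3), so neither is on the front.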
NSGA-II
PARETO FRONT DOESN’T FIT
DROP OBJECTIVE / PARETO DIMENSION
NEW PARETO FRONT
TRY AGAIN
Next generation
PARETO FRONT DOES FIT!
PRIORITIES FOR GOAL DROPPING
Fight-DamageDealt
Flight-DamageDealt
Fight-DamageReceived
Fight-TimeAlive
In ascending order
Goal disabling
SELECTION OVERVIEW
Non-dominated solutions are inserted into the next generation and removed from consideration.
Repeat until the next generation is full.
A cutoff is often reached, where the current front no longer fits in the remaining slots. NSGA-II uses “crowding distance” there; this paper instead re-sorts while dropping objectives.
Goals are sometimes temporarily disabled when all members of the population have reached them, and re-enabled when the population starts performing poorly on them.
When only the final dimension is under consideration and members achieve equal scores, members are selected at random.
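One possible reading of this selection scheme is sketched below. It assumes all objectives are maximized, that only the overflowing front is re-sorted after an objective is dropped, and that `drop_order` lists objective indices in the order they are dropped (it must name all but one objective). Goal disabling is left out. The function names and the data layout are hypothetical, not taken from the paper.

```python
import random


def dominates(a, b):
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def select(scores, size, drop_order, rng):
    """Return the indices of the `size` individuals kept for the next generation."""
    active = list(range(len(scores[0])))  # objectives still under consideration
    drops = list(drop_order)
    pool = list(range(len(scores)))
    chosen = []

    def view(i):
        # Scores of individual i on the active objectives only.
        return [scores[i][k] for k in active]

    while len(chosen) < size and pool:
        front = [i for i in pool
                 if not any(dominates(view(j), view(i)) for j in pool)]
        room = size - len(chosen)
        if len(front) <= room:
            # The front fits: insert it and remove it from consideration.
            chosen.extend(front)
            pool = [i for i in pool if i not in front]
        elif len(active) > 1:
            # The front does not fit: drop an objective and re-sort the front.
            active.remove(drops.pop(0))
            pool = front
        else:
            # Final dimension with equal scores: pick at random.
            chosen.extend(rng.sample(front, room))
    return chosen


scores = [(3, 1), (1, 3), (2, 2), (0, 0)]
kept = select(scores, size=2, drop_order=[1], rng=random.Random(0))
```

In this example the first front has three members but only two slots are free, so the second objective is dropped and the two individuals with the best first objective are kept.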
LET’S SEE HOW IT ALL WORKS OUT
Experimental approach parameters:
Evolved in homogeneous teams: predictable teammates
A fitness score for the group as a whole encourages teamwork, such as a suicide-bomber tactic that sacrifices one NPC but increases the team score.
Neuroevolution combined with NSGA-II, as outlined before, is used to evolve the NPCs on the goals.
A single parent population of 50 neural networks
Evaluation: 5 Fights and 5 Flights, because evaluation on a single trial is noisy. The scores are averaged. This number (5/5) is a decent tradeoff between reliable scores and computation time.
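Averaging the noisy trial scores could look like the sketch below. The dictionary layout, the objective name and the example numbers are made up for illustration; only the idea of taking the mean over 5 trials per task comes from the slide.

```python
from statistics import mean


def average_scores(trial_scores):
    """Per-objective mean over a list of trial results (dicts of scores)."""
    return {name: mean(trial[name] for trial in trial_scores)
            for name in trial_scores[0]}


# Five hypothetical Fight trials for one team.
fight_trials = [{"fight_damage_dealt": d} for d in (40.0, 60.0, 50.0, 55.0, 45.0)]
fight_average = average_scores(fight_trials)
```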
BOTS
Experimental trials are fought against another NPC, henceforth called a bot, because playing them by hand would be far too tedious for human players.
These bots are armed with two clever strategies for the tasks:
Fight: Go right out swinging hard against the closest enemy. This works.
Flight: Run away backwards, so the bot keeps its closest attacker in view and can easily keep moving away from it when attacked.
These tactics are pretty tough for the evolving NPCs to beat, so the bot starts at 0% speed (only able to turn), increasing to 40%, 80% and 100%.
GOALS
Per objective the NPCs are measured on the previously outlined goals. The targets are as follows:
Fight
1. Maximise DMG dealt: 50
2. Minimise DMG taken: -20
3. Maximise time alive: 80% of the trial
Flight
1. Maximise DMG dealt: 100
These goals are numeric targets, and the values should be attained by the NPCs on average, so teamwork can help.
Once a network achieves all goals, the bot speed increases.
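The goal check and the speed increase can be sketched as follows. The target values and the speed steps come from the slides; the objective names, the choice to store damage taken as a negative score (so that higher is better everywhere) and the choice to store time alive as a fraction are assumptions made for this example.

```python
# Targets from the slide; every score is treated as "higher is better".
GOALS = {
    "fight_damage_dealt": 50.0,
    "fight_damage_taken": -20.0,  # assumed: damage taken stored as a negative score
    "fight_time_alive": 0.8,      # assumed: fraction of the trial survived
    "flight_damage_dealt": 100.0,
}
SPEEDS = [0.0, 0.4, 0.8, 1.0]  # bot speed as a fraction of full speed


def goals_met(avg_scores):
    return all(avg_scores[name] >= target for name, target in GOALS.items())


def next_speed_index(index, population_scores):
    """Move to the next bot speed once some network achieves all goals."""
    if index < len(SPEEDS) - 1 and any(goals_met(s) for s in population_scores):
        return index + 1
    return index


good = {"fight_damage_dealt": 55.0, "fight_damage_taken": -10.0,
        "fight_time_alive": 0.9, "flight_damage_dealt": 120.0}
bad = dict(good, fight_damage_taken=-25.0)
```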
TWO CONDITIONS
1MODE versus ModeMutation
1MODE: Neural networks with a single output mode containing two nodes and no preference node (there is just one mode).
ModeMutation: Starts with a single output mode but can gain more through mutation. The initial single-mode network has three output nodes (two policy nodes and one preference node).
Each condition is evaluated in 10 separate trials for 300 generations or until all goals are achieved at 100% bot speed, whichever comes first. The bot speed starts at 0% and goes up from there.
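What a mode mutation does to the output layer can be sketched very roughly as below. The slides do not say how the new nodes are wired, so the fully connected layout and the random weights in [-1, 1] are purely assumptions for illustration, and `add_mode` is a hypothetical name.

```python
import random

POLICY_NODES = 2  # two policy nodes per mode, as in the slides


def add_mode(output_weights, n_inputs, rng):
    """Append one new output mode (two policy nodes and one preference node).

    `output_weights` holds one row of input weights per output node.
    Random initial weights are an assumption, not the paper's rule.
    """
    new_rows = [[rng.uniform(-1.0, 1.0) for _ in range(n_inputs)]
                for _ in range(POLICY_NODES + 1)]
    return output_weights + new_rows


rng = random.Random(0)
one_mode = [[0.0] * 4 for _ in range(POLICY_NODES + 1)]  # initial ModeMutation network
two_modes = add_mode(one_mode, n_inputs=4, rng=rng)
```

The existing rows are left untouched, so the behavior of the old mode is preserved and the new mode only takes over when its preference node wins.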
RESULTS
ModeMutation performs twice as well as 1MODE.
ModeMutation beats the bot on all fronts at 100% speed in 4 out of 10 trials, whereas 1MODE only manages 2.
ModeMutation is also more successful at finding good strategies for each task (Fight or Flight). 1MODE finds very good strategies on some tasks, but very bad ones on others.
Example of a good Fight strategy: The closest NPC moves away from the bot so that the bot cannot catch up, and the others follow. Then the closest NPC turns slightly to the left, causing the bot to turn to hit it. That NPC takes some damage, but the others catch up and hit the bot hard.
Example for Flight: Keep knocking the bot towards the center so it stays surrounded while taking a beating.
ModeMutation found both of these strategies!
DISCUSSION
ModeMutation develops good strategies across the board, on all goals. These strategies aren’t necessarily clear from the evaluation scores, because those are mostly averages.
1MODE proved more likely to develop strong strategies on one front but will suffer for it on others.
So: we find that ModeMutation is better at finding multi-modal strategies (which is what we were looking for), whereas 1MODE is more of a narrow specialist.
This has an interesting implication for games: we don’t want NPCs that are extremely good on some occasions but flounder on others. This makes multi-modal behaviour much more interesting.
CONCLUSION
A mutation operator that adds new modes by adding nodes to the output layer of a neural network in neuroevolution is a promising way to develop multi-modal behaviour in NPCs.
Thanks for watching. Questions?