BotPrize 2014 Results. Human-Like Bots Competition at IEEE CIG

CIG 2014 - Human-Like Bots Competition. IEEE Computational Intelligence in Games. Dortmund, Germany. August 2014. Organization: Manuel G. Bedia, Juan Peralta, Joan Marc, Philip Hingston, Raúl Arrabales.

description

Human-Like Bots Competition (BotPrize 2014), presented by Raúl Arrabales at IEEE CIG - Computational Intelligence and Artificial Intelligence in Games. The BotPrize is a Turing Test for First-Person Shooter video game bots.

Transcript of BotPrize 2014 Results. Human-Like Bots Competition at IEEE CIG

Page 1: BotPrize 2014 Results. Human-Like Bots Competition at IEEE CIG

CIG 2014 - Human-Like Bots Competition
IEEE Computational Intelligence in Games. Dortmund, Germany. August 2014

Organization: Manuel G. Bedia, Juan Peralta, Joan Marc, Philip Hingston, Raúl Arrabales

Page 2: BotPrize 2014 Results. Human-Like Bots Competition at IEEE CIG

Organization / Acknowledgements

Page 3: BotPrize 2014 Results. Human-Like Bots Competition at IEEE CIG

Players (humans and bots)

PLAYER      TYPE   TEAM              MEMBERS                                                              AFFILIATION                              COUNTRY
BotTracker  BOT    TETRIIS           Hunjoo Lee, Jee-Hyong Lee                                            ETRI, Sungkyunkwan University            SOUTH KOREA
MirrorBot   BOT    IHSEV             Mihai Polceanu                                                       ENIB CERV Centre de Réalité Virtuelle    FRANCE
NizorBot    BOT    UMAG-BOT          José L. Jiménez López, Antonio J. Fernández-Leiva, Antonio M. Mora   Universidad de Málaga                    SPAIN
OvGUBot     BOT    OvGUBot           Xenija Neufeld, Sanaz Mostaghim                                      Otto von Guericke University, Magdeburg  GERMANY
ADANN       BOT    CVC               Juan Peralta Donate, Joan Marc Llargués A.                           CVC. UAB                                 SPAIN
CCBot       BOT    Conscious-Robots  Jorge Muñoz, Raúl Arrabales                                          Comaware                                 SPAIN
Player      HUMAN  Judge             -                                                                    -                                        -
tmchojo     HUMAN  Judge             -                                                                    -                                        -
Juan_CVC    HUMAN  Judge             -                                                                    -                                        -
Xenija      HUMAN  Judge             -                                                                    -                                        -

Page 4: BotPrize 2014 Results. Human-Like Bots Competition at IEEE CIG
Page 5: BotPrize 2014 Results. Human-Like Bots Competition at IEEE CIG

Original BotPrize Testing Protocol (FPA – First-Person Assessment)

Diagram labels: UT 2004 Server; Artificial Bots; Human Judges; Real-time Online Anonymized interaction.

Page 6: BotPrize 2014 Results. Human-Like Bots Competition at IEEE CIG

BotPrize 2014 Edition: We add TPA

* FPA – First-Person Assessment
* TPA – Third-Person Assessment

Diagram labels: UT 2004 Server; Artificial Bots; FPA Human Judges; Generation of Anonymized TPA Video Clips featuring human and bot players; TPA Judges (Crowdsourcing platform); Third-Person Crowdsourcing Judging.

Page 7: BotPrize 2014 Results. Human-Like Bots Competition at IEEE CIG

BotPrize 2014 Edition: Humanness++ (H)

$$H = (FPA \times FPWF) + (TPA \times TPWF)$$

FPWF (First-Person Weighting Factor) = 0.5. TPWF (Third-Person Weighting Factor) = 0.5.

Diagram labels: UT 2004 Server; Artificial Bots; FPA Human Judges.

$$H = \frac{FPA}{2} + \frac{TPA}{2}$$
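As a minimal sketch of the Humanness++ combination, the R function below applies the two weighting factors to a pair of scores. The function name `humanness_pp` is hypothetical; the input values are MirrorBot's FPA and TPA as reported later in this deck.

```r
# Humanness++ as a weighted sum of the first-person and third-person scores.
# The 0.5 defaults are the FPWF and TPWF values stated on this slide.
humanness_pp <- function(fpa, tpa, fpwf = 0.5, tpwf = 0.5) {
  (fpa * fpwf) + (tpa * tpwf)
}

# MirrorBot's FPA and TPA, taken from the results table later in the deck.
h <- humanness_pp(fpa = 0.20164771, tpa = 0.7333333)
print(h)  # approximately 0.4675
```

Keeping the weights as arguments makes it easy to see how the score would shift if the two assessments were not weighted equally.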

Page 8: BotPrize 2014 Results. Human-Like Bots Competition at IEEE CIG

Humanness scores based on SDT

Signal Detection Theory

Judge SDT Matrix     Vote “Human”   Vote “Bot”
Player is a Human    Hit            False Alarm
Player is a Bot      Miss           Hit

Tanner Jr., Wilson P.; Swets, John A. (November 1954). "A decision-making theory of visual detection". Psychological Review. 61 (6): 401–409.

Page 9: BotPrize 2014 Results. Human-Like Bots Competition at IEEE CIG

Humanness scores based on SDT

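Judge Reliability (JR)

A measure of how good a judge is at telling apart humans and bots

$$JR_j = \frac{Hit_j}{N_j}$$

(The equation is garbled in the source. This form assumes JR is the fraction of judge j's votes that are hits, consistent with the "44% Correct Guesses" reported for the judge whose JR is 0.438.)

Judge SDT Matrix     Vote “Human”   Vote “Bot”
Player is a Human    Hit            False Alarm
Player is a Bot      Miss           Hit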
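The tally behind a judge's SDT matrix can be sketched in R as follows. The votes are hypothetical, not competition data, and the sketch assumes JR is the fraction of votes that are hits, which is the reading suggested by the "44% Correct Guesses" figure reported later for a JR of 0.438.

```r
# Hypothetical votes cast by one judge (illustration only).
votes <- data.frame(
  actual = c("Human", "Bot", "Bot", "Human", "Bot", "Bot"),
  vote   = c("Human", "Human", "Bot", "Bot", "Bot", "Human")
)

# Cross-tabulate the true player type (rows) against the judge's vote (columns).
sdt <- table(actual = votes$actual, vote = votes$vote)
print(sdt)

# Cell names follow the slide's matrix.
hits         <- sdt["Human", "Human"] + sdt["Bot", "Bot"]  # correct votes
false_alarms <- sdt["Human", "Bot"]                        # human voted "Bot"
misses       <- sdt["Bot", "Human"]                        # bot voted "Human"

# Assumed definition: share of all votes that are hits.
jr <- hits / sum(sdt)
print(jr)
```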

Page 10: BotPrize 2014 Results. Human-Like Bots Competition at IEEE CIG

Judge Reliability can be used to adjust Humanness Scores

Judge Relative Reliability (JRR)

A measure of how good a judge is in relation to the other judges

$$JRR_j = \frac{JR_j}{AvgJR}, \qquad AvgJR = \frac{\sum_{j=1}^{J} JR_j}{J}$$

Page 11: BotPrize 2014 Results. Human-Like Bots Competition at IEEE CIG

Judge Reliability can be used to adjust Humanness Scores

Judge Relative Reliability (JRR)

A measure of how good a judge is in relation with other judges

judges     JR          JRR        Weight
Player     0.28967254  1.2574682  0.31436704
tmchojo    0.43801653  1.9014292  0.47535731
Juan_CVC   0.16129032  0.7001611  0.17504027
Xenija     0.03246753  0.1409415  0.03523538

Page 12: BotPrize 2014 Results. Human-Like Bots Competition at IEEE CIG

Judge Reliability can be used to adjust Humanness Scores

Judge Relative Reliability (JRR)

“tmchojo” is the best FPA judge: 44% correct guesses

Page 13: BotPrize 2014 Results. Human-Like Bots Competition at IEEE CIG

Bots Judging Reliability

“BotTracker” is the best bot at telling apart bots and humans (32%)

Page 14: BotPrize 2014 Results. Human-Like Bots Competition at IEEE CIG

Humans & Bots Judging Reliability

[Chart: judging reliability of human (H) and bot (B) players. Bar labels in order: B, H, H, H, H, B, B, B, B, B]

Page 15: BotPrize 2014 Results. Human-Like Bots Competition at IEEE CIG

Humans & Bots Judging Reliability

Page 16: BotPrize 2014 Results. Human-Like Bots Competition at IEEE CIG

Judge Reliability can be used to adjust Humanness Scores

JRmeasures["Weight"] <- JRmeasures$JRR / nrow(JRmeasures)
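As a sketch of how the weight line above fits into the full calculation, the R code below rebuilds the judge table from the four JR values reported in this deck and derives JRR and Weight from them. The column name `JR` and the variable `avg_jr` are naming assumptions.

```r
# JR values for the four human judges, as reported in the JRR table.
JRmeasures <- data.frame(
  judges = c("Player", "tmchojo", "Juan_CVC", "Xenija"),
  JR     = c(0.28967254, 0.43801653, 0.16129032, 0.03246753)
)

# Average reliability across judges, then each judge's relative reliability.
avg_jr <- mean(JRmeasures$JR)
JRmeasures$JRR <- JRmeasures$JR / avg_jr

# The line from the slide: dividing JRR by the number of judges.
JRmeasures["Weight"] <- JRmeasures$JRR / nrow(JRmeasures)

print(JRmeasures)
```

Because JRR is JR divided by the mean, dividing again by the number of judges gives each judge's share of the total JR, so the weights sum to 1.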

Page 17: BotPrize 2014 Results. Human-Like Bots Competition at IEEE CIG

Calculating FPA (First-Person Assessment)

Weighted First-Person Humanness Ratio

$$FPA_i = \frac{\sum_{j=1}^{n} \left( \mathit{Weight}_j \times \mathit{Humanness}_{i,j} \right)}{J}$$

$$\mathit{Humanness}_{i,j} = \frac{\mathit{Miss}_{i,j}}{N_{i,j}}$$

Humanness of player i according to Judge j

Judge j SDT Matrix   Voted “Human”   Voted “Bot”
Player is a Human    Hit             False Alarm
Player is a Bot      Miss            Hit

Sample proportion is an unbiased estimator of p in the population.
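A minimal R sketch of the weighted ratio, reading the slide's formula as the sum over judges of Weight_j times Humanness_{i,j}, divided by the number of judges J, with Humanness_{i,j} = Miss_{i,j} / N_{i,j}. The weights are the ones from the JRR table; the miss and vote counts are hypothetical, so the result does not correspond to any bot in the competition.

```r
# Judge weights from the Judge Relative Reliability table.
weight <- c(Player = 0.31436704, tmchojo = 0.47535731,
            Juan_CVC = 0.17504027, Xenija = 0.03523538)

# Hypothetical tallies for one bot i (illustration only):
# miss = times each judge voted "Human" for the bot,
# n    = times each judge voted on the bot at all.
miss <- c(Player = 6, tmchojo = 3, Juan_CVC = 5, Xenija = 4)
n    <- c(Player = 10, tmchojo = 10, Juan_CVC = 10, Xenija = 8)

# Per-judge humanness: the sample proportion of "Human" votes.
humanness <- miss / n

# Weighted sum over judges, divided by the number of judges
# (taken here to be the J in the slide's denominator).
fpa <- sum(weight * humanness) / length(weight)
print(fpa)
```

A judge with a higher weight, such as tmchojo, pulls the score toward their own verdict, which is the intended effect of the reliability adjustment.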

Page 18: BotPrize 2014 Results. Human-Like Bots Competition at IEEE CIG

Calculating FPA (First-Person Assessment)

Weighted First-Person Humanness Ratio

BotName     Humanness  FPA
MirrorBot   0.4996406  0.20164771
BotTracker  0.4231043  0.20070203
OvGUBot     0.3164826  0.10545765
NizorBot    0.2980527  0.11821633
ADANN       0.2432864  0.08351664
CCBot       0.1685606  0.06214746

BotName     Humanness  FPA
Player      0.5417464  0.1932813
tmchojo     0.5177169  0.1775752
Xenija      0.3847691  0.1713976
Juan_CVC    0.3172348  0.1237229

Page 19: BotPrize 2014 Results. Human-Like Bots Competition at IEEE CIG

Calculating FPA (First-Person Assessment)

Weighted First-Person Humanness Ratio

Page 20: BotPrize 2014 Results. Human-Like Bots Competition at IEEE CIG

Calculating FPA (First-Person Assessment)

Weighted First-Person Humanness Ratio

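[Chart: weighted first-person humanness ratio of human (H) and bot (B) players. Bar labels in order: H, H, H, H, B, B, B, B, B, B]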

Page 21: BotPrize 2014 Results. Human-Like Bots Competition at IEEE CIG

Calculating FPA (First-Person Assessment)

Weighted First-Person Humanness Ratio

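[Chart: weighted first-person humanness ratio of human (H) and bot (B) players. Bar labels in order: H, H, H, H, B, B, B, B, B, B]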

Page 22: BotPrize 2014 Results. Human-Like Bots Competition at IEEE CIG

BotPrize 2014 Edition: We add TPA

* TPA – Third-Person Assessment

Diagram labels: UT 2004 Server; Artificial Bots; FPA Human Judges; Generation of Anonymized TPA Video Clips featuring human and bot players; TPA Judges (Crowdsourcing platform); Third-Person Crowdsourcing Judging.

Page 23: BotPrize 2014 Results. Human-Like Bots Competition at IEEE CIG

Calculating TPA (Third-Person Assessment)

Crowdsourcing Judging

$$TPA_{i,j} = \frac{\mathit{Miss}_{i,j}}{N_{i,j}}$$

J = 232 human judges
I = 12 characters (6+6)
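As a rough sketch of the crowdsourced tally, the R code below computes, for each character, the share of votes that said "Human". For a bot character this share is the Miss / N ratio of the slide. The votes and the character labels "A" and "B" are hypothetical, and pooling all judges per character is an assumption about how the per-judge ratios are aggregated.

```r
# Hypothetical crowd votes on anonymized clips (illustration only).
clips <- data.frame(
  character = c("A", "A", "A", "B", "B", "B", "B"),
  vote      = c("Human", "Bot", "Human", "Bot", "Bot", "Human", "Bot")
)

# Share of "Human" votes per character, pooled over all judges.
tpa <- tapply(clips$vote == "Human", clips$character, mean)
print(tpa)
```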

Page 24: BotPrize 2014 Results. Human-Like Bots Competition at IEEE CIG

Calculating TPA (Third-Person Assessment)

Crowdsourcing Judging

BotName     FPA         TPA        H++
Xenija      0.17139763  0.8235294  0.4974635
MirrorBot   0.20164771  0.7333333  0.4674905
Player      0.19328127  0.6315789  0.4124301
tmchojo     0.17757519  0.6470588  0.4123170
NizorBot    0.11821633  0.7058824  0.4120493
BotTracker  0.20070203  0.5909091  0.3958056
CCBot       0.06214746  0.7058824  0.3840149
Juan_CVC    0.12372294  0.6190476  0.3713853
OvGUBot     0.10545765  0.6086957  0.3570767
ADANN       0.08351664  0.4761905  0.2798536
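The table above can be checked and turned into a bot ranking with a few lines of R. This is a sketch: the FPA and TPA values are copied from the table, the column name `H` stands in for H++, and the list of human players comes from the players slide.

```r
# FPA and TPA per player, copied from the table above.
scores <- data.frame(
  BotName = c("Xenija", "MirrorBot", "Player", "tmchojo", "NizorBot",
              "BotTracker", "CCBot", "Juan_CVC", "OvGUBot", "ADANN"),
  FPA = c(0.17139763, 0.20164771, 0.19328127, 0.17757519, 0.11821633,
          0.20070203, 0.06214746, 0.12372294, 0.10545765, 0.08351664),
  TPA = c(0.8235294, 0.7333333, 0.6315789, 0.6470588, 0.7058824,
          0.5909091, 0.7058824, 0.6190476, 0.6086957, 0.4761905)
)

# Humanness++ with both weighting factors at 0.5.
scores$H <- 0.5 * scores$FPA + 0.5 * scores$TPA

# Keep only the bots and sort them from most to least human-like.
humans <- c("Player", "tmchojo", "Juan_CVC", "Xenija")
bots <- scores[!(scores$BotName %in% humans), ]
bots <- bots[order(bots$H, decreasing = TRUE), ]

print(bots[, c("BotName", "H")], row.names = FALSE)
```

The recomputed values agree with the H++ column. The sorted list also includes CCBot and ADANN, which appear in this table but not on the final results slide.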

Page 25: BotPrize 2014 Results. Human-Like Bots Competition at IEEE CIG

Calculating TPA (Third-Person Assessment)

Crowdsourcing Judging

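[Chart: third-person assessment of human (H) and bot (B) characters. Bar labels in order: H, H, H, H, B, B, B, B, B, B, H, H]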

Page 26: BotPrize 2014 Results. Human-Like Bots Competition at IEEE CIG

Final Results (FPA + TPA)

[Chart: final FPA + TPA scores of human (H) and bot (B) players. Bar labels in order: H, H, H, B, B, B, B, B, B, H]

Page 27: BotPrize 2014 Results. Human-Like Bots Competition at IEEE CIG

Final Results (H++)

Bot         H++    Members
MirrorBot   0.467  Mihai Polceanu
NizorBot    0.412  José L. Jiménez López, Antonio J. Fernández-Leiva, Antonio M. Mora
BotTracker  0.395  Hunjoo Lee, Jee-Hyong Lee
OvGUBot     0.357  Xenija Neufeld, Sanaz Mostaghim

Page 28: BotPrize 2014 Results. Human-Like Bots Competition at IEEE CIG

Congratulations on your results! Hope to see you again next year.

www.botprize.org human-machine.unizar.es

[email protected]

@ConsciousRobots