Page 1: Large-Scale Platform for MOBA Game AI - NVIDIA · • [1] Silver, David, et al. "Mastering the game of Go with deep neural networks and tree search." Nature 529.7587 (2016): 484-489.

Large-Scale Platform for MOBA Game AI

28th March 2018

Bin Wu & Qiang Fu

Page 2:

Outline

• Introduction

• Learning algorithms

• Computing platform

• Demonstration

Page 3:

Game AI Development

• Early exploration (1950s-1960s): checkers program beat a state champion

• Transition (1970s-1980s): Chess 4.5 beat human players

• Rapid development (1990s-2000s): Deep Blue (IBM) beat Garry Kasparov

• Explosive growth (2010s): AlphaGo (DeepMind) defeated Lee Sedol and Ke Jie

Page 4:

Applications of Game AI

• Research: ideal testbed for general AI research

◇ Massive data from human players

◇ Low experimental costs

◇ General ability for perception and decision

• Gaming: core applications in the gaming industry

◇ Pre-game procedures, e.g., game design

◇ Player experience, e.g., teammates, enemies

◇ Others, e.g., e-sports

From virtual world to real world

Page 5:

Game AI Research Topic

Game AI has become a hot research topic after the success of AlphaGo

• Many AI giants have joined game AI research

• Moving from Go → RTS, MOBA, etc.

◇ Released a StarCraft AI platform; preliminary results in simple scenarios

◇ Released a StarCraft II AI platform; not able to defeat the built-in AI

◇ Dota 2: 1v1 bot beat top human players; 5v5 to be launched in 2018

Page 6:

MOBA Game

• 5 vs. 5 game: obtain gold/exp → gain an equipment advantage → win fights → destroy the enemy’s base

Goal: Destroy Enemy’s Base

Enemy’s Turrets

Movement Control

Attack/Skills Control

Equipment Purchase

Neutral creeps: source of money/power/level/…

Page 7:

MOBA Game

• Micro Combat

◇ Movement

◇ Use of skills

Page 8:

MOBA Game

• Macro Strategies

◇ Back up

◇ Laning

◇ Ganking

◇ Stealing base

Page 9:

MOBA AI - Key Challenge

Computing Platform Learning Algorithm

Page 10:

Learning Algorithms

Page 11:

Learning Algorithms - Challenges

1. Complexity: 10^20000

2. Multi-agent: 5v5 coordination

3. Imperfect information: partially observable

4. Sparse and delayed rewards: 20,000+ frames per game

Page 12:

Learning Algorithms - Challenges

• Complexity >> Go

◇ End-to-end solutions (SL/RL) do not work well

Not able to finish basic movement/attack

Similar observations made by DeepMind

Go vs. MOBA

• State space

◇ Go: 3^361 ≈ 10^172 (361 positions, 3 states each)

◇ MOBA: 10^20000 (10 heroes, 2000+ positions * 10+ states)

• Action space

◇ Go: 250^150 ≈ 10^360 (250 positions available, 150 decisions per game on average)

◇ MOBA: 20^20000 ≈ 10^26000 (20 actions, 20,000 frames per game)

MOBA actions: left, right, …, skill 1, 2, 3 + position/target, recover, return, etc.
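The orders of magnitude in this comparison can be sanity-checked with base-10 logarithms. The sketch below only redoes the arithmetic on the counts quoted on the slide (361 positions with 3 states, 250 moves over 150 decisions, 20 actions over 20,000 frames); the helper name `log10_power` is made up for illustration, and the slide's own figures are rough roundings, so small differences in the exponents are expected.

```python
import math

def log10_power(base: float, exponent: float) -> float:
    """Return log10(base ** exponent) without building the huge number."""
    return exponent * math.log10(base)

# Go: 361 board positions, 3 states each; ~250 moves over ~150 decisions.
go_state = log10_power(3, 361)
go_action = log10_power(250, 150)

# MOBA: ~20 actions per frame over ~20,000 frames per game.
moba_action = log10_power(20, 20_000)

print(f"Go state space    ~ 10^{go_state:.0f}")
print(f"Go action space   ~ 10^{go_action:.0f}")
print(f"MOBA action space ~ 10^{moba_action:.0f}")
```

Whichever rounding is used, the MOBA action space comes out thousands of orders of magnitude larger than Go's.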

Page 13:

Learning Algorithms - Challenges

• Multi-agent

◇ Macro strategy level: four defending while one steals the base

◇ Micro combat level: tanks protecting assassins

Page 14:

Learning Algorithms - Challenges

• Sparse and delayed rewards

◇ Go: <360 steps

◇ MOBA: >20,000 steps
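One way to see why the longer horizon hurts: with a discount factor gamma, a reward that arrives T steps later is weighted by gamma^T at the current step. The sketch below compares the two horizons from the slide; the discount value 0.999 is an assumption chosen for illustration, not a figure from the talk.

```python
def discounted_weight(gamma: float, steps: int) -> float:
    """Weight that a reward `steps` steps in the future has at the current step."""
    return gamma ** steps

GAMMA = 0.999  # assumed discount factor, for illustration only

go_weight = discounted_weight(GAMMA, 360)       # Go: fewer than 360 steps
moba_weight = discounted_weight(GAMMA, 20_000)  # MOBA: more than 20,000 steps

print(f"Go   (360 steps):    {go_weight:.3f}")
print(f"MOBA (20,000 steps): {moba_weight:.2e}")
```

Under this assumed discount, a win signal at the end of a Go game still reaches the opening moves with most of its weight, while in a MOBA game it is practically invisible at the first frames.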

Page 15:

Learning Algorithms - Challenges

• Imperfect Information

◇ Maps are partially observable

Guess enemy’s positions/strategy

Actively explore to gain vision

Page 16:

Model Architecture

• Divide and Conquer

◇ Split for simplification: Transfer, Strategy, Combat

◇ Solution space: ~10^20000 → ~10^2000

Page 17:

Model - Transfer

• Where to send heroes?

◇ Compared to the game of Go

Treat heroes as stones

Treat the map as the board

◇ Predict good positions

[Figure: hotspot prediction and transfer]
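Read as a Go-like prediction problem, the transfer model scores every cell of a discretized map and sends the hero toward the most likely hotspot. Below is a minimal sketch of that last step, assuming the network outputs one logit per grid cell; the grid size, the logits and the function names are hypothetical and not taken from the talk.

```python
import math
from typing import List, Tuple

def softmax_grid(logits: List[List[float]]) -> List[List[float]]:
    """Turn per-cell logits into a probability map over the whole grid."""
    peak = max(max(row) for row in logits)  # subtract the max for stability
    exps = [[math.exp(v - peak) for v in row] for row in logits]
    total = sum(sum(row) for row in exps)
    return [[v / total for v in row] for row in exps]

def predict_hotspot(logits: List[List[float]]) -> Tuple[int, int]:
    """Return the (row, col) of the most probable hotspot cell."""
    probs = softmax_grid(logits)
    cells = [(p, r, c) for r, row in enumerate(probs) for c, p in enumerate(row)]
    _, r, c = max(cells)
    return r, c

# Hypothetical 3x3 map: the centre cell scores highest.
example_logits = [
    [0.1, 0.3, 0.2],
    [0.0, 2.5, 0.4],
    [0.2, 0.1, 0.3],
]
hotspot = predict_hotspot(example_logits)
print(hotspot)
```

Treating the output as a distribution over map cells is what makes the analogy with move prediction in Go work: the map plays the role of the board and the chosen cell the role of the next stone.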

Page 18:

Model - Strategy

• Key resources in MOBA

◇ Modeling macro objectives

Describe the series of hotspot transitions that precedes the destruction of a key resource

Page 19:

Model - Strategy

Macro session segmentation example

[Figure: one game split into sessions at key resources. Labels: Start, Dragon, Mid 1st turret, Slain Dark dragon, Dragon, Bottom 1st turret, Mid 2nd turret, Mid 3rd turret, Base. Example hotspot series within a session: stealing blue creep, killing bottom lane creeps, attacking bottom 1st turret]

Describe the series of hotspot transitions that precedes the destruction of a key resource
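The segmentation idea can be sketched as splitting a time-ordered event list wherever a key resource falls, so that each session holds the hotspot series leading up to it. This is only an illustration: the event format, the timestamps and the `split_sessions` helper are assumptions, and only the hotspot and resource names come from the slide.

```python
from typing import List, Tuple

Event = Tuple[float, str, str]  # (time in seconds, kind, name)

def split_sessions(events: List[Event]) -> List[List[Event]]:
    """Split a time-ordered event list into sessions.

    A session collects hotspot events and is closed by the next
    'key_resource' event (e.g. a turret being destroyed).
    """
    sessions: List[List[Event]] = []
    current: List[Event] = []
    for event in events:
        current.append(event)
        if event[1] == "key_resource":
            sessions.append(current)
            current = []
    if current:  # trailing hotspots with no key resource taken yet
        sessions.append(current)
    return sessions

# Hypothetical timeline; the timestamps are invented.
timeline: List[Event] = [
    (35.0, "hotspot", "stealing blue creep"),
    (70.0, "hotspot", "killing bottom lane creeps"),
    (95.0, "hotspot", "attacking bottom 1st turret"),
    (120.0, "key_resource", "bottom 1st turret"),
    (150.0, "hotspot", "dragon pit"),
    (180.0, "key_resource", "dragon"),
]
sessions = split_sessions(timeline)
for session in sessions:
    print([name for _, _, name in session])
```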

Page 20:

Model - Transfer Network with Macro Strategy

Key resources

Hotspots

Page 21:

Model - Combat

• Multi-task on buttons

◇ Action space: directions, skill release position
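"Multi-task on buttons" can be read as one output head per control: a head that picks the button, plus heads for the direction or the skill release position that are only consulted when the chosen button needs them. The sketch below decodes such outputs; the button names, head names and scores are all hypothetical.

```python
from typing import Dict, List

BUTTONS = ["move", "attack", "skill"]  # assumed button set

def argmax(scores: List[float]) -> int:
    """Index of the largest score."""
    return max(range(len(scores)), key=lambda i: scores[i])

def decode_action(heads: Dict[str, List[float]]) -> Dict[str, int]:
    """Pick a button first, then read only the head that button needs."""
    button_idx = argmax(heads["button"])
    action = {"button": button_idx}
    if BUTTONS[button_idx] == "move":
        action["direction"] = argmax(heads["direction"])
    elif BUTTONS[button_idx] == "skill":
        action["skill_position"] = argmax(heads["skill_position"])
    return action

# Hypothetical head outputs: 3 buttons, 8 directions, 4 release positions.
heads = {
    "button": [0.2, 0.1, 0.9],
    "direction": [0.0, 0.1, 0.7, 0.1, 0.0, 0.0, 0.1, 0.0],
    "skill_position": [0.1, 0.6, 0.2, 0.1],
}
action = decode_action(heads)
print(action)
```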

Page 22:

Learning Framework

• Imitation + Reinforcement Learning

Page 23:

Computing Platform

Page 24:

MOBA Game AI Platform

• Computing Platform

◇ Computational power: large-scale CPU/GPU virtualization

◇ Learning platform: efficient and easy-to-use platform

[Figure: platform architecture by layer]

• Computational units: millions of CPUs (online service, idle resource pool); thousands of GPUs (online service, offline service)

• Deployment: Docker + mixed online/offline technique; Docker + GPU virtualization for shared resources

• Resource allocation: elastic computation; Kubernetes resource allocation

• Task management: Tencent cloud function; machine learning; reinforcement learning

• Service: feature extraction; game environment deployment; model training

Page 25:

Computational Power

• Computational Costs

◇ MOBA AI: thousands of GPUs, millions of CPUs

• CPU/GPU

◇ Demands: the more the better

◇ Challenge: improve resource utilization efficiency without additional costs

◇ Solution: CPU/GPU virtualization for shared resources

Page 26:

CPU Virtualization

• Elastic and dynamic resource pool

◇ Millions of CPU cores

• 70%: idle resource pool

◇ New resources not yet delivered

◇ Old resources not yet cleared

◇ Returned resources

• 30%: idle slots in online services

◇ Online service resource usage: millions of CPU cores at 20% average CPU usage

◇ Usage raised from 20% to 65% using Docker isolation
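As a rough worked example of what the 20% to 65% figure buys: raising average utilization on a pool of online-service cores frees the difference for offline AI jobs. The pool size of one million cores below is an assumption (the slide only says "millions"), and `harvested_cores` is a made-up helper.

```python
def harvested_cores(total_cores: int, base_util: float, target_util: float) -> float:
    """Core-equivalents freed for offline jobs when utilization is raised."""
    if not 0.0 <= base_util <= target_util <= 1.0:
        raise ValueError("expected 0 <= base_util <= target_util <= 1")
    return total_cores * (target_util - base_util)

TOTAL_CORES = 1_000_000  # assumed pool size; the slide only says "millions"

extra = harvested_cores(TOTAL_CORES, base_util=0.20, target_util=0.65)
print(f"{extra:,.0f} core-equivalents available for AI workloads")
```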

Page 27:

GPU Virtualization

• Goal: improve GPU usage efficiency

• Resource usage

◇ Number of GPUs: thousands

◇ Share of low-load machines: 65%

◇ Average GPU usage: 28%

• Optimization idea

Page 28:

GPU Virtualization [12]

Parallel share

Time-slice share

Page 29:

Learning Platform

Core technique: version update frequency

• Feature extraction: hours

• Model training: one day

• RL training: one day

Page 30:

Learning Platform - Feature Extraction Platform

• Demand 1

◇ Feature extraction from up to hundreds of thousands of replays

Challenge: demands up to 210 thousand CPU cores per day

Solution: CPU virtualization; Docker elastic & dynamic resource pool

• Demand 2

◇ Multiple tasks, each with millions of entries

Challenge: parallel task scheduling

Solution: Tencent Serverless Cloud Function

[Figure: pipeline. Game replays → gamecore, pre → raw data → feature extraction → features → shuffle → training samples → training → models → evaluation]
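The extract-then-shuffle part of this pipeline can be sketched with a worker pool. This is a toy illustration only: the replay format, the single "gold" feature and the function names are assumptions, and a local thread pool stands in for the CPU resource pool and cloud functions described on the slide.

```python
import random
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List

def extract_features(replay: Dict) -> List[Dict]:
    """Turn one replay into per-frame feature rows (toy features only)."""
    return [
        {"replay_id": replay["id"], "frame": i, "gold": gold}
        for i, gold in enumerate(replay["gold_per_frame"])
    ]

def build_training_samples(replays: List[Dict], workers: int = 4, seed: int = 0) -> List[Dict]:
    """Extract features from replays in parallel, then shuffle the rows."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        per_replay = list(pool.map(extract_features, replays))
    samples = [row for rows in per_replay for row in rows]
    random.Random(seed).shuffle(samples)
    return samples

# Hypothetical replays; real replays would first be run through gamecore.
replays = [{"id": r, "gold_per_frame": [100 * r + f for f in range(5)]} for r in range(3)]
samples = build_training_samples(replays)
print(len(samples), samples[0])
```

Shuffling after extraction breaks up the frame order within a game, so that consecutive training samples are not all drawn from the same replay.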

Page 31:

Learning Platform - Serverless Cloud Function

[Figure: cloud function architecture by layer]

• Application layer: SDK, COS, CMQ, …

• Access layer: API, SDK

• Control layer: function call, function config, function coordination

• Execution layer: function, function, function, …

Advantages of Cloud Function

◇ Function as a Service

◇ Millions of CPU cores available

◇ Free of charge in idle slots

30% of costs on average

Page 32:

Learning Platform - Model Training Platform

1. Requirement

◇ Billions of samples per task

◇ Fast model training

2. Solution

◇ Multi-GPU, multi-machine

◇ Machine learning platform

3. Challenges

◇ IO: efficient data input, efficient computation

◇ Communication: efficient parameter exchange

[Figure: Big Data → Training Platform → Result]

Page 33:

Model Training Platform - IO

• Data IO

◇ Multiprocessing

◇ “Lock free” queue

• Efficient computation

◇ Data pre-caching

◇ OP speed-up by multi-threading
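The pre-caching idea can be sketched as a bounded buffer that a background loader keeps filled while the trainer consumes batches. Note the assumptions: this uses a thread and Python's standard `queue.Queue`, which is lock-based, as a simple stand-in for the multiprocessing and "lock free" queue named on the slide; `prefetch` and `load_batches` are hypothetical names.

```python
import queue
import threading
from typing import Iterable, Iterator, List

_DONE = object()  # sentinel marking the end of the stream

def prefetch(batches: Iterable[List[int]], depth: int = 4) -> Iterator[List[int]]:
    """Yield batches while a background thread loads the next ones."""
    buffer: "queue.Queue" = queue.Queue(maxsize=depth)

    def producer() -> None:
        for batch in batches:
            buffer.put(batch)  # blocks when the cache is full
        buffer.put(_DONE)

    threading.Thread(target=producer, daemon=True).start()
    while True:
        item = buffer.get()
        if item is _DONE:
            return
        yield item

def load_batches(n: int, size: int) -> Iterator[List[int]]:
    """Stand-in for slow disk or network reads."""
    for i in range(n):
        yield list(range(i * size, (i + 1) * size))

consumed = [sum(batch) for batch in prefetch(load_batches(5, 3))]
print(consumed)
```

The `depth` parameter bounds memory use: the loader runs ahead of the consumer by at most that many batches.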

Page 34:

Model Training Platform - Communication

• Parameter exchange

◇ NCCL2 [11]: efficient communication between GPUs

◇ RDMA: efficient communication across nodes
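In data-parallel training, the parameter exchange typically amounts to an all-reduce: every worker contributes its gradients and every worker receives the same combined result. The sketch below shows only those semantics in plain Python, with averaging as the assumed reduction; it is not the NCCL API and says nothing about how NCCL or RDMA move the data.

```python
from typing import List

def all_reduce_mean(worker_grads: List[List[float]]) -> List[List[float]]:
    """Give every worker the element-wise mean of all workers' gradients."""
    n_workers = len(worker_grads)
    length = len(worker_grads[0])
    if any(len(g) != length for g in worker_grads):
        raise ValueError("all workers must hold gradients of the same shape")
    mean = [sum(g[i] for g in worker_grads) / n_workers for i in range(length)]
    return [list(mean) for _ in range(n_workers)]

# Hypothetical gradients from 4 GPUs for a 3-parameter model.
grads = [
    [1.0, 2.0, 3.0],
    [3.0, 2.0, 1.0],
    [0.0, 4.0, 2.0],
    [4.0, 0.0, 2.0],
]
synced = all_reduce_mean(grads)
print(synced[0])
```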

Page 35:

Model Training Platform - Performance

[Figure: Acceleration. Optimization results (acceleration ratio) for IO, Computation, Communication]

[Figure: Multi-GPU multi-machine speed-up on 1, 8, 16, 32 and 64 GPUs; series: Before, After, Upper bound]

Page 36:

Learning Platform - Reinforcement Learning Platform

• Demands

◇ Hierarchical RL: various scenarios

◇ Large-scale parallel self-play: millions of games

◇ Automatic task management: unified framework, model analysis, evaluation

Page 37:

RL Platform - Hierarchical RL

• Hierarchical RL

◇ Scenario specific: Jungle, Laning, Combat

• Solution

◇ General Hierarchical RL

• Features

◇ Macro task selection

◇ Micro task selection

◇ Effectively handles long-term planning and delayed rewards

◇ Value network for guiding sub-task policy learning
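The two-level selection can be sketched as a macro step that scores the sub-tasks and a micro step that lets the chosen sub-task's policy pick the concrete action. Everything below is a toy stand-in: the state fields, the hand-written scoring function in place of a value network, and the rule-based micro policies are assumptions; only the sub-task names come from the slide.

```python
from typing import Callable, Dict, Tuple

State = Dict[str, float]

# Hypothetical micro policies, one per sub-task named on the slide.
def jungle_policy(state: State) -> str:
    return "attack_creep" if state["creep_nearby"] > 0.5 else "move_to_camp"

def laning_policy(state: State) -> str:
    return "last_hit" if state["minion_low_hp"] > 0.5 else "hold_position"

def combat_policy(state: State) -> str:
    return "use_skill" if state["enemy_in_range"] > 0.5 else "approach_enemy"

MICRO_POLICIES: Dict[str, Callable[[State], str]] = {
    "jungle": jungle_policy,
    "laning": laning_policy,
    "combat": combat_policy,
}

def macro_values(state: State) -> Dict[str, float]:
    """Stand-in for a value network: score each sub-task in this state."""
    return {
        "jungle": state["creep_nearby"],
        "laning": state["minion_low_hp"],
        "combat": 2.0 * state["enemy_in_range"],
    }

def act(state: State) -> Tuple[str, str]:
    """Macro step picks the sub-task; micro step picks the concrete action."""
    values = macro_values(state)
    task = max(values, key=values.get)
    return task, MICRO_POLICIES[task](state)

state = {"creep_nearby": 0.2, "minion_low_hp": 0.6, "enemy_in_range": 0.9}
decision = act(state)
print(decision)
```

Splitting the decision this way shortens the horizon each policy has to reason over: the macro level plans across sub-tasks, while each micro policy only handles its own short scenario.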

Page 38:

RL Platform - Parallel Training

• Large-scale parallel self-play

• Solution

◇ Docker image for gamecore version management

◇ Parallel training framework

Page 39:

RL Platform - Automatic Task Management

• Unified framework for model analysis and evaluation

◇ Task submission

◇ Task start/stop

◇ Results visualization: reward curve, radar chart, prediction distribution, self-play results

Page 40:

RL Platform - Performance

• Ten million scenarios per day

◇ 20s per scenario with 16 GPUs

• Millions of full games

◇ 10min+ per game with 128 GPUs
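A back-of-the-envelope reading of the scenario figure: if each scenario takes about 20 seconds of wall-clock time and the load is spread evenly over the day, ten million scenarios per day require a few thousand scenarios to be running at any moment. Both conditions are assumptions about how to interpret the slide, and `required_concurrency` is a made-up helper.

```python
SECONDS_PER_DAY = 24 * 60 * 60

def required_concurrency(items_per_day: float, seconds_per_item: float) -> float:
    """Average number of items that must run at once to hit the daily target."""
    return items_per_day * seconds_per_item / SECONDS_PER_DAY

scenario_slots = required_concurrency(10_000_000, 20)
print(f"~{scenario_slots:,.0f} scenarios running concurrently")
```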

Page 41:

Demonstration

Page 42:

Visualization

Page 43:

Demo - Quadra-kill Under Turret

• Micro combat

◇ Fight against mid-to-high-level testers

◇ Killing while avoiding harm from the turret

Page 44:

Demo - Pentakill

• Micro combat

◇ Fight against mid-to-high-level testers

Page 45:

Demo – Transfer & Strategy

• Opening

Page 46:

Demo – Transfer & Strategy

• First Dragon appears at 2:00

Page 47:

Demo – Transfer & Strategy

• Besiege and Destroy the Base

Page 48:

Demo – RL

After reinforcement Before reinforcement

Page 49:

Summary

Page 50:

Tencent Game AI Research

• Pursue general AI via game AI research

• MOBA AI

◇ Algorithm

· Imitation + Reinforcement Learning

◇ Computing platform

· Feature extraction platform: millions of CPUs

· Model training platform: thousands of GPUs

· Reinforcement learning platform: hierarchical RL

Page 51:

Tencent Game AI Research

• Future work

◇ Algorithm

· Tactic-level search and planning

· Multi-agent RL

◇ Computational power

· Search/planning platform: MCTS

· Reinforcement learning platform: multi-agent RL

Page 52:

About Tencent AI Lab

Our journey

2016.4 Tencent establishes its corporate-level AI Lab

2017.3 “Jueyi” (Fine Art) wins the UEC World Cup

2017.3 Tencent announces leading AI researcher Dr. Tong Zhang as the Director of Tencent AI Lab

2017.5 Tencent establishes its Seattle AI Lab and announces leading speech recognition expert Dr. Dong Yu as Deputy Director

2017.11 Tencent is selected by China’s Ministry of Science and Technology to build the national open innovation platform for AI medical imaging

Today: our team consists of 70 world-class AI scientists and 300 research engineers

Page 53:

About Tencent AI Lab

• Game AI: environment for AGI

◇ Diverse game ecosystem

• Content AI: perceiving the world and generating content

◇ China’s leading news, video, music and literature platforms

• Social AI: new ways to communicate

◇ Massive user base: WeChat ~1 billion MAU, QQ 850 million MAU

• Medical AI: impact and advance industry

◇ Building a national open innovation platform for AI medical imaging

Page 54:

Thank you

Page 55:

References

• [1] Silver, David, et al. "Mastering the game of Go with deep neural networks and tree search." Nature 529.7587 (2016): 484-489.

• [2] Venture Scanner. "Artificial Intelligence Startup Landscape Trends and Insights - Q4 2016." November 20, 2016. https://www.venturescanner.com/blog/2016/artificial-intelligence-startup-landscape-trends-and-insights-q4-2016

• [3] Tian, Yuandong, et al. "ELF: An Extensive, Lightweight and Flexible Research Platform for Real-time Strategy Games." arXiv preprint arXiv:1707.01067 (2017).

• [4] Vinyals, Oriol, et al. "StarCraft II: A New Challenge for Reinforcement Learning." August 9, 2017. https://deepmind.com/research/publications/starcraft-ii-new-challenge-reinforcement-learning/

• [5] OpenAI. "We've created an AI which beats the world's top professionals at 1v1 matches of Dota 2." https://blog.openai.com/dota-2/

• [6] Ontañón, Santiago, Gabriel Synnaeve, Alberto Uriarte, Florian Richoux, David Churchill, and Mike Preuss. "RTS AI: Problems and Techniques." (2015): 1-12.

• [7] Miles, Chris, and Sushil J. Louis. "Co-evolving real-time strategy game playing influence map trees with genetic algorithms." Proceedings of the International Congress on Evolutionary Computation, Portland, Oregon. IEEE Press, 2006.

• [8] Jang, Su-Hyung, and Sung-Bae Cho. "Evolving neural NPCs with layered influence map in the real-time simulation game 'Conqueror'." Computational Intelligence and Games, 2008. CIG'08. IEEE Symposium on. IEEE, 2008.

• [9] Weber, Ben George, Michael Mateas, and Arnav Jhala. "Building Human-Level AI for Real-Time Strategy Games." AAAI Fall Symposium: Advances in Cognitive Systems. Vol. 11. 2011.

• [10] Shi, Xingjian, et al. "Convolutional LSTM network: A machine learning approach for precipitation nowcasting." Advances in neural information processing systems. 2015.

• [11] Luehr, Nathan. "NCCL: Accelerated Collective Communications for GPUs." GPU Technology Conference 2016, April 5, 2016.

• [12] NVIDIA. "CUDA Multi-Process Service." https://docs.nvidia.com/deploy/pdf/CUDA_Multi_Process_Service_Overview.pdf.