NINTENDO LEARNING ENVIRONMENT - The University of Tokyo

Transcript of NINTENDO LEARNING ENVIRONMENT - The University of Tokyo

NINTENDO LEARNING ENVIRONMENT

difficulty

transferability

KAMIL M ROCKI

GAMEBOY: ~1500 GAMES

NES: ~1000 GAMES

ALGORITHMS = DATA + COMPUTE

PERFORMANCE 1ST EMULATORS: WE NEED MORE DATA!

GPU

CPU

https://towardsdatascience.com/a-gameboy-supercomputer-33a6955a79a4

up to 1B FPS

PYTHON FRONTEND

FPGA

A3C BASELINE

HARD

WORKS

A3C BASELINE SETUP

Long short-term memory, LSTM (temporal compression)

5-layer convolutional neural network (spatial compression)

160x144

5x5x32

1x256
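The dimensions above can be checked with a little arithmetic. The sketch below is only an illustration: the slide gives the input size (160x144), the number of convolutional layers (5) and the resulting 5x5x32 feature map, but not the kernel, stride or padding. A 3x3 kernel with stride 2 and padding 1 in every layer is an assumption that happens to reproduce those numbers, and the helper names are hypothetical.

```python
def conv_out(size, kernel=3, stride=2, padding=1):
    """Output length of one convolution along a single axis."""
    return (size + 2 * padding - kernel) // stride + 1


def encoder_shapes(width=160, height=144, layers=5, channels=32):
    """Return the (width, height, channels) shape after each conv layer.

    Kernel, stride and padding are assumed values, not taken from the slide.
    """
    shapes = []
    w, h = width, height
    for _ in range(layers):
        w, h = conv_out(w), conv_out(h)
        shapes.append((w, h, channels))
    return shapes


shapes = encoder_shapes()
for i, shape in enumerate(shapes, start=1):
    print(f"conv{i}: {shape}")

# Flattened size of the last feature map, which is what the LSTM
# (state size 1x256 on the slide) would take as input.
last_w, last_h, last_c = shapes[-1]
features = last_w * last_h * last_c
print("flattened features:", features)
```

Under these assumptions each layer halves the spatial resolution, so the 160x144 frame shrinks to 5x5 after five layers, and the LSTM sees 800 values per frame.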

Reinforcement learning algorithm

A3C: Asynchronous Advantage Actor Critic
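As a reminder of what the "advantage" and "actor critic" parts of the name refer to, here is a minimal sketch of the quantities a single A3C worker computes from one short rollout. It is not the configuration used in the talk: the discount factor, the function names and the omission of the entropy bonus are simplifying assumptions, and the asynchronous part (many workers updating shared parameters) is left out entirely.

```python
def n_step_returns(rewards, bootstrap_value, gamma=0.99):
    """Discounted returns for a rollout, bootstrapped from the critic.

    bootstrap_value is the critic's estimate V(s) for the state after the
    last reward (0.0 if the episode ended there).
    """
    returns = []
    running = bootstrap_value
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def actor_critic_losses(log_probs, returns, values):
    """Policy (actor) and value (critic) losses for one rollout.

    log_probs[t] is log pi(a_t | s_t) and values[t] is V(s_t).
    The entropy regulariser commonly added in A3C is omitted here.
    """
    advantages = [g - v for g, v in zip(returns, values)]
    policy_loss = -sum(lp * adv for lp, adv in zip(log_probs, advantages))
    value_loss = 0.5 * sum(adv * adv for adv in advantages)
    return advantages, policy_loss, value_loss


# Toy rollout of three steps with made-up numbers.
returns = n_step_returns([1.0, 0.0, 1.0], bootstrap_value=0.5, gamma=0.5)
advantages, policy_loss, value_loss = actor_critic_losses(
    log_probs=[-1.0, -2.0, -0.5],
    returns=returns,
    values=[1.0, 0.5, 1.0],
)
print(returns, advantages, policy_loss, value_loss)
```

In a full implementation the advantages would be treated as constants when differentiating the policy loss, so that the actor and the critic are trained by their own terms.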

INTERNAL WORLD MODEL?

PRO: DECOUPLE THE MODEL (HIGH FRAMERATE) FROM THE CONTROLLER

COMPRESS

RELATE
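One way to picture the decoupling is a loop in which the model consumes every frame while the controller is only consulted every few frames. The sketch below is a toy illustration under that reading, not the design from the talk: ToyWorldModel, controller, run and the control_every parameter are all hypothetical, and the "model" is just a running average standing in for a learned compressor.

```python
class ToyWorldModel:
    """Stand-in for a learned model: folds every frame into a small state."""

    def __init__(self, decay=0.5):
        self.decay = decay
        self.state = 0.0
        self.updates = 0

    def update(self, frame):
        # "Compress": reduce the frame to one number, then blend it into
        # the running state so that past frames are remembered.
        summary = sum(frame) / len(frame)
        self.state = self.decay * self.state + (1.0 - self.decay) * summary
        self.updates += 1
        return self.state


def controller(state):
    """Hypothetical policy: press a button when the state is bright."""
    return "A" if state > 0.5 else "NOOP"


def run(frames, control_every=4):
    """Update the model on every frame, query the controller less often."""
    model = ToyWorldModel()
    actions = []
    for t, frame in enumerate(frames, start=1):
        state = model.update(frame)        # high framerate path
        if t % control_every == 0:         # low framerate path
            actions.append(controller(state))
    return model.updates, actions


bright_frames = [[1, 1, 1, 1]] * 8
print(run(bright_frames))
```

The point of the split is that the expensive decision step sees a compact state at a low rate, while the model keeps up with the emulator's frame rate.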
