NINTENDO LEARNING ENVIRONMENT - The University of Tokyo
NINTENDO LEARNING ENVIRONMENT difficulty transferability KAMIL M ROCKI
PERFORMANCE-FIRST EMULATORS: WE NEED MORE DATA!
GPU
CPU
https://towardsdatascience.com/a-gameboy-supercomputer-33a6955a79a4
up to 1B FPS
ALGORITHMS = DATA + COMPUTE
PYTHON FRONTEND
FPGA
A3C BASELINE SETUP
Input frame: 160x144
5-layer convolutional neural network (spatial compression): 5x5x32
Long short-term memory, LSTM (temporal compression): 1x256
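The 160x144 and 5x5x32 figures are consistent with five stride-2 convolutions of 32 channels each. The sketch below only checks that arithmetic. The kernel size of 3 and padding of 1 are assumptions chosen for illustration; the slide gives only the layer count and the input and output sizes.

```python
def conv_out(size, kernel=3, stride=2, padding=1):
    """Output length of one convolution along a single spatial axis.

    kernel, stride and padding are assumed values, not taken from the slide.
    """
    return (size + 2 * padding - kernel) // stride + 1


def encoder_shape(width=160, height=144, layers=5, channels=32):
    """Apply `layers` identical stride-2 convolutions; return (w, h, channels)."""
    for _ in range(layers):
        width, height = conv_out(width), conv_out(height)
    return width, height, channels


features = encoder_shape()                            # spatial size after 5 layers
flat_size = features[0] * features[1] * features[2]   # values handed to the LSTM
print(features, flat_size)
```

Under these assumptions the width shrinks 160, 80, 40, 20, 10, 5 and the height 144, 72, 36, 18, 9, 5, so each frame is reduced to 5x5x32 = 800 values before the 256-unit recurrent layer.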
Reinforcement learning algorithm
A3C: Asynchronous Advantage Actor Critic
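As a rough illustration of the "advantage actor-critic" part of A3C, the sketch below computes n-step returns, advantages, and the two loss terms for a single rollout in plain Python. It is a minimal sketch, not the setup used in the talk: the function names, the discount factor, and the 0.5 weight on the value loss are assumptions, and the asynchronous part (several workers updating shared parameters) is left out.

```python
def n_step_returns(rewards, bootstrap_value, gamma=0.99):
    """Discounted returns for one rollout, bootstrapped from the critic's
    value estimate of the state that follows the last reward."""
    returns = []
    running = bootstrap_value
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def a3c_losses(log_probs, values, rewards, bootstrap_value, gamma=0.99):
    """Return (policy_loss, value_loss, advantages) for one rollout.

    log_probs: log pi(a_t | s_t) of the actions actually taken (actor output)
    values:    V(s_t) predicted by the critic
    """
    returns = n_step_returns(rewards, bootstrap_value, gamma)
    # Advantage: how much better the rollout went than the critic expected.
    advantages = [ret - val for ret, val in zip(returns, values)]
    # Actor term: raise the log-probability of actions with positive advantage.
    # In an autodiff framework the advantage is treated as a constant here.
    policy_loss = -sum(lp * adv for lp, adv in zip(log_probs, advantages))
    # Critic term: squared error between predicted value and n-step return.
    value_loss = 0.5 * sum(adv ** 2 for adv in advantages)
    return policy_loss, value_loss, advantages


policy_loss, value_loss, advantages = a3c_losses(
    log_probs=[-1.0, -1.0, -1.0],
    values=[1.0, 0.5, 0.5],
    rewards=[1.0, 0.0, 1.0],
    bootstrap_value=0.0,
    gamma=0.5,
)
print(policy_loss, value_loss, advantages)
```

In a full agent, each worker would run its own emulator instance, compute these losses on its rollout, and apply the resulting gradients to the shared network.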
INTERNAL WORLD MODEL?
PRO: DECOUPLE THE MODEL (HIGH FRAMERATE) FROM THE CONTROLLER
COMPRESS
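One way to read this slide is that the model advances every frame while the controller only acts every few frames, and only on a compressed view of the state. The toy loop below sketches that reading. Everything in it is hypothetical: `compress` is plain average pooling standing in for a learned encoder, and `model_step`, `controller`, and `steps_per_action` are illustrative names, not part of any emulator or library API.

```python
def compress(frame, factor=8):
    """Average-pool a 2D frame (list of rows) by `factor` along both axes."""
    height, width = len(frame), len(frame[0])
    pooled = []
    for y in range(0, height, factor):
        row = []
        for x in range(0, width, factor):
            block = [frame[yy][xx]
                     for yy in range(y, min(y + factor, height))
                     for xx in range(x, min(x + factor, width))]
            row.append(sum(block) / len(block))
        pooled.append(row)
    return pooled


def run(model_step, controller, frame, model_steps, steps_per_action=4):
    """Step the model every tick; consult the controller only every
    `steps_per_action` ticks, on the compressed frame."""
    action = 0
    decisions = 0
    for tick in range(model_steps):
        if tick % steps_per_action == 0:
            action = controller(compress(frame))
            decisions += 1
        frame = model_step(frame, action)   # model runs at the full frame rate
    return frame, decisions


blank = [[0] * 160 for _ in range(144)]     # one 160x144 frame
code = compress(blank)                      # 18 rows of 20 pooled values
_, decisions = run(lambda f, a: f, lambda z: 1, blank, model_steps=16)
print(len(code), len(code[0]), decisions)
```

Here the stand-in model takes 16 steps while the controller is consulted only 4 times, each time on a 20x18 summary rather than the full frame.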