Bubble-Bobble with Proximal Policy Optimization

Introduction

This repository contains code to train an agent to play Bubble-Bobble using Proximal Policy Optimization (PPO), with several implementation tricks and modifications applied to the algorithm. After training on 100M frames, the agent can easily clear stages up to stage 13. The repository also supports evaluating a trained model with a visual display.

Papers related to this implementation are:

PPO: Human-level control through deep reinforcement learning
GAE: High-Dimensional Continuous Control Using Generalized Advantage Estimation
IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures (model architecture)
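For readers unfamiliar with the GAE paper listed here, the following is a minimal sketch of how generalized advantage estimates are commonly computed for a PPO rollout. It is not taken from this repository: the function name `compute_gae`, its arguments, and the default values of `gamma` and `lam` are assumptions chosen for illustration.

```python
from typing import List, Sequence, Tuple


def compute_gae(
    rewards: Sequence[float],
    values: Sequence[float],
    dones: Sequence[bool],
    last_value: float,
    gamma: float = 0.99,  # discount factor (assumed default)
    lam: float = 0.95,    # GAE lambda (assumed default)
) -> Tuple[List[float], List[float]]:
    """Return (advantages, returns) for one rollout of length T.

    rewards[t], values[t] and dones[t] describe step t; last_value is the
    critic's estimate for the state that follows the final step.
    """
    advantages = [0.0] * len(rewards)
    gae = 0.0
    next_value = last_value
    # Walk the rollout backwards so each step can reuse the running sum.
    for t in reversed(range(len(rewards))):
        # When the episode ended at step t, nothing is bootstrapped past it.
        not_done = 0.0 if dones[t] else 1.0
        # One-step TD error.
        delta = rewards[t] + gamma * next_value * not_done - values[t]
        # Exponentially weighted sum of TD errors.
        gae = delta + gamma * lam * not_done * gae
        advantages[t] = gae
        next_value = values[t]
    # Value targets for the critic are advantage plus baseline.
    returns = [a + v for a, v in zip(advantages, values)]
    return advantages, returns
```

With `lam=0` this reduces to the one-step TD error, and with `lam=1` it becomes the discounted return minus the value baseline; values in between trade bias against variance.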