pytorchrl

Algorithms

  • Off-policy
    • Double Deep Q-Learning (DDQN)
    • Deep Deterministic Policy Gradient (DDPG)
    • Twin Delayed Deep Deterministic Policy Gradient (TD3)
    • Soft Actor Critic (SAC)
    • Maximum a Posteriori Policy Optimization (MPO)
  • On-policy
    • Advantage Actor Critic (A2C)
    • Proximal Policy Optimization (PPO)
    • Proximal Policy Optimization (PPO) with Random Network Distillation (RND)
  • Model-Based
    • Model Predictive Control (MPC) Random Shooting (RS)
    • Model Predictive Control (MPC) Cross-Entropy Method (CEM)
    • Model Predictive Control (MPC) Planning with Deep Dynamics Models (PDDM)

© Copyright 2020, pytorchrl. Revision ec0c319c.
