pytorchrl
Agent Components
Actors
Off-Policy Actor
On-Policy Actor
Model-Based Actor
Actor Components
Action prob. distributions
Feature Extractors
Memory Networks
Noise
Reward Functions
World Models
Algorithms
Off-policy
Double Deep Q-Learning (DDQN)
Deep Deterministic Policy Gradient (DDPG)
Twin Delayed Deep Deterministic Policy Gradient (TD3)
Soft Actor Critic (SAC)
Maximum a Posteriori Policy Optimization (MPO)
On-policy
Advantage Actor Critic (A2C)
Proximal Policy Optimization (PPO)
Proximal Policy Optimization (PPO) with Random Network Distillation (RND)
Model-Based
Model Predictive Control (MPC) Random Shooting (RS)
Model Predictive Control (MPC) Cross-Entropy Method (CEM)
Model Predictive Control (MPC) Planning with Deep Dynamics Models (PDDM)
Available Environments
Environment Vectors
VecEnv
Storage
Off-policy
Replay Buffer
N-Step Replay Buffer
Prioritized Experience Replay Buffer
Emphasizing Recent Experience Replay Buffer (ERE)
Hindsight Experience Replay Buffer (HER)
On-policy
Vanilla On-Policy Buffer
Generalized Advantage Estimator (GAE) Buffer
V-trace Buffer
Proximal Policy Optimization with Demonstrations Buffer (PPOD)
Model-Based
Model-Based Replay Buffer