pytorchrl.agent.actors.reward_functions package

Submodules

pytorchrl.agent.actors.reward_functions.gym_reward_functions module

pytorchrl.agent.actors.reward_functions.gym_reward_functions.angle_normalize(x: torch.Tensor)
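
No description accompanies angle_normalize. If it mirrors the helper of the same name in Gym's Pendulum environment (an assumption, not confirmed by this page), it wraps an angle in radians into the interval [-π, π). A minimal sketch under that assumption, using the hypothetical name angle_normalize_sketch:

```python
import math

import torch


def angle_normalize_sketch(x: torch.Tensor) -> torch.Tensor:
    """Wrap angles (radians) into [-pi, pi).

    Assumption: same formula as Gym's Pendulum helper; the pytorchrl
    implementation may differ.
    """
    # Shift by pi, take the modulo over one full turn, then shift back.
    # For tensors, % takes the sign of the divisor, so the result is >= 0.
    return (x + math.pi) % (2 * math.pi) - math.pi


angles = torch.tensor([0.0, 1.5 * math.pi, -1.5 * math.pi])
wrapped = angle_normalize_sketch(angles)
```

Under this formula, 1.5π wraps to -0.5π and -1.5π wraps to 0.5π.
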
pytorchrl.agent.actors.reward_functions.gym_reward_functions.cartpole(state: torch.Tensor, action: torch.Tensor, next_state: torch.Tensor) → torch.Tensor

Reward function based on: https://arxiv.org/pdf/1907.02057.pdf

reward = cos(θ_t) - 0.01x², where θ_t is the pole angle and x is the cart position
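
The formula can be sketched for batched tensors as follows. This is an illustration, not the pytorchrl source: it assumes the Gym CartPole observation layout [x, x_dot, theta, theta_dot], and it assumes the reward is evaluated on next_state. Neither detail is stated on this page, and the name cartpole_reward_sketch is hypothetical.

```python
import torch


def cartpole_reward_sketch(
    state: torch.Tensor, action: torch.Tensor, next_state: torch.Tensor
) -> torch.Tensor:
    """Compute cos(theta) - 0.01 * x**2 for each row of a batch."""
    # Assumed Gym CartPole layout: [x, x_dot, theta, theta_dot].
    x = next_state[..., 0]      # cart position (assumed index)
    theta = next_state[..., 2]  # pole angle in radians (assumed index)
    return torch.cos(theta) - 0.01 * x.pow(2)


next_state = torch.tensor([[0.0, 0.0, 0.0, 0.0], [2.0, 0.0, 0.0, 0.0]])
reward = cartpole_reward_sketch(next_state, torch.zeros(2, 1), next_state)
```

An upright pole with the cart at the origin scores 1.0; moving the cart to x = 2 subtracts 0.01 · 2² = 0.04.
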

pytorchrl.agent.actors.reward_functions.gym_reward_functions.halfcheetah_mujoco(state: torch.Tensor, action: torch.Tensor, next_state: torch.Tensor) → torch.Tensor

The first 8 values in the state are position data. The other 9 are velocities: the positional (x, y, z) velocities first, then the angular velocities. Index 8 is therefore the x velocity (x_velocity).
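
To illustrate the indexing, here is a hedged sketch of a velocity-based reward. Only the index (8) comes from this page. The control penalty and its weight of 0.1 are assumptions borrowed from Gym's HalfCheetah reward, reading the velocity from next_state is also an assumption, and halfcheetah_reward_sketch is a hypothetical name.

```python
import torch

X_VELOCITY_IDX = 8  # first entry after the 8 position values


def halfcheetah_reward_sketch(
    state: torch.Tensor,
    action: torch.Tensor,
    next_state: torch.Tensor,
    ctrl_cost_weight: float = 0.1,  # assumed, mirrors Gym's HalfCheetah
) -> torch.Tensor:
    """Forward velocity minus a quadratic action penalty, per batch row."""
    forward_velocity = next_state[..., X_VELOCITY_IDX]
    ctrl_cost = ctrl_cost_weight * action.pow(2).sum(dim=-1)
    return forward_velocity - ctrl_cost


next_state = torch.zeros(2, 17)           # 8 positions + 9 velocities
next_state[:, X_VELOCITY_IDX] = torch.tensor([1.0, 2.0])
action = torch.ones(2, 6)                 # 6 actuators, each at full torque
reward = halfcheetah_reward_sketch(torch.zeros(2, 17), action, next_state)
```

With unit actions on 6 actuators the penalty is 0.1 · 6 = 0.6, so the two rows score 0.4 and 1.4.
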

pytorchrl.agent.actors.reward_functions.gym_reward_functions.inverted_pendulum_mujoco(state: torch.Tensor, action: torch.Tensor, next_state: torch.Tensor) → torch.Tensor

Env info: https://github.com/openai/gym/blob/master/gym/envs/mujoco/inverted_pendulum.py

Reward function based on: https://arxiv.org/pdf/1907.02057.pdf

reward = - theta², where theta = state[1]
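
A minimal batched sketch of this formula follows. It reads theta from index 1 of state exactly as written above. Whether the library evaluates state or next_state is not specified here, so treat that choice as an assumption. The name inverted_pendulum_reward_sketch is hypothetical.

```python
import torch


def inverted_pendulum_reward_sketch(
    state: torch.Tensor, action: torch.Tensor, next_state: torch.Tensor
) -> torch.Tensor:
    """Compute -theta**2 per batch row, with theta taken from index 1."""
    theta = state[..., 1]
    return -theta.pow(2)


state = torch.tensor([[0.0, 0.5, 0.0, 0.0], [0.1, -0.2, 0.0, 0.0]])
reward = inverted_pendulum_reward_sketch(state, torch.zeros(2, 1), state)
```

The reward is 0 when the pole is exactly upright and becomes more negative as the angle grows in either direction.
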

pytorchrl.agent.actors.reward_functions.gym_reward_functions.pendulum(state: torch.Tensor, action: torch.Tensor, next_state: torch.Tensor) → torch.Tensor

pytorchrl.agent.actors.reward_functions.pybullet_reward_functions module

pytorchrl.agent.actors.reward_functions.pybullet_reward_functions.halfcheetah_bullet(state: torch.Tensor, action: torch.Tensor, next_state: torch.Tensor) → torch.Tensor

In HalfCheetahBulletEnv-v0, the velocity is at index 3 of the state. See: https://github.com/bulletphysics/bullet3/blob/478da7469a34074aa051e8720734287ca371fd3e/examples/pybullet/gym/pybullet_envs/robot_locomotors.py#L64

Module contents

pytorchrl.agent.actors.reward_functions.get_reward_function(env_id)