Reward Functions
Gym CartPole
- gym_reward_functions.cartpole(action: torch.Tensor, next_state: torch.Tensor) -> torch.Tensor
Based on https://arxiv.org/pdf/1907.02057.pdf
reward = cos(θ_t) - 0.01x²
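The formula above can be sketched in PyTorch as follows. This is a minimal illustration, not the module's actual implementation: the name `cartpole_reward_sketch` is hypothetical, the state layout `[x, x_dot, theta, theta_dot]` is assumed from the standard Gym CartPole observation, and the output shape `(batch,)` is an assumption as well.

```python
import torch


def cartpole_reward_sketch(action: torch.Tensor, next_state: torch.Tensor) -> torch.Tensor:
    """Hypothetical sketch of reward = cos(theta_t) - 0.01 * x^2."""
    # Assumed observation layout: [x, x_dot, theta, theta_dot].
    x = next_state[..., 0]
    theta = next_state[..., 2]
    # `action` does not appear in the formula; it is kept only to mirror
    # the documented signature.
    return torch.cos(theta) - 0.01 * x ** 2


# Two example states: upright at the origin, and upright one unit off-center.
next_state = torch.tensor([[0.0, 0.0, 0.0, 0.0],
                           [1.0, 0.0, 0.0, 0.0]])
action = torch.zeros(2, 1)
rewards = cartpole_reward_sketch(action, next_state)
```

The reward is highest when the pole is upright (θ = 0) and the cart sits at the origin, and the quadratic term penalizes drift along the track.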
Gym Pendulum
- gym_reward_functions.pendulum(action: torch.Tensor, next_state: torch.Tensor) -> torch.Tensor
MuJoCo Inverted Pendulum
- gym_reward_functions.inverted_pendulum_mujoco(action: torch.Tensor, next_state: torch.Tensor) -> torch.Tensor
Env info: https://github.com/openai/gym/blob/master/gym/envs/mujoco/inverted_pendulum.py
Reward function based on: https://arxiv.org/pdf/1907.02057.pdf
reward = - theta², where theta = state[1]
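As a rough sketch of the formula above, assuming batched states of shape `(batch, state_dim)` with the pole angle at index 1 as stated. The function name `inverted_pendulum_reward_sketch` and the `(batch,)` output shape are illustrative assumptions, not the module's confirmed behavior.

```python
import torch


def inverted_pendulum_reward_sketch(action: torch.Tensor, next_state: torch.Tensor) -> torch.Tensor:
    """Hypothetical sketch of reward = -theta^2, with theta = state[1]."""
    theta = next_state[..., 1]
    # `action` is unused by the formula; kept to mirror the documented signature.
    return -(theta ** 2)


# Two example states: pole tilted by 0.5 rad, and pole perfectly upright.
next_state = torch.tensor([[0.0, 0.5, 0.0, 0.0],
                           [0.1, 0.0, 0.0, 0.0]])
action = torch.zeros(2, 1)
rewards = inverted_pendulum_reward_sketch(action, next_state)
```

The reward is never positive and reaches its maximum of zero only when the pole is exactly upright.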
MuJoCo HalfCheetah
- gym_reward_functions.halfcheetah_mujoco(action: torch.Tensor, next_state: torch.Tensor) -> torch.Tensor
The first 8 values of the state are positions; the other 9 are velocities, with the linear velocities (x, y, z) first and the angular velocities after, so index 8 is the x velocity.
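The index arithmetic above can be illustrated with a short sketch that selects the x velocity from a batch of 17-dimensional states. The helper `halfcheetah_x_velocity` and the constant `X_VELOCITY_IDX` are hypothetical names used only for this example; how the module combines this value into a reward is not shown here.

```python
import torch

# Index of the x velocity: 8 position entries come first (indices 0 to 7).
X_VELOCITY_IDX = 8


def halfcheetah_x_velocity(next_state: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper: select the x velocity from batched states."""
    return next_state[..., X_VELOCITY_IDX]


# Dummy batch of two 17-dimensional states filled with 0..33, so the
# selected entries are easy to check by eye.
batch = torch.arange(34, dtype=torch.float32).reshape(2, 17)
x_vel = halfcheetah_x_velocity(batch)
```

Indexing with `...` keeps the helper usable for a single state of shape `(17,)` as well as for batches.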
PyBullet HalfCheetah
- pybullet_reward_functions.halfcheetah_bullet(action: torch.Tensor, next_state: torch.Tensor) -> torch.Tensor
In HalfCheetahBulletEnv-v0 the velocity is at index 3 of the state: https://github.com/bulletphysics/bullet3/blob/478da7469a34074aa051e8720734287ca371fd3e/examples/pybullet/gym/pybullet_envs/robot_locomotors.py#L64