pytorchrl.agent.actors.reward_functions package
Submodules
pytorchrl.agent.actors.reward_functions.gym_reward_functions module
- pytorchrl.agent.actors.reward_functions.gym_reward_functions.angle_normalize(x: torch.Tensor)
- pytorchrl.agent.actors.reward_functions.gym_reward_functions.cartpole(state: torch.Tensor, action: torch.Tensor, next_state: torch.Tensor) → torch.Tensor
Reward function based on https://arxiv.org/pdf/1907.02057.pdf:
reward = cos(θ_t) - 0.01x², where θ_t is the pole angle and x is the cart position.
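The cartpole formula can be sketched in a few lines of PyTorch. This is an illustration only, not the package's code: the function name `cartpole_reward_sketch` is hypothetical, and the state layout `[x, x_dot, theta, theta_dot]` (the Gym CartPole convention) and the choice to read from a single state tensor are assumptions.

```python
import torch


def cartpole_reward_sketch(state: torch.Tensor) -> torch.Tensor:
    """Hypothetical sketch of reward = cos(theta_t) - 0.01 * x^2."""
    # Assumed layout along the last dimension: [x, x_dot, theta, theta_dot].
    x = state[..., 0]
    theta = state[..., 2]
    return torch.cos(theta) - 0.01 * x ** 2


# Two example states: upright at the origin, and upright one unit off-centre.
cartpole_states = torch.tensor([
    [0.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 0.0],
])
cartpole_rewards = cartpole_reward_sketch(cartpole_states)
```

Indexing with `...` lets the same function accept a single state of shape `(4,)` or a batch of shape `(N, 4)`.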
- pytorchrl.agent.actors.reward_functions.gym_reward_functions.halfcheetah_mujoco(state: torch.Tensor, action: torch.Tensor, next_state: torch.Tensor) → torch.Tensor
The first 8 values of the state are position data. The other 9 are velocities: the linear velocities (x, y, z) come first and the rest are angular, so index 8 is the x velocity.
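The state layout described above can be made concrete with a small slicing sketch. The helper names and constants below are hypothetical; only the split into 8 position values followed by 9 velocity values, and the use of index 8 for the x velocity, come from the description. How the package combines this velocity with other terms is not stated here, so no full reward is shown.

```python
import torch

# Hypothetical constants taken from the description: 8 positions, then velocities.
NUM_POSITIONS = 8
X_VELOCITY_IDX = 8


def split_halfcheetah_state(state: torch.Tensor):
    """Split a 17-dimensional state into position and velocity parts."""
    positions = state[..., :NUM_POSITIONS]
    velocities = state[..., NUM_POSITIONS:]
    return positions, velocities


def halfcheetah_x_velocity(state: torch.Tensor) -> torch.Tensor:
    """Read the x velocity, assumed to sit at index 8."""
    return state[..., X_VELOCITY_IDX]


# A dummy batch of one state whose entries equal their own index.
cheetah_state = torch.arange(17, dtype=torch.float32).unsqueeze(0)
cheetah_positions, cheetah_velocities = split_halfcheetah_state(cheetah_state)
```

Because the velocities start right after the positions, the x velocity is also the first entry of the velocity slice.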
- pytorchrl.agent.actors.reward_functions.gym_reward_functions.inverted_pendulum_mujoco(state: torch.Tensor, action: torch.Tensor, next_state: torch.Tensor) → torch.Tensor
Env info: https://github.com/openai/gym/blob/master/gym/envs/mujoco/inverted_pendulum.py
Reward function based on: https://arxiv.org/pdf/1907.02057.pdf
reward = -theta², where theta = state[1]
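A minimal sketch of the inverted pendulum formula follows. The function name is hypothetical, and reading `theta` from the last dimension so that batched inputs also work is an assumption; the text only states that `theta = state[1]`.

```python
import torch


def inverted_pendulum_reward_sketch(state: torch.Tensor) -> torch.Tensor:
    """Hypothetical sketch of reward = -theta^2 with theta = state[1]."""
    theta = state[..., 1]
    # Parentheses make explicit that the square is taken before negating.
    return -(theta ** 2)


# Example batch: a pole tilted by 0.5 rad and a perfectly upright pole.
pendulum_states = torch.tensor([
    [0.0, 0.5, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
])
pendulum_rewards = inverted_pendulum_reward_sketch(pendulum_states)
```

The reward is never positive and reaches its maximum of zero when the pole is upright.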
pytorchrl.agent.actors.reward_functions.pybullet_reward_functions module
- pytorchrl.agent.actors.reward_functions.pybullet_reward_functions.halfcheetah_bullet(state: torch.Tensor, action: torch.Tensor, next_state: torch.Tensor) → torch.Tensor
In HalfCheetahBulletEnv-v0 the velocity is at index 3 of the state: https://github.com/bulletphysics/bullet3/blob/478da7469a34074aa051e8720734287ca371fd3e/examples/pybullet/gym/pybullet_envs/robot_locomotors.py#L64
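As a rough illustration of the index noted above, the sketch below reads entry 3 from both an unbatched and a batched tensor. The helper name is hypothetical, the 5-element vectors are toy data rather than real observations, and whether the package reads `state` or `next_state` is not stated here.

```python
import torch

# Index taken from the description above.
BULLET_VELOCITY_IDX = 3


def bullet_velocity_sketch(obs: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper: return the velocity entry of each observation."""
    return obs[..., BULLET_VELOCITY_IDX]


# Toy data: one observation and a batch of two.
bullet_single = torch.tensor([0.0, 0.1, 0.2, 0.7, 0.4])
bullet_batch = torch.stack([bullet_single, 2 * bullet_single])

single_velocity = bullet_velocity_sketch(bullet_single)  # 0-dim tensor
batch_velocity = bullet_velocity_sketch(bullet_batch)    # shape (2,)
```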