pytorchrl.agent.actors.world_models package

Submodules

pytorchrl.agent.actors.world_models.utils module

class pytorchrl.agent.actors.world_models.utils.StandardScaler(device)[source]

Bases: object

fit(inputs, targets)[source]

Computes the mean and the standard deviation of the data and stores them as the scaler's internal mean and standard deviation.

Parameters
  • inputs (torch.Tensor) – A torch Tensor containing the input data.

  • targets (torch.Tensor) – A torch Tensor containing the target data.

inverse_transform(targets)[source]

Undoes the transformation performed by this scaler.

Parameters

targets (torch.Tensor) – A torch Tensor containing the points to be transformed.

Returns

output – The transformed dataset.

Return type

torch.Tensor

transform(inputs, targets=None)[source]

Transforms the input data using the parameters of this scaler.

Parameters
  • inputs (torch.Tensor) – A torch Tensor containing the points to be transformed.

  • targets (torch.Tensor, optional) – A torch Tensor containing the target points to be transformed. Defaults to None.

Returns

  • norm_inputs (torch.Tensor) – Normalized inputs

  • norm_targets (torch.Tensor) – Normalized targets
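The three methods above amount to fitting per-feature statistics and then applying or undoing a z-score normalization. The following is a minimal sketch of that idea, not the pytorchrl implementation: the class name SketchScaler, the eps guard, the attribute names, and the behavior of returning None for missing targets are all assumptions made for illustration.

```python
import torch


class SketchScaler:
    """Hypothetical stand-in for StandardScaler, not the pytorchrl source."""

    def __init__(self, device="cpu", eps=1e-8):
        self.device = torch.device(device)
        self.eps = eps  # assumed guard against division by zero

    def fit(self, inputs, targets):
        # Per-feature statistics computed over the batch dimension (dim 0).
        self.input_mu = inputs.mean(dim=0, keepdim=True).to(self.device)
        self.input_std = inputs.std(dim=0, keepdim=True).clamp_min(self.eps).to(self.device)
        self.target_mu = targets.mean(dim=0, keepdim=True).to(self.device)
        self.target_std = targets.std(dim=0, keepdim=True).clamp_min(self.eps).to(self.device)

    def transform(self, inputs, targets=None):
        norm_inputs = (inputs - self.input_mu) / self.input_std
        if targets is None:
            # Assumed behavior when no targets are passed.
            return norm_inputs, None
        norm_targets = (targets - self.target_mu) / self.target_std
        return norm_inputs, norm_targets

    def inverse_transform(self, targets):
        # Maps normalized targets back to the original scale.
        return targets * self.target_std + self.target_mu


torch.manual_seed(0)
inputs = torch.randn(128, 4) * 3.0 + 1.0
targets = torch.randn(128, 2) * 0.5 - 2.0

scaler = SketchScaler()
scaler.fit(inputs, targets)
norm_inputs, norm_targets = scaler.transform(inputs, targets)
restored = scaler.inverse_transform(norm_targets)
```

After fitting, the normalized inputs have approximately zero mean and unit standard deviation per feature, and inverse_transform recovers the original targets up to floating-point error.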

pytorchrl.agent.actors.world_models.world_model module

class pytorchrl.agent.actors.world_models.world_model.WorldModel(device, input_space, action_space, standard_scaler, hidden_size=64, reward_function=None)[source]

Bases: torch.nn.modules.module.Module

Model-based actor class for model-based algorithms.

It contains the dynamics network that predicts the next state and, if selected, the reward.

Parameters
  • input_space (gym.Space) – Environment observation space.

  • action_space (gym.Space) – Environment action space.

  • hidden_size (int) – Hidden layer size.

  • standard_scaler (StandardScaler) – StandardScaler class instance.

  • reward_function (func) – Reward function to be learned.

static check_dynamics_weights(parameter1, parameter2)[source]

create_dynamics()[source]

Creates a dynamics model and defines it as a class attribute under the given name.

Parameters

name (str) – dynamics model name.

predict_given_reward(states: torch.Tensor, actions: torch.Tensor) → Tuple[torch.Tensor, torch.Tensor][source]

Predicts the next state and calculates the reward using the given reward function.

Parameters
  • states (torch.Tensor) – Current state s

  • actions (torch.Tensor) – Action taken in state s

Returns

  • next_states (torch.Tensor) – Next states.

  • rewards (torch.Tensor) – Calculated reward.
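The idea of predicting the next state with a network and scoring it with a known reward function can be sketched as follows. This is an illustrative sketch only: SketchDynamics, predict_given_reward_sketch, example_reward, the layer layout, and the (states, actions, next_states) reward signature are hypothetical and are not taken from the pytorchrl source.

```python
import torch
import torch.nn as nn


class SketchDynamics(nn.Module):
    """Hypothetical dynamics network; the layer layout is an assumption."""

    def __init__(self, state_dim, action_dim, hidden_size=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden_size),
            nn.ReLU(),
            nn.Linear(hidden_size, state_dim),
        )

    def forward(self, states, actions):
        # The network sees the state and the action concatenated per sample.
        return self.net(torch.cat([states, actions], dim=-1))


def example_reward(states, actions, next_states):
    # Hypothetical reward: negative squared distance of the next state from the origin.
    return -(next_states ** 2).sum(dim=-1, keepdim=True)


def predict_given_reward_sketch(dynamics, reward_function, states, actions):
    with torch.no_grad():
        next_states = dynamics(states, actions)
    # The reward is computed analytically, not predicted by the network.
    rewards = reward_function(states, actions, next_states)
    return next_states, rewards


torch.manual_seed(0)
dynamics = SketchDynamics(state_dim=3, action_dim=2)
states, actions = torch.randn(8, 3), torch.randn(8, 2)
next_states, rewards = predict_given_reward_sketch(dynamics, example_reward, states, actions)
```

Here next_states has one row per input state and rewards has one value per row.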

predict_learned_reward(states: torch.Tensor, actions: torch.Tensor) → Tuple[torch.Tensor, torch.Tensor][source]

Predicts the next state and the reward with a learned reward function.

Parameters
  • states (torch.Tensor) – Current state s

  • actions (torch.Tensor) – Action taken in state s

Returns

  • next_states (torch.Tensor) – Next states.

  • rewards (torch.Tensor) – Reward prediction.
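One common way to learn the reward alongside the dynamics is to let the network emit one extra output and split it off. The sketch below assumes that layout; the network net, the function predict_learned_reward_sketch, and the choice of putting the reward in the last output column are hypothetical and may differ from what pytorchrl does.

```python
import torch
import torch.nn as nn

state_dim, action_dim, hidden_size = 3, 2, 64

# Hypothetical network whose output holds the next state plus one reward value.
net = nn.Sequential(
    nn.Linear(state_dim + action_dim, hidden_size),
    nn.ReLU(),
    nn.Linear(hidden_size, state_dim + 1),
)


def predict_learned_reward_sketch(states, actions):
    with torch.no_grad():
        out = net(torch.cat([states, actions], dim=-1))
    # First state_dim columns: next state. Last column: predicted reward.
    next_states = out[..., :state_dim]
    rewards = out[..., state_dim:]
    return next_states, rewards


torch.manual_seed(0)
states, actions = torch.randn(8, state_dim), torch.randn(8, action_dim)
next_states, rewards = predict_learned_reward_sketch(states, actions)
```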

reinitialize_dynamics_model()[source]

Re-initializes the dynamics model. This can be done before each new model learning run and might help to overcome over-fitting of the model in some environments.
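Re-initialization of a PyTorch module can be sketched by calling reset_parameters on every submodule that defines it, as built-in layers such as torch.nn.Linear do. The helper reinitialize below is hypothetical and only illustrates the idea; it is not the pytorchrl implementation.

```python
import torch
import torch.nn as nn


def reinitialize(module: nn.Module) -> None:
    # Built-in layers such as nn.Linear expose reset_parameters();
    # containers and activations without it are skipped.
    for layer in module.modules():
        if hasattr(layer, "reset_parameters"):
            layer.reset_parameters()


torch.manual_seed(0)
model = nn.Sequential(nn.Linear(5, 64), nn.ReLU(), nn.Linear(64, 3))

before = model[0].weight.detach().clone()
reinitialize(model)
after = model[0].weight.detach().clone()
```

The weights are redrawn from the layer's default initializer, so the tensor keeps its shape but its values change.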

training: bool

Module contents