pytorchrl.agent.actors.world_models package

Submodules

pytorchrl.agent.actors.world_models.utils module

class pytorchrl.agent.actors.world_models.utils.StandardScaler(device)[source]

Bases: object

fit(inputs, targets)[source]

Computes the mean and standard deviation of the data and assigns them to the scaler's internal mean and standard deviation.

Parameters

inputs (torch.Tensor) – A torch Tensor containing the inputs.
targets (torch.Tensor) – A torch Tensor containing the targets.

inverse_transform(targets)[source]

Undoes the transformation performed by this scaler.

Parameters: targets (torch.Tensor) – A torch Tensor containing the points to be transformed.
Returns: output – The targets mapped back to their original scale.
Return type: torch.Tensor

transform(inputs, targets=None)[source]

Transforms the input matrix data using the parameters of this scaler.

Parameters

inputs (torch.Tensor) – A torch Tensor containing the points to be transformed.
targets (torch.Tensor) – A torch Tensor containing the points to be transformed.

Returns

norm_inputs (torch.Tensor) – Normalized inputs
norm_targets (torch.Tensor) – Normalized targets
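
The fit / transform / inverse_transform cycle described above can be sketched as follows. This is a minimal illustration, not the library's implementation: the class name SketchScaler, its attribute names, the per-feature statistics over the batch dimension, and the guard for near-zero standard deviations are all assumptions made for the example.

```python
import torch


class SketchScaler:
    """Hypothetical stand-in for StandardScaler (assumed behaviour, not library code)."""

    def __init__(self, device="cpu"):
        self.device = torch.device(device)
        self.input_mu = self.input_std = None
        self.target_mu = self.target_std = None

    @staticmethod
    def _stats(data):
        # Per-feature mean and standard deviation over the batch dimension.
        mu = data.mean(dim=0, keepdim=True)
        std = data.std(dim=0, keepdim=True)
        # Assumption: replace near-zero std by 1 to avoid division by zero.
        std = torch.where(std < 1e-12, torch.ones_like(std), std)
        return mu, std

    def fit(self, inputs, targets):
        self.input_mu, self.input_std = self._stats(inputs.to(self.device))
        self.target_mu, self.target_std = self._stats(targets.to(self.device))

    def transform(self, inputs, targets=None):
        norm_inputs = (inputs - self.input_mu) / self.input_std
        norm_targets = None
        if targets is not None:
            norm_targets = (targets - self.target_mu) / self.target_std
        return norm_inputs, norm_targets

    def inverse_transform(self, targets):
        # Undo the target normalization applied in transform().
        return targets * self.target_std + self.target_mu


torch.manual_seed(0)
inputs = torch.randn(128, 4) * 3.0 + 1.0   # made-up data
targets = torch.randn(128, 2) * 0.5 - 2.0  # made-up data

scaler = SketchScaler()
scaler.fit(inputs, targets)
norm_inputs, norm_targets = scaler.transform(inputs, targets)
restored = scaler.inverse_transform(norm_targets)
```

After fitting, the normalized inputs have approximately zero mean and unit standard deviation per feature, and inverse_transform recovers the original targets up to floating-point error.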

pytorchrl.agent.actors.world_models.world_model module

class pytorchrl.agent.actors.world_models.world_model.WorldModel(device, input_space, action_space, standard_scaler, hidden_size=64, reward_function=None)[source]

Bases: torch.nn.modules.module.Module

Model-Based Actor class for Model-Based algorithms.

It contains the dynamics network to predict the next state (and reward if selected).

Parameters

input_space (gym.Space) – Environment observation space.
action_space (gym.Space) – Environment action space.
hidden_size (int) – Hidden layer size.
standard_scaler (StandardScaler) – StandardScaler class instance.
reward_function (func) – Reward function to be learned.

static check_dynamics_weights(parameter1, parameter2)[source]

create_dynamics()[source]

Create a dynamics model and define it as class attribute under the name name.

Parameters: name (str) – dynamics model name.

predict_given_reward(states: torch.Tensor, actions: torch.Tensor) → Tuple[torch.Tensor, torch.Tensor][source]

Predicts the next state and calculates the reward using a given reward function.

Parameters

states (torch.Tensor) – Current state s
actions (torch.Tensor) – Action taken in state s

Returns

next_states (torch.Tensor) – Next states.
rewards (torch.Tensor) – Calculated reward.
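
As a rough sketch of the idea, prediction with a given reward function could look like the code below. Everything here is assumed for illustration: the helper name predict_given_reward_sketch, the small MLP used as dynamics network, the reward function signature (states, actions, next_states), and the choice to predict the next state directly. The actual WorldModel may differ on each of these points.

```python
import torch
import torch.nn as nn


def predict_given_reward_sketch(dynamics, reward_function, states, actions):
    """Hypothetical helper: predict next states, then score them with a known reward."""
    # Concatenate state and action into a single model input.
    model_input = torch.cat([states, actions], dim=-1)
    with torch.no_grad():
        next_states = dynamics(model_input)
    # Assumed signature: reward_function(states, actions, next_states).
    rewards = reward_function(states, actions, next_states)
    return next_states, rewards


obs_dim, act_dim, hidden_size = 3, 1, 64  # made-up sizes

# Assumed architecture for the example only.
dynamics = nn.Sequential(
    nn.Linear(obs_dim + act_dim, hidden_size),
    nn.ReLU(),
    nn.Linear(hidden_size, obs_dim),
)


def toy_reward(states, actions, next_states):
    # Made-up reward: penalize distance of the next state from the origin.
    return -(next_states ** 2).sum(dim=-1, keepdim=True)


states = torch.randn(5, obs_dim)
actions = torch.randn(5, act_dim)
next_states, rewards = predict_given_reward_sketch(dynamics, toy_reward, states, actions)
```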

predict_learned_reward(states: torch.Tensor, actions: torch.Tensor) → Tuple[torch.Tensor, torch.Tensor][source]

Predicts the next state and the reward using a learned reward function.

Parameters

states (torch.Tensor) – Current state s
actions (torch.Tensor) – Action taken in state s

Returns

next_states (torch.Tensor) – Next states.
rewards (torch.Tensor) – Reward prediction.
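
One common way to learn the reward alongside the dynamics is to give the network one extra output and split it off. The sketch below assumes that layout; the helper name predict_learned_reward_sketch, the architecture, and the convention that the last output column is the reward are illustrative assumptions, not confirmed details of WorldModel.

```python
import torch
import torch.nn as nn

obs_dim, act_dim, hidden_size = 3, 1, 64  # made-up sizes

# Assumed architecture: the output holds the next state plus one reward value.
model = nn.Sequential(
    nn.Linear(obs_dim + act_dim, hidden_size),
    nn.ReLU(),
    nn.Linear(hidden_size, obs_dim + 1),
)


def predict_learned_reward_sketch(model, states, actions):
    """Hypothetical helper: one forward pass yields next states and rewards."""
    with torch.no_grad():
        out = model(torch.cat([states, actions], dim=-1))
    # Assumption: the last column is the predicted reward.
    next_states, rewards = out[..., :-1], out[..., -1:]
    return next_states, rewards


states = torch.randn(5, obs_dim)
actions = torch.randn(5, act_dim)
pred_next_states, pred_rewards = predict_learned_reward_sketch(model, states, actions)
```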

reinitialize_dynamics_model()[source]

Re-initializes the dynamics model. This can be done before each new model learning run and may help in some environments to overcome overfitting of the model.
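
A minimal sketch of re-initializing a network in PyTorch is shown below. It relies on the reset_parameters() method that built-in layers such as nn.Linear provide. The helper name reinitialize_sketch and the example network are assumptions; the library's own method may re-create the model in a different way.

```python
import torch
import torch.nn as nn


def reinitialize_sketch(model):
    """Hypothetical helper: reset every submodule that defines reset_parameters()."""
    for module in model.modules():
        if hasattr(module, "reset_parameters"):
            module.reset_parameters()


torch.manual_seed(0)
dynamics = nn.Sequential(nn.Linear(4, 64), nn.ReLU(), nn.Linear(64, 3))

weights_before = dynamics[0].weight.detach().clone()
reinitialize_sketch(dynamics)
weights_after = dynamics[0].weight.detach().clone()
```

The weights are redrawn from the layer's default initialization, so any fit to previously collected data is discarded before the next learning run.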

training: bool
