pytorchrl.agent.actors.distributions package

Submodules

pytorchrl.agent.actors.distributions.categorical module

class pytorchrl.agent.actors.distributions.categorical.Categorical(num_inputs, num_outputs)[source]

Bases: torch.nn.modules.module.Module

Categorical probability distribution.

Parameters
  • num_inputs (int) – Size of input feature maps.

  • num_outputs (int) – Number of options in output space.

evaluate_pred(x, pred)[source]

Return log prob of pred under the distribution generated from x (obs features). Also return entropy of the generated distribution.

Parameters
  • x (torch.tensor) – obs feature map obtained from a policy_net.

  • pred (torch.tensor) – Prediction to evaluate.

Returns

  • logp (torch.tensor) – Log probability of pred according to the predicted distribution.

  • entropy_dist (torch.tensor) – Entropy of the predicted distribution.

  • dist (torch.Distribution) – Action probability distribution.

forward(x, deterministic=False)[source]

Predict distribution parameters from x (obs features) and return predictions (sampled and clipped), sampled log probability and distribution entropy.

Parameters
  • x (torch.tensor) – Feature maps extracted from environment observations.

  • deterministic (bool) – Whether to randomly sample from predicted distribution or take the mode.

Returns

  • pred (torch.tensor) – Predicted value.

  • clipped_pred (torch.tensor) – Predicted value (clipped to be within [-1, 1] range).

  • logp (torch.tensor) – Log probability of pred according to the predicted distribution.

  • entropy_dist (torch.tensor) – Entropy of the predicted distribution.

  • dist (torch.Distribution) – Action probability distribution.
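
The quantities returned by forward and evaluate_pred can be illustrated with a framework-free sketch. It uses only the Python standard library, and a plain list of logits stands in for the output of the layer that maps x to distribution parameters. The helper names (softmax, categorical_forward, categorical_evaluate_pred) are hypothetical and this is not the library's implementation; the dist return value and the clipped prediction are omitted.

```python
import math
import random


def softmax(logits):
    """Convert raw scores into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def categorical_forward(logits, deterministic=False, rng=random):
    """Pick an option, then report its log probability and the entropy."""
    probs = softmax(logits)
    if deterministic:
        # Mode of the distribution: the most probable option.
        pred = max(range(len(probs)), key=probs.__getitem__)
    else:
        # Random draw weighted by the probabilities.
        pred = rng.choices(range(len(probs)), weights=probs, k=1)[0]
    logp = math.log(probs[pred])
    entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
    return pred, logp, entropy


def categorical_evaluate_pred(logits, pred):
    """Score an already chosen option under the same distribution."""
    probs = softmax(logits)
    logp = math.log(probs[pred])
    entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
    return logp, entropy
```

Calling categorical_evaluate_pred with the prediction returned by categorical_forward on the same logits gives the same log probability and entropy, which is the relationship the two methods above describe.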

training: bool

pytorchrl.agent.actors.distributions.deterministic module

class pytorchrl.agent.actors.distributions.deterministic.Deterministic(num_inputs, num_outputs, noise)[source]

Bases: torch.nn.modules.module.Module

Deterministic prediction of the mean value mu of a learned action distribution.

Parameters
  • num_inputs (int) – Size of input feature maps.

  • num_outputs (int) – Number of dims in output space.

  • noise (str) – Type of noise that is added to the predicted mu.

evaluate_pred(x, pred)[source]

Predict distribution parameters from x (obs features) and return the predicted mu value of the distribution. The pred input parameter is ignored.

Parameters
  • x (torch.tensor) – Feature maps extracted from environment observations.

  • pred (torch.tensor) – Prediction to evaluate.

Returns

  • logp (torch.tensor) – Log probability of pred according to the predicted distribution.

  • entropy_dist (torch.tensor) – Entropy of the predicted distribution.

  • dist (torch.Distribution) – Action probability distribution.

forward(x, deterministic=True)[source]

Predict distribution parameters from x (obs features) and return the predicted noisy action mu of the distribution and the action clipped to [-1, 1].

Parameters
  • x (torch.tensor) – Feature maps extracted from environment observations.

  • deterministic (bool) – Whether noise is added to the predicted mu or not.

Returns

  • action (torch.tensor) – Next action sampled.

  • clipped_action (torch.tensor) – Next action sampled, but clipped to be within the env action space.

  • logp (None) – Returns logp ‘None’ to have equal output to other distributions.

  • entropy_dist (None) – Returns entropy_dist ‘None’ to have equal output to other distributions.

  • dist (torch.Distribution) – Action probability distribution.
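
The behaviour described for forward can be sketched without PyTorch. The sketch assumes that deterministic=True means no noise is added and that the noise is zero-mean Gaussian with a hypothetical noise_std scale; the text above does not specify either, and the library's noise argument is a string selecting a noise type. The function name deterministic_forward is illustrative and the dist return value is omitted.

```python
import random


def deterministic_forward(mu, deterministic=True, noise_std=0.1, rng=random):
    """Return (action, clipped_action, logp, entropy_dist) for a list mu."""
    if deterministic:
        action = list(mu)
    else:
        # Assumption: zero-mean Gaussian exploration noise per dimension.
        action = [m + rng.gauss(0.0, noise_std) for m in mu]
    # Clip every dimension to the [-1, 1] range.
    clipped_action = [max(-1.0, min(1.0, a)) for a in action]
    # A deterministic output has no log probability or entropy to report.
    return action, clipped_action, None, None
```

Returning None in the logp and entropy_dist positions keeps the output shape aligned with the other distribution classes, as noted in the Returns list.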

training: bool

class pytorchrl.agent.actors.distributions.deterministic.DeterministicMB(num_inputs: int, num_outputs: int)[source]

Bases: torch.nn.modules.module.Module

Deterministic ensemble output layer

Parameters
  • num_inputs (int) – Size of input feature maps.

  • num_outputs (int) – Output size of the gaussian layer.

  • ensemble_size (int) – Ensemble size in the output layer.

forward(x: torch.Tensor) → torch.Tensor[source]

Forward pass

training: bool

pytorchrl.agent.actors.distributions.gaussian module

class pytorchrl.agent.actors.distributions.gaussian.DiagGaussian(num_inputs, num_outputs, predict_log_std=False)[source]

Bases: torch.nn.modules.module.Module

Isotropic gaussian probability distribution.

Parameters
  • num_inputs (int) – Size of input feature maps.

  • num_outputs (int) – Number of dims in output space.

  • predict_log_std (bool) – Whether to use a nn.linear layer to predict the output std.

evaluate_pred(x, pred)[source]

Return log prob of pred under the distribution generated from x (obs features). Also return entropy of the generated distribution.

Parameters
  • x (torch.tensor) – obs feature map obtained from a policy_net.

  • pred (torch.tensor) – Prediction to evaluate.

Returns

  • logp (torch.tensor) – Log probability of pred according to the predicted distribution.

  • entropy_dist (torch.tensor) – Entropy of the predicted distribution.

  • dist (torch.Distribution) – Action probability distribution.

forward(x, deterministic=False)[source]

Predict distribution parameters from x (obs features) and return predicted values (sampled and clipped), sampled log probability and distribution entropy.

Parameters
  • x (torch.tensor) – Feature maps extracted from environment observations.

  • deterministic (bool) – Whether to randomly sample from predicted distribution or take the mode.

Returns

  • pred (torch.tensor) – Predicted value.

  • clipped_pred (torch.tensor) – Predicted value (clipped to be within [-1, 1] range).

  • logp (torch.tensor) – Log probability of pred according to the predicted distribution.

  • entropy_dist (torch.tensor) – Entropy of the predicted distribution.

  • dist (torch.Distribution) – Action probability distribution.
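
The log probability and entropy of a Gaussian with diagonal covariance can be worked through in plain Python. The sketch assumes the per-dimension terms are summed into a single value and that the mean and standard deviation are already available as lists; how the library produces them from x, and whether it reduces over dimensions in the same way, is not stated above. All function names are hypothetical and the dist return value is omitted.

```python
import math
import random

LOG_SQRT_2PI = 0.5 * math.log(2.0 * math.pi)


def diag_gaussian_logp(mean, std, pred):
    """Sum of independent per-dimension Gaussian log densities."""
    return sum(
        -0.5 * ((x - m) / s) ** 2 - math.log(s) - LOG_SQRT_2PI
        for m, s, x in zip(mean, std, pred)
    )


def diag_gaussian_entropy(std):
    """Entropy of a diagonal Gaussian; it depends only on std."""
    return sum(0.5 + LOG_SQRT_2PI + math.log(s) for s in std)


def diag_gaussian_forward(mean, std, deterministic=False, rng=random):
    """Return (pred, clipped_pred, logp, entropy_dist)."""
    if deterministic:
        # The mode of a Gaussian is its mean.
        pred = list(mean)
    else:
        pred = [rng.gauss(m, s) for m, s in zip(mean, std)]
    clipped_pred = [max(-1.0, min(1.0, x)) for x in pred]
    logp = diag_gaussian_logp(mean, std, pred)
    return pred, clipped_pred, logp, diag_gaussian_entropy(std)
```

Note that in this sketch the log probability is computed for the unclipped prediction; clipping only affects the value passed on as clipped_pred.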

training: bool

class pytorchrl.agent.actors.distributions.gaussian.DiagGaussianEnsemble(num_inputs: int, num_outputs: int, ensemble_size: int)[source]

Bases: torch.nn.modules.module.Module

Ensemble gaussian output layer

Parameters
  • num_inputs (int) – Size of input feature maps.

  • num_outputs (int) – Output size of the gaussian layer.

  • ensemble_size (int) – Ensemble size in the output layer.

forward(x: torch.Tensor)[source]

Forward pass

training: bool

pytorchrl.agent.actors.distributions.squashed_gaussian module

class pytorchrl.agent.actors.distributions.squashed_gaussian.SquashedGaussian(num_inputs, num_outputs, predict_log_std=True)[source]

Bases: torch.nn.modules.module.Module

Squashed Gaussian probability distribution.

Parameters
  • num_inputs (int) – Size of input feature maps.

  • num_outputs (int) – Number of dims in output space.

  • predict_log_std (bool) – Whether to use a nn.linear layer to predict the output std.

evaluate_pred(x, pred)[source]

Return log prob of pred under the distribution generated from x (obs features). Also return entropy of the generated distribution.

Parameters
  • x (torch.tensor) – obs feature map obtained from a policy_net.

  • pred (torch.tensor) – Prediction to evaluate.

Returns

  • logp (torch.tensor) – Log probability of pred according to the predicted distribution.

  • entropy_dist (torch.tensor) – Entropy of the predicted distribution.

  • dist (torch.Distribution) – Action probability distribution.

forward(x, deterministic=False)[source]

Predict distribution parameters from x (obs features) and return predicted values (sampled and clipped), sampled log probability and distribution entropy.

Parameters
  • x (torch.tensor) – Feature maps extracted from environment observations.

  • deterministic (bool) – Whether to randomly sample from predicted distribution or take the mode.

Returns

  • pred (torch.tensor) – Predicted value.

  • clipped_pred (torch.tensor) – Predicted value (clipped to be within [-1, 1] range).

  • logp (torch.tensor) – Log probability of pred according to the predicted distribution.

  • entropy_dist (torch.tensor) – Entropy of the predicted distribution.

  • dist (torch.Distribution) – Action probability distribution.
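
A common way to squash a Gaussian is to pass its sample through tanh, which requires a change-of-variables correction to the log probability. The sketch below assumes that construction; the text above does not say which squashing function the class uses, so treat this as an illustration of the general technique rather than the library's code. The names softplus, squash and squashed_gaussian_logp are hypothetical.

```python
import math

LOG_SQRT_2PI = 0.5 * math.log(2.0 * math.pi)


def softplus(z):
    """log(1 + exp(z)), computed without overflow for large |z|."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def squash(u):
    """Map an unbounded sample into the open interval (-1, 1)."""
    return [math.tanh(x) for x in u]


def squashed_gaussian_logp(mean, std, u):
    """Log density of a = tanh(u) when u ~ N(mean, diag(std**2))."""
    gaussian_logp = sum(
        -0.5 * ((x - m) / s) ** 2 - math.log(s) - LOG_SQRT_2PI
        for m, s, x in zip(mean, std, u)
    )
    # Change of variables: subtract log|d tanh(u)/du| = log(1 - tanh(u)**2),
    # written in the stable form 2 * (log 2 - u - softplus(-2u)).
    correction = sum(2.0 * (math.log(2.0) - x - softplus(-2.0 * x)) for x in u)
    return gaussian_logp - correction
```

The stable form of the correction avoids evaluating log(1 - tanh(u)**2) directly, which loses precision when tanh(u) is close to 1 or -1.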

training: bool

Module contents

pytorchrl.agent.actors.distributions.get_dist(name)[source]

Returns model class from name.
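
A name-to-class lookup of this kind is often a small dictionary. The sketch below shows the pattern with placeholder classes; the registry contents, the accepted name strings, the error behaviour and the function name get_dist_sketch are all assumptions, since the text above does not list them.

```python
class Categorical:
    """Placeholder standing in for the real distribution class."""


class DiagGaussian:
    """Placeholder standing in for the real distribution class."""


# Hypothetical registry; the names accepted by the real function may differ.
_DISTRIBUTIONS = {
    "Categorical": Categorical,
    "Gaussian": DiagGaussian,
}


def get_dist_sketch(name):
    """Return the class registered under name, or raise a clear error."""
    try:
        return _DISTRIBUTIONS[name]
    except KeyError:
        raise ValueError(f"Unknown distribution name: {name!r}") from None
```

The function returns the class itself, not an instance, so the caller still constructs it with arguments such as num_inputs and num_outputs.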