pytorchrl.agent.actors.distributions package

Submodules

pytorchrl.agent.actors.distributions.categorical module

class pytorchrl.agent.actors.distributions.categorical.Categorical(num_inputs, num_outputs)[source]

Bases: torch.nn.modules.module.Module

Categorical probability distribution.

Parameters

num_inputs (int) – Size of input feature maps.
num_outputs (int) – Number of options in output space.

evaluate_pred(x, pred)[source]

Return log prob of pred under the distribution generated from x (obs features). Also return entropy of the generated distribution.

Parameters

x (torch.tensor) – obs feature map obtained from a policy_net.
pred (torch.tensor) – Prediction to evaluate.

Returns

logp (torch.tensor) – Log probability of pred according to the predicted distribution.
entropy_dist (torch.tensor) – Entropy of the predicted distribution.
dist (torch.Distribution) – Action probability distribution.
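To make the returned quantities concrete, here is a minimal plain-Python sketch of how a log probability and an entropy follow from one logit per option. It assumes the layer maps x to logits and applies a softmax; that mapping and the helper names log_softmax and categorical_logp_entropy are illustrative assumptions, not pytorchrl code.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - log_z for v in logits]


def categorical_logp_entropy(logits, pred):
    """Return (log prob of option `pred`, entropy) for hypothetical logits."""
    log_probs = log_softmax(logits)
    # Entropy of a discrete distribution: -sum_i p_i * log(p_i).
    entropy = -sum(math.exp(lp) * lp for lp in log_probs)
    return log_probs[pred], entropy


# Uniform logits over 4 options: logp = -log(4) and entropy = log(4).
logp, entropy = categorical_logp_entropy([0.0, 0.0, 0.0, 0.0], pred=2)
```

In the library itself these values are returned as tensors, one entry per batch element, alongside the distribution object.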

forward(x, deterministic=False)[source]

Predict distribution parameters from x (obs features) and return the prediction (raw and clipped), the log probability of that prediction, and the distribution entropy.

Parameters

x (torch.tensor) – Feature maps extracted from environment observations.
deterministic (bool) – If True, take the mode of the predicted distribution; if False, sample from it at random.

Returns

pred (torch.tensor) – Predicted value.
clipped_pred (torch.tensor) – Predicted value (clipped to be within [-1, 1] range).
logp (torch.tensor) – Log probability of pred according to the predicted distribution.
entropy_dist (torch.tensor) – Entropy of the predicted distribution.
dist (torch.Distribution) – Action probability distribution.
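The effect of the deterministic flag can be sketched in plain Python as follows. The helper select_option and its rng argument are hypothetical names introduced for illustration; the sketch only shows the difference between taking the mode and drawing a sample, not how the library implements it.

```python
import random


def select_option(probs, deterministic=False, rng=random):
    """Pick an option index from a list of probabilities.

    deterministic=True takes the mode (argmax); otherwise an index is
    drawn at random in proportion to `probs`.
    """
    indices = range(len(probs))
    if deterministic:
        return max(indices, key=lambda i: probs[i])
    return rng.choices(indices, weights=probs, k=1)[0]


probs = [0.1, 0.7, 0.2]
mode = select_option(probs, deterministic=True)       # always index 1
sample = select_option(probs, rng=random.Random(0))   # any of 0, 1, 2
```

Taking the mode is typically used at evaluation time, while sampling keeps exploration during training.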

training: bool

pytorchrl.agent.actors.distributions.deterministic module

class pytorchrl.agent.actors.distributions.deterministic.Deterministic(num_inputs, num_outputs, noise)[source]

Bases: torch.nn.modules.module.Module

Deterministic prediction of the mean value mu of a learned action distribution.

Parameters

num_inputs (int) – Size of input feature maps.
num_outputs (int) – Number of dims in output space.
noise (str) – Type of noise that is added to the predicted mu.

evaluate_pred(x, pred)[source]

Predict distribution parameters from x (obs features) and return the predicted mu value of the distribution. The pred input parameter is ignored.

Parameters

x (torch.tensor) – Feature maps extracted from environment observations.
pred (torch.tensor) – Prediction to evaluate.

Returns

logp (torch.tensor) – Log probability of pred according to the predicted distribution.
entropy_dist (torch.tensor) – Entropy of the predicted distribution.
dist (torch.Distribution) – Action probability distribution.

forward(x, deterministic=True)[source]

Predict distribution parameters from x (obs features) and return the predicted (optionally noisy) action mu of the distribution, together with the same action clipped to the [-1, 1] range.

Parameters

x (torch.tensor) – Feature maps extracted from environment observations.
deterministic (bool) – Whether noise is added to the predicted mu or not.

Returns

action (torch.tensor) – Next action sampled.
clipped_action (torch.tensor) – Next action sampled, but clipped to be within the env action space.
logp (None) – Always None; returned so that the output matches that of the other distributions.
entropy_dist (None) – Always None; returned so that the output matches that of the other distributions.
dist (torch.Distribution) – Action probability distribution.
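A rough plain-Python sketch of the noise-then-clip behaviour described above is given here. It rests on three assumptions that the documentation does not confirm: that deterministic=True means no noise is added, that the noise is Gaussian, and that its scale is a fixed noise_std. The library instead selects its noise process through the noise argument, and noisy_action is a hypothetical helper.

```python
import random


def noisy_action(mu, deterministic=True, noise_std=0.1, rng=random):
    """Return (action, clipped_action) for a list of predicted means.

    The Gaussian noise model and noise_std are assumptions of this sketch.
    """
    if deterministic:
        action = list(mu)
    else:
        action = [m + rng.gauss(0.0, noise_std) for m in mu]
    # Clip every dimension to the [-1, 1] range.
    clipped_action = [max(-1.0, min(1.0, a)) for a in action]
    return action, clipped_action


action, clipped_action = noisy_action([0.3, 1.4, -2.0])
# action is unchanged; clipped_action is [0.3, 1.0, -1.0]
```

Returning both values lets the caller keep the unclipped action for learning while sending the clipped one to the environment.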

training: bool

class pytorchrl.agent.actors.distributions.deterministic.DeterministicMB(num_inputs: int, num_outputs: int)[source]

Bases: torch.nn.modules.module.Module

Deterministic ensemble output layer

Parameters

num_inputs (int) – Size of input feature maps.
num_outputs (int) – Output size of the layer.
ensemble_size (int) – Ensemble size in the output layer.

forward(x: torch.Tensor) → torch.Tensor[source]: Forward pass

training: bool

pytorchrl.agent.actors.distributions.gaussian module

class pytorchrl.agent.actors.distributions.gaussian.DiagGaussian(num_inputs, num_outputs, predict_log_std=False)[source]

Bases: torch.nn.modules.module.Module

Diagonal Gaussian probability distribution.

Parameters

num_inputs (int) – Size of input feature maps.
num_outputs (int) – Number of dims in output space.
predict_log_std (bool) – Whether to use an nn.Linear layer to predict the output std.

evaluate_pred(x, pred)[source]

Return log prob of pred under the distribution generated from x (obs features). Also return entropy of the generated distribution.

Parameters

x (torch.tensor) – obs feature map obtained from a policy_net.
pred (torch.tensor) – Prediction to evaluate.

Returns

logp (torch.tensor) – Log probability of pred according to the predicted distribution.
entropy_dist (torch.tensor) – Entropy of the predicted distribution.
dist (torch.Distribution) – Action probability distribution.
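For reference, the log probability and entropy of a diagonal Gaussian have closed forms, sketched below in plain Python. The function name diag_gaussian_logp_entropy and the (mean, log_std) parameterisation are assumptions for illustration; whether the library sums over action dimensions in the same way is also an assumption.

```python
import math

LOG_2PI = math.log(2.0 * math.pi)


def diag_gaussian_logp_entropy(mean, log_std, pred):
    """Return (logp of `pred`, entropy), summed over independent dimensions."""
    logp = 0.0
    entropy = 0.0
    for mu, ls, p in zip(mean, log_std, pred):
        z = (p - mu) / math.exp(ls)
        # Per-dimension Gaussian log density.
        logp += -0.5 * z * z - ls - 0.5 * LOG_2PI
        # Per-dimension Gaussian entropy: 0.5 * log(2 * pi * e * sigma^2).
        entropy += 0.5 + 0.5 * LOG_2PI + ls
    return logp, entropy


# Two unit-variance dimensions, evaluated at the mean:
# logp = -log(2*pi) and entropy = 1 + log(2*pi).
logp, entropy = diag_gaussian_logp_entropy([0.0, 0.0], [0.0, 0.0], [0.0, 0.0])
```

Note that the entropy depends only on the standard deviations, not on the mean or on pred.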

forward(x, deterministic=False)[source]

Predict distribution parameters from x (obs features) and return the predicted value (raw and clipped), the log probability of that value, and the distribution entropy.

Parameters

x (torch.tensor) – Feature maps extracted from environment observations.
deterministic (bool) – If True, take the mode of the predicted distribution; if False, sample from it at random.

Returns

pred (torch.tensor) – Predicted value.
clipped_pred (torch.tensor) – Predicted value (clipped to be within [-1, 1] range).
logp (torch.tensor) – Log probability of pred according to the predicted distribution.
entropy_dist (torch.tensor) – Entropy of the predicted distribution.
dist (torch.Distribution) – Action probability distribution.

training: bool

class pytorchrl.agent.actors.distributions.gaussian.DiagGaussianEnsemble(num_inputs: int, num_outputs: int, ensemble_size: int)[source]

Bases: torch.nn.modules.module.Module

Ensemble gaussian output layer

Parameters

num_inputs (int) – Size of input feature maps.
num_outputs (int) – Output size of the gaussian layer.
ensemble_size (int) – Ensemble size in the output layer.

forward(x: torch.Tensor)[source]: Forward pass

training: bool

pytorchrl.agent.actors.distributions.squashed_gaussian module

class pytorchrl.agent.actors.distributions.squashed_gaussian.SquashedGaussian(num_inputs, num_outputs, predict_log_std=True)[source]

Bases: torch.nn.modules.module.Module

Squashed Gaussian probability distribution.

Parameters

num_inputs (int) – Size of input feature maps.
num_outputs (int) – Number of dims in output space.
predict_log_std (bool) – Whether to use an nn.Linear layer to predict the output std.

evaluate_pred(x, pred)[source]

Return log prob of pred under the distribution generated from x (obs features). Also return entropy of the generated distribution.

Parameters

x (torch.tensor) – obs feature map obtained from a policy_net.
pred (torch.tensor) – Prediction to evaluate.

Returns

logp (torch.tensor) – Log probability of pred according to the predicted distribution.
entropy_dist (torch.tensor) – Entropy of the predicted distribution.
dist (torch.Distribution) – Action probability distribution.
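The documentation does not state which squashing function is used. Assuming the common choice of tanh, the log probability of a squashed sample needs a change-of-variables correction, sketched below in plain Python. The function squashed_gaussian_logp is hypothetical and works from the pre-squash value u; recovering u from an already squashed pred would require an inverse tanh, which this sketch leaves out.

```python
import math

LOG_2PI = math.log(2.0 * math.pi)


def squashed_gaussian_logp(mean, log_std, u, eps=1e-6):
    """Log density of a = tanh(u), where u ~ N(mean, exp(log_std)^2).

    Assumes tanh squashing; eps guards against log(0) near |a| = 1.
    """
    logp = 0.0
    for mu, ls, ui in zip(mean, log_std, u):
        z = (ui - mu) / math.exp(ls)
        logp += -0.5 * z * z - ls - 0.5 * LOG_2PI
        # Subtract log|d tanh(u)/du| = log(1 - tanh(u)^2).
        logp -= math.log(1.0 - math.tanh(ui) ** 2 + eps)
    return logp


u = [0.0]
action = [math.tanh(v) for v in u]          # squashed into (-1, 1)
logp = squashed_gaussian_logp([0.0], [0.0], u)
```

The correction term is what keeps the density properly normalised over the bounded action range.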

forward(x, deterministic=False)[source]

Predict distribution parameters from x (obs features) and return the predicted value (raw and clipped), the log probability of that value, and the distribution entropy.

Parameters

x (torch.tensor) – Feature maps extracted from environment observations.
deterministic (bool) – If True, take the mode of the predicted distribution; if False, sample from it at random.

Returns

pred (torch.tensor) – Predicted value.
clipped_pred (torch.tensor) – Predicted value (clipped to be within [-1, 1] range).
logp (torch.tensor) – Log probability of pred according to the predicted distribution.
entropy_dist (torch.tensor) – Entropy of the predicted distribution.
dist (torch.Distribution) – Action probability distribution.

training: bool

Module contents

pytorchrl.agent.actors.distributions.get_dist(name)[source]: Returns the distribution class that corresponds to name.
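A name-to-class lookup of this kind is often implemented as a small registry. The sketch below illustrates that pattern only: the stub classes, the registry keys "categorical" and "gaussian", and the function lookup_dist are all hypothetical, and the names actually accepted by get_dist are not documented here.

```python
class CategoricalStub:
    """Placeholder standing in for a discrete distribution class."""


class GaussianStub:
    """Placeholder standing in for a continuous distribution class."""


# Hypothetical registry keys; the real accepted names may differ.
_REGISTRY = {
    "categorical": CategoricalStub,
    "gaussian": GaussianStub,
}


def lookup_dist(name):
    """Return the class registered under `name`, or raise ValueError."""
    try:
        return _REGISTRY[name]
    except KeyError:
        known = ", ".join(sorted(_REGISTRY))
        raise ValueError(f"Unknown distribution {name!r}; expected one of: {known}")


dist_cls = lookup_dist("categorical")
```

Returning the class rather than an instance lets the caller supply num_inputs and num_outputs once the network dimensions are known.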