pytorchrl.agent.actors.distributions package
Submodules
pytorchrl.agent.actors.distributions.categorical module
- class pytorchrl.agent.actors.distributions.categorical.Categorical(num_inputs, num_outputs)[source]
Bases: torch.nn.modules.module.Module
Categorical probability distribution.
- Parameters
num_inputs (int) – Size of input feature maps.
num_outputs (int) – Number of options in output space.
- evaluate_pred(x, pred)[source]
Return log prob of pred under the distribution generated from x (obs features). Also return entropy of the generated distribution.
- Parameters
x (torch.tensor) – obs feature map obtained from a policy_net.
pred (torch.tensor) – Prediction to evaluate.
- Returns
logp (torch.tensor) – Log probability of pred according to the predicted distribution.
entropy_dist (torch.tensor) – Entropy of the predicted distribution.
dist (torch.Distribution) – Action probability distribution.
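The following is a minimal sketch of what an evaluate_pred of this kind typically computes. It is not the pytorchrl implementation: the class name CategoricalHead and the single linear layer producing logits are assumptions made for illustration, and only torch.distributions.Categorical is relied on.

```python
import torch
from torch import nn
from torch.distributions import Categorical


class CategoricalHead(nn.Module):
    """Hypothetical stand-in for illustration, not the pytorchrl class."""

    def __init__(self, num_inputs: int, num_outputs: int):
        super().__init__()
        # Assumption: one linear layer maps obs features to action logits.
        self.linear = nn.Linear(num_inputs, num_outputs)

    def evaluate_pred(self, x: torch.Tensor, pred: torch.Tensor):
        dist = Categorical(logits=self.linear(x))
        logp = dist.log_prob(pred)     # log probability of each given action
        entropy_dist = dist.entropy()  # entropy of each predicted distribution
        return logp, entropy_dist, dist


torch.manual_seed(0)
head = CategoricalHead(num_inputs=8, num_outputs=4)
x = torch.randn(5, 8)             # batch of 5 feature vectors
pred = torch.randint(0, 4, (5,))  # one discrete action per batch element
logp, entropy_dist, dist = head.evaluate_pred(x, pred)
max_entropy = torch.log(torch.tensor(4.0))  # entropy of a uniform distribution
```

In this sketch logp and entropy_dist have one entry per batch element, and the entropy is bounded above by log(num_outputs).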
- forward(x, deterministic=False)[source]
Predict distribution parameters from x (obs features) and return predictions (sampled and clipped), sampled log probability and distribution entropy.
- Parameters
x (torch.tensor) – Feature maps extracted from environment observations.
deterministic (bool) – Whether to randomly sample from predicted distribution or take the mode.
- Returns
pred (torch.tensor) – Predicted value.
clipped_pred (torch.tensor) – Predicted value (clipped to be within [-1, 1] range).
logp (torch.tensor) – Log probability of pred according to the predicted distribution.
entropy_dist (torch.tensor) – Entropy of the predicted distribution.
dist (torch.Distribution) – Action probability distribution.
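The role of the deterministic flag can be sketched as follows. The helper categorical_forward is hypothetical and takes logits directly. It leaves out the clipped output, since how the library treats clipping for discrete actions is not stated here.

```python
import torch
from torch.distributions import Categorical


def categorical_forward(logits: torch.Tensor, deterministic: bool = False):
    """Hypothetical helper: take the mode or sample from a categorical."""
    dist = Categorical(logits=logits)
    if deterministic:
        pred = dist.probs.argmax(dim=-1)  # mode of the distribution
    else:
        pred = dist.sample()              # random draw
    logp = dist.log_prob(pred)
    return pred, logp, dist.entropy(), dist


logits = torch.tensor([[0.1, 2.0, -1.0],
                       [3.0, 0.0, 0.5]])
mode_pred, mode_logp, entropy_dist, _ = categorical_forward(logits, deterministic=True)
sampled_pred, sampled_logp, _, _ = categorical_forward(logits)
```

With deterministic=True the result is the index of the largest logit in each row. Otherwise the index is drawn at random in proportion to the predicted probabilities.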
- training: bool
pytorchrl.agent.actors.distributions.deterministic module
- class pytorchrl.agent.actors.distributions.deterministic.Deterministic(num_inputs, num_outputs, noise)[source]
Bases: torch.nn.modules.module.Module
Deterministic prediction of the mean value mu of a learned action distribution.
- Parameters
num_inputs (int) – Size of input feature maps.
num_outputs (int) – Number of dims in output space.
noise (str) – Type of noise that is added to the predicted mu.
- evaluate_pred(x, pred)[source]
Predict distribution parameters from x (obs features) and return the predicted mu value of the distribution. The pred input parameter is ignored.
- Parameters
x (torch.tensor) – Feature maps extracted from environment observations.
pred (torch.tensor) – Prediction to evaluate.
- Returns
logp (torch.tensor) – Log probability of pred according to the predicted distribution.
entropy_dist (torch.tensor) – Entropy of the predicted distribution.
dist (torch.Distribution) – Action probability distribution.
- forward(x, deterministic=True)[source]
Predict distribution parameters from x (obs features) and return the predicted (optionally noisy) action mu of the distribution, together with the action clipped to the [-1, 1] range.
- Parameters
x (torch.tensor) – Feature maps extracted from environment observations.
deterministic (bool) – Whether noise is added to the predicted mu or not.
- Returns
action (torch.tensor) – Next action sampled.
clipped_action (torch.tensor) – Next action sampled, but clipped to be within the env action space.
logp (None) – Returns logp ‘None’ to have equal output to other distributions.
entropy_dist (None) – Returns entropy_dist ‘None’ to have equal output to other distributions.
dist (torch.Distribution) – Action probability distribution.
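A rough sketch of the noisy-action idea is shown below. The noise types the library accepts are not listed here, so the sketch assumes zero-mean Gaussian noise with a hypothetical noise_std parameter, and it assumes that deterministic=True means no noise is added. The dist output is omitted.

```python
import torch


def noisy_action(mu: torch.Tensor, deterministic: bool = True, noise_std: float = 0.1):
    """Hypothetical helper: optionally perturb mu, then clip to [-1, 1]."""
    if deterministic:
        action = mu
    else:
        # Assumption: additive Gaussian exploration noise.
        action = mu + noise_std * torch.randn_like(mu)
    clipped_action = torch.clamp(action, -1.0, 1.0)
    # logp and entropy are None, mirroring the documented return values.
    return action, clipped_action, None, None


mu = torch.tensor([[0.5, 1.7, -2.3]])
action, clipped_action, logp, entropy_dist = noisy_action(mu, deterministic=True)
noisy, noisy_clipped, _, _ = noisy_action(mu, deterministic=False)
```

Returning None for logp and entropy_dist keeps the tuple layout aligned with the stochastic distributions, so calling code can unpack every distribution the same way.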
- training: bool
- class pytorchrl.agent.actors.distributions.deterministic.DeterministicMB(num_inputs: int, num_outputs: int)[source]
Bases: torch.nn.modules.module.Module
Deterministic ensemble output layer.
- Parameters
num_inputs (int) – Size of input feature maps.
num_outputs (int) – Output size of the gaussian layer.
ensemble_size (int) – Ensemble size in the output layer.
- training: bool
pytorchrl.agent.actors.distributions.gaussian module
- class pytorchrl.agent.actors.distributions.gaussian.DiagGaussian(num_inputs, num_outputs, predict_log_std=False)[source]
Bases: torch.nn.modules.module.Module
Isotropic Gaussian probability distribution.
- Parameters
num_inputs (int) – Size of input feature maps.
num_outputs (int) – Number of dims in output space.
predict_log_std (bool) – Whether to use a nn.Linear layer to predict the output std.
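One common way to realize a predict_log_std switch is sketched below. The class GaussianParams is hypothetical. The state-independent learnable log-std used when the flag is False is an assumption about what the alternative to the linear layer looks like, not a description of the library's code.

```python
import torch
from torch import nn


class GaussianParams(nn.Module):
    """Hypothetical module producing the mean and std of a diagonal Gaussian."""

    def __init__(self, num_inputs: int, num_outputs: int, predict_log_std: bool = False):
        super().__init__()
        self.predict_log_std = predict_log_std
        self.mean = nn.Linear(num_inputs, num_outputs)
        if predict_log_std:
            # State-dependent log-std predicted from the features.
            self.log_std = nn.Linear(num_inputs, num_outputs)
        else:
            # Assumption: a learnable log-std shared across all inputs.
            self.log_std = nn.Parameter(torch.zeros(num_outputs))

    def forward(self, x: torch.Tensor):
        mean = self.mean(x)
        if self.predict_log_std:
            log_std = self.log_std(x)
        else:
            log_std = self.log_std.expand_as(mean)
        return mean, log_std.exp()


x = torch.randn(3, 4)
fixed = GaussianParams(num_inputs=4, num_outputs=2)
fixed_mean, fixed_std = fixed(x)
learned = GaussianParams(num_inputs=4, num_outputs=2, predict_log_std=True)
learned_mean, learned_std = learned(x)
```

Working in log space keeps the standard deviation strictly positive after exponentiation, whichever variant is used.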
- evaluate_pred(x, pred)[source]
Return log prob of pred under the distribution generated from x (obs features). Also return entropy of the generated distribution.
- Parameters
x (torch.tensor) – obs feature map obtained from a policy_net.
pred (torch.tensor) – Prediction to evaluate.
- Returns
logp (torch.tensor) – Log probability of pred according to the predicted distribution.
entropy_dist (torch.tensor) – Entropy of the predicted distribution.
dist (torch.Distribution) – Action probability distribution.
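For a diagonal Gaussian, the joint log probability and entropy are sums over the independent action dimensions. The sketch below illustrates this with torch.distributions.Normal. The helper name and the choice to sum over the last dimension are assumptions, and the output shapes in the library may differ.

```python
import math

import torch
from torch.distributions import Normal


def diag_gaussian_evaluate(mean: torch.Tensor, std: torch.Tensor, pred: torch.Tensor):
    """Hypothetical helper: log prob and entropy of a diagonal Gaussian."""
    dist = Normal(mean, std)
    logp = dist.log_prob(pred).sum(dim=-1)     # independent dims: add log probs
    entropy_dist = dist.entropy().sum(dim=-1)  # entropies add as well
    return logp, entropy_dist, dist


mean = torch.zeros(2, 3)
std = torch.ones(3)
pred = torch.zeros(2, 3)  # evaluate exactly at the mean
logp, entropy_dist, dist = diag_gaussian_evaluate(mean, std, pred)

# Closed-form values for a unit-variance Gaussian evaluated at its mean.
expected_logp = 3 * (-0.5 * math.log(2 * math.pi))
expected_entropy = 3 * (0.5 + 0.5 * math.log(2 * math.pi))
```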
- forward(x, deterministic=False)[source]
Predict distribution parameters from x (obs features) and return predicted values (sampled and clipped), sampled log probability and distribution entropy.
- Parameters
x (torch.tensor) – Feature maps extracted from environment observations.
deterministic (bool) – Whether to randomly sample from predicted distribution or take the mode.
- Returns
pred (torch.tensor) – Predicted value.
clipped_pred (torch.tensor) – Predicted value (clipped to be within [-1, 1] range).
logp (torch.tensor) – Log probability of pred according to the predicted distribution.
entropy_dist (torch.tensor) – Entropy of the predicted distribution.
dist (torch.Distribution) – Action probability distribution.
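The relation between pred, clipped_pred and logp can be sketched as follows. The helper diag_gaussian_forward is hypothetical. It assumes the log probability is computed for the unclipped prediction, which is consistent with the return descriptions above but is not confirmed against the source code.

```python
import torch
from torch.distributions import Normal


def diag_gaussian_forward(mean: torch.Tensor, std: torch.Tensor, deterministic: bool = False):
    """Hypothetical helper: sample (or take the mean), clip, and score."""
    dist = Normal(mean, std)
    # For a Gaussian the mode coincides with the mean.
    pred = dist.mean if deterministic else dist.sample()
    clipped_pred = torch.clamp(pred, -1.0, 1.0)
    logp = dist.log_prob(pred).sum(dim=-1)  # scored on the unclipped value
    entropy_dist = dist.entropy().sum(dim=-1)
    return pred, clipped_pred, logp, entropy_dist, dist


mean = torch.tensor([[0.2, 1.5]])
std = torch.full((1, 2), 0.5)
pred, clipped_pred, logp, entropy_dist, dist = diag_gaussian_forward(
    mean, std, deterministic=True
)
sampled, sampled_clipped, _, _, _ = diag_gaussian_forward(mean, std)
```

Keeping both values lets the clipped action be sent to the environment while the unclipped one remains available for computing log probabilities later.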
- training: bool
- class pytorchrl.agent.actors.distributions.gaussian.DiagGaussianEnsemble(num_inputs: int, num_outputs: int, ensemble_size: int)[source]
Bases: torch.nn.modules.module.Module
Ensemble Gaussian output layer.
- Parameters
num_inputs (int) – Size of input feature maps.
num_outputs (int) – Output size of the gaussian layer.
ensemble_size (int) – Ensemble size in the output layer.
- training: bool
pytorchrl.agent.actors.distributions.squashed_gaussian module
- class pytorchrl.agent.actors.distributions.squashed_gaussian.SquashedGaussian(num_inputs, num_outputs, predict_log_std=True)[source]
Bases: torch.nn.modules.module.Module
Squashed Gaussian probability distribution.
- Parameters
num_inputs (int) – Size of input feature maps.
num_outputs (int) – Number of dims in output space.
predict_log_std (bool) – Whether to use a nn.Linear layer to predict the output std.
- evaluate_pred(x, pred)[source]
Return log prob of pred under the distribution generated from x (obs features). Also return entropy of the generated distribution.
- Parameters
x (torch.tensor) – obs feature map obtained from a policy_net.
pred (torch.tensor) – Prediction to evaluate.
- Returns
logp (torch.tensor) – Log probability of pred according to the predicted distribution.
entropy_dist (torch.tensor) – Entropy of the predicted distribution.
dist (torch.Distribution) – Action probability distribution.
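Assuming the squashing function is tanh (the usual choice, though not stated here), the log probability of a squashed prediction follows from the change-of-variables formula: map pred back through atanh, score it under the Gaussian, and subtract log(1 - pred^2) per dimension. The helper below is a hypothetical sketch of that calculation, with eps added only for numerical stability.

```python
import math

import torch
from torch.distributions import Normal


def squashed_log_prob(mean: torch.Tensor, std: torch.Tensor, pred: torch.Tensor,
                      eps: float = 1e-6):
    """Hypothetical helper: log prob of a tanh-squashed Gaussian sample."""
    dist = Normal(mean, std)
    pred = pred.clamp(-1.0 + eps, 1.0 - eps)  # keep atanh finite
    u = torch.atanh(pred)                     # pre-squash value
    # Change of variables: subtract the log-derivative of tanh, log(1 - tanh(u)^2).
    logp = dist.log_prob(u) - torch.log(1.0 - pred.pow(2) + eps)
    return logp.sum(dim=-1)


mean = torch.zeros(1, 2)
std = torch.ones(1, 2)
pred = torch.zeros(1, 2)
logp = squashed_log_prob(mean, std, pred)

# At pred = 0 the correction vanishes, leaving two unit-Gaussian terms.
expected_at_zero = -math.log(2 * math.pi)
```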
- forward(x, deterministic=False)[source]
Predict distribution parameters from x (obs features) and return predicted values (sampled and clipped), sampled log probability and distribution entropy.
- Parameters
x (torch.tensor) – Feature maps extracted from environment observations.
deterministic (bool) – Whether to randomly sample from predicted distribution or take the mode.
- Returns
pred (torch.tensor) – Predicted value.
clipped_pred (torch.tensor) – Predicted value (clipped to be within [-1, 1] range).
logp (torch.tensor) – Log probability of pred according to the predicted distribution.
entropy_dist (torch.tensor) – Entropy of the predicted distribution.
dist (torch.Distribution) – Action probability distribution.
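A sampling step for a squashed Gaussian can be sketched as below, again assuming tanh squashing. The helper squashed_forward is hypothetical. It uses rsample so that gradients can flow through the sample, which is a common design choice rather than something confirmed for this class.

```python
import torch
from torch.distributions import Normal


def squashed_forward(mean: torch.Tensor, std: torch.Tensor, deterministic: bool = False,
                     eps: float = 1e-6):
    """Hypothetical helper: draw a pre-squash value, apply tanh, and score it."""
    dist = Normal(mean, std)
    u = dist.mean if deterministic else dist.rsample()  # reparameterized draw
    pred = torch.tanh(u)                                # bounded to [-1, 1]
    logp = (dist.log_prob(u) - torch.log(1.0 - pred.pow(2) + eps)).sum(dim=-1)
    return pred, logp, dist


mean = torch.tensor([[0.0, 5.0]])
std = torch.ones(1, 2)
pred, logp, dist = squashed_forward(mean, std, deterministic=True)
sampled_pred, sampled_logp, _ = squashed_forward(mean, std)
```

Because tanh already bounds the output, a prediction produced this way needs no further clipping to stay within [-1, 1].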
- training: bool