pytorchrl.agent.storages package
Subpackages
- pytorchrl.agent.storages.model_based package
- pytorchrl.agent.storages.off_policy package
- Submodules
- pytorchrl.agent.storages.off_policy.ere_buffer module
- pytorchrl.agent.storages.off_policy.her_buffer module
- pytorchrl.agent.storages.off_policy.nstep_buffer module
- pytorchrl.agent.storages.off_policy.per_buffer module
- pytorchrl.agent.storages.off_policy.replay_buffer module
- Module contents
- pytorchrl.agent.storages.on_policy package
Submodules
pytorchrl.agent.storages.base module
- class pytorchrl.agent.storages.base.Storage(size, device, actor, algorithm, *args)[source]
Bases: abc.ABC
Base class for all storage components. It should serve as a template to create new Storage classes with new or extended features.
- abstract after_gradients(actor, algo, info, *args)[source]
Steps required after updating actor policy model.
- Parameters
actor (Actor class) – An actor class instance.
algo (Algo class) – An algorithm class instance.
info (dict) – Additional relevant info from gradient computation.
- Returns
info – info dict updated with relevant info from Storage.
- Return type
dict
- abstract before_gradients(actor, algo, *args)[source]
Steps required before updating actor policy model.
- Parameters
actor (Actor class) – An actor class instance.
algo (Algo class) – An algorithm class instance.
- abstract classmethod create_factory(size, *args)[source]
Returns a function to create new Storage instances.
- Parameters
size (int) – Storage capacity along time axis.
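The factory pattern described for create_factory can be sketched as follows. This is a minimal illustration, not the library's implementation: StorageSketch and ListStorage are hypothetical stand-ins, and the assumption that the returned function takes (device, actor, algorithm) is inferred only from the constructor signature listed above.

```python
from abc import ABC, abstractmethod


class StorageSketch(ABC):
    """Hypothetical stand-in for the documented Storage base class."""

    def __init__(self, size, device, actor, algorithm):
        self.size = size
        self.device = device
        self.actor = actor
        self.algo = algorithm

    @classmethod
    @abstractmethod
    def create_factory(cls, size, *args):
        """Return a function that creates new instances."""


class ListStorage(StorageSketch):
    """Hypothetical concrete subclass, used only for illustration."""

    @classmethod
    def create_factory(cls, size, *args):
        # Capture the configuration now and build the instance later,
        # once the device, actor and algorithm are known (assumed arguments).
        def create_instance(device, actor, algorithm):
            return cls(size, device, actor, algorithm)

        return create_instance


factory = ListStorage.create_factory(size=128)
storage = factory(device="cpu", actor=None, algorithm=None)
print(type(storage).__name__, storage.size)
```

Returning a closure keeps configuration (here, size) separate from the objects that only exist once a worker has been set up.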
- abstract generate_batches(num_mini_batch, mini_batch_size, num_epochs=1, *args)[source]
Returns a batch iterator to update actor critic.
- Parameters
num_mini_batch (int) – Number of mini batches per epoch.
mini_batch_size (int) – Number of samples contained in each mini batch.
num_epochs (int) – Number of epochs.
shuffle (bool) – Whether to shuffle collected data or generate sequential
- Yields
batch (dict) – Generated data batches.
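A batch iterator with the parameters listed above could look roughly like the sketch below. It is written as a standalone function over a dict of equal-length lists (plain lists stand in for tensors), and it accepts shuffle as a keyword argument, which is an assumption: shuffle appears in the parameter list but not in the documented signature.

```python
import random


def generate_batches(data, num_mini_batch, mini_batch_size, num_epochs=1, shuffle=True):
    """Yield dict batches from `data`, a dict of equal-length lists."""
    num_samples = len(next(iter(data.values())))
    for _ in range(num_epochs):
        indices = list(range(num_samples))
        if shuffle:
            random.shuffle(indices)  # new sample order for every epoch
        for b in range(num_mini_batch):
            batch_idx = indices[b * mini_batch_size:(b + 1) * mini_batch_size]
            if not batch_idx:
                break  # fewer samples than requested batches
            # Every key is indexed with the same positions, so fields stay aligned.
            yield {k: [v[i] for i in batch_idx] for k, v in data.items()}


data = {"obs": list(range(8)), "rew": [0.5 * i for i in range(8)]}
batches = list(
    generate_batches(data, num_mini_batch=2, mini_batch_size=4, num_epochs=3, shuffle=False)
)
print(len(batches), batches[0]["obs"])
```

With shuffle disabled, each epoch yields the same sequential slices, which makes the generator easy to check by hand.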
- abstract get_all_buffer_data(data_to_cpu=False, *args)[source]
Return all currently stored data. If data_to_cpu is True, moves data tensors to CPU memory.
- abstract init_tensors(sample, *args)[source]
Lazy initialization of data tensors from a sample.
- Parameters
sample (dict) – Data sample (containing all tensors of an environment transition)
- abstract insert_data_slice(new_data, *args)[source]
Add new_data to the data stored in the buffer.
- Parameters
new_data (dict) – Dictionary of env transition samples to replace self.data with.
- abstract insert_transition(sample, *args)[source]
Store new transition sample.
- Parameters
sample (dict) – Data sample (containing all tensors of an environment transition)
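The interplay between init_tensors and insert_transition can be illustrated with a small sketch. RingStorage is a hypothetical class, plain lists stand in for tensors, and the overwrite-oldest behaviour once the buffer is full is an assumption rather than something this page specifies.

```python
class RingStorage:
    """Hypothetical fixed-size buffer; plain lists stand in for tensors."""

    def __init__(self, size):
        self.size = size
        self.data = None      # allocated lazily on the first insert
        self.step = 0         # next write position
        self.num_stored = 0   # number of valid entries

    def init_tensors(self, sample):
        # One preallocated slot list per key found in the first sample.
        self.data = {k: [None] * self.size for k in sample}

    def insert_transition(self, sample):
        if self.data is None:
            self.init_tensors(sample)
        for k, v in sample.items():
            self.data[k][self.step] = v
        # Assumption: wrap around and overwrite the oldest entry when full.
        self.step = (self.step + 1) % self.size
        self.num_stored = min(self.num_stored + 1, self.size)


storage = RingStorage(size=3)
for t in range(4):
    storage.insert_transition({"obs": t, "rew": float(t)})
print(storage.data["obs"], storage.step)
```

Deferring allocation to the first sample means the storage does not need to know the keys or shapes of a transition in advance.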
- abstract update_storage_parameter(parameter_name, new_parameter_value, *args)[source]
If parameter_name is an attribute of the algorithm, change its value to new_parameter_value.
- Parameters
parameter_name (str) – Attribute name
new_parameter_value (int or float) – New value for parameter_name.
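A guarded attribute update of this kind is commonly written with hasattr and setattr. The sketch below assumes the attribute lives on the object that receives the call; ParamStorage and its alpha attribute are hypothetical names used only for illustration.

```python
class ParamStorage:
    """Hypothetical storage with one tunable attribute."""

    def __init__(self, size, alpha=0.6):
        self.size = size
        self.alpha = alpha  # hypothetical tunable parameter

    def update_storage_parameter(self, parameter_name, new_parameter_value):
        # Only overwrite attributes that already exist; unknown names are ignored
        # so that a typo cannot silently create a new attribute.
        if hasattr(self, parameter_name):
            setattr(self, parameter_name, new_parameter_value)


storage = ParamStorage(size=100)
storage.update_storage_parameter("alpha", 0.8)       # existing attribute: updated
storage.update_storage_parameter("not_a_param", 1)   # unknown attribute: no effect
print(storage.alpha, hasattr(storage, "not_a_param"))
```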