pytorchrl.scheme package
Subpackages
Submodules
pytorchrl.scheme.scheme module
- class pytorchrl.scheme.scheme.Scheme(algo_factory, actor_factory, storage_factory, train_envs_factory, test_envs_factory=<function Scheme.<lambda>>, num_col_workers=1, col_compress_data=False, col_workers_communication='synchronous', col_workers_resources={'num_cpus': 1, 'num_gpus': 0.5}, col_preemption_thresholds={'fraction_samples': 1.0, 'fraction_workers': 1.0}, num_grad_workers=1, grad_compress_data=False, grad_workers_communication='synchronous', grad_workers_resources={'num_cpus': 1, 'num_gpus': 0.5}, local_device=None, decentralized_update_execution=False)
Bases: object
Class to define training schemes and handle the creation and operation of their workers.
- Parameters
algo_factory (func) – A function that creates an algorithm class.
actor_factory (func) – A function that creates a policy.
storage_factory (func) – A function that creates a rollout storage.
train_envs_factory (func) – A function to create train environments.
test_envs_factory (func) – A function to create test environments.
num_col_workers (int) – Number of data collection workers per gradient worker.
col_workers_communication (str) – Communication coordination pattern for data collection.
col_workers_resources (dict) – Ray resource specs for collection remote workers.
col_preemption_thresholds (dict) – Minimum fraction of samples (fraction_samples, in [0, 1.0]) and minimum fraction of workers (fraction_workers, in [0, 1.0]) required in synchronous data collection.
num_grad_workers (int) – Number of gradient workers.
grad_workers_communication (str) – Communication coordination pattern for gradient computation workers.
grad_workers_resources (dict) – Ray resource specs for gradient remote workers.
local_device (str) – “cpu” or specific GPU “cuda:number” to use for computation.
decentralized_update_execution (bool) – Whether the gradients are applied in the update workers (central update) or broadcast to all gradient workers for a decentralized update.
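The parameters above can be collected into a keyword dictionary before the scheme is built. The following is a minimal sketch, not taken from the library: `placeholder_factory` and `validate_preemption_thresholds` are hypothetical names introduced here for illustration, and the threshold values are arbitrary. The `Scheme` call itself is left as a comment because it needs pytorchrl, Ray and real factories.

```python
def validate_preemption_thresholds(thresholds):
    """Hypothetical helper (not part of pytorchrl): check both fractions lie in [0, 1]."""
    for key in ("fraction_samples", "fraction_workers"):
        value = thresholds[key]
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{key} must be in [0, 1], got {value}")
    return thresholds


def placeholder_factory(*args, **kwargs):
    """Hypothetical stand-in; real factories build the algorithm, policy, storage and envs."""
    raise NotImplementedError("replace with a real factory function")


# Keyword names and defaults follow the class signature documented above.
scheme_kwargs = dict(
    algo_factory=placeholder_factory,
    actor_factory=placeholder_factory,
    storage_factory=placeholder_factory,
    train_envs_factory=placeholder_factory,
    num_col_workers=2,
    col_workers_communication="synchronous",
    col_workers_resources={"num_cpus": 1, "num_gpus": 0.5},
    col_preemption_thresholds=validate_preemption_thresholds(
        {"fraction_samples": 0.8, "fraction_workers": 0.8}
    ),
    num_grad_workers=1,
    grad_workers_communication="synchronous",
    grad_workers_resources={"num_cpus": 1, "num_gpus": 0.5},
    local_device=None,
    decentralized_update_execution=False,
)

# With pytorchrl and Ray installed, and real factories in place:
# from pytorchrl.scheme.scheme import Scheme
# scheme = Scheme(**scheme_kwargs)
```

Validating the thresholds up front is a design choice of this sketch; whether the class performs its own range checks is not stated in the documentation above.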
pytorchrl.scheme.utils module
- pytorchrl.scheme.utils.average_gradients(grads_list)
Averages gradients coming from distributed workers.
- Parameters
grads_list (list of lists of tensors) – List of actor gradients from different workers.
- Returns
avg_grads – Averaged actor gradients.
- Return type
list of tensors
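As an illustration of the averaging described above, here is a minimal sketch under the assumption that every worker returns its gradients in the same parameter order. `average_gradients_sketch` is a hypothetical name, not the library function, and the real implementation may treat missing gradients or devices differently. The code only relies on `+` and `/`, so it works for tensors, arrays or plain floats alike; floats are used here to keep the example dependency-free.

```python
def average_gradients_sketch(grads_list):
    """Average per-parameter gradients across workers (illustrative only).

    grads_list holds one inner list per worker; the i-th entry of every
    inner list is assumed to be the gradient of the same parameter.
    """
    num_workers = len(grads_list)
    # zip(*grads_list) groups the i-th gradient of every worker together.
    return [sum(param_grads) / num_workers for param_grads in zip(*grads_list)]


# Two workers, two parameters each (plain floats stand in for tensors).
worker_a = [1.0, 2.0]
worker_b = [3.0, 6.0]
avg_grads = average_gradients_sketch([worker_a, worker_b])
print(avg_grads)  # [2.0, 4.0]
```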
- pytorchrl.scheme.utils.pack(data)
From https://github.com/ray-project/ray/blob/master/rllib/utils/compression.py
- pytorchrl.scheme.utils.ray_get_and_free(object_ids)
Call ray.get and then queue the object ids for deletion. This function should be used whenever possible in RLlib, to optimize memory usage. The only exception is when an object_id is shared among multiple readers.
Adapted from https://github.com/ray-project/ray/blob/master/rllib/utils/memory.py
- Parameters
object_ids (ObjectID|List[ObjectID]) – Object ids to fetch and free.
- Returns
result – The result of ray.get(object_ids).
- Return type
Python objects
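The "fetch, then queue ids for deletion" pattern described above can be sketched without Ray by injecting the fetch and free operations as callables. Everything in this block is an assumption made for illustration: `DeferredFreeQueue`, `max_pending` and `max_delay_s` are hypothetical names, and the actual batching thresholds used by the library are not documented here. With Ray, `get_fn` would typically be `ray.get`; the call used to free objects depends on the Ray version, so it is left as a parameter.

```python
import time


class DeferredFreeQueue:
    """Fetch objects, then free their ids in batches (hypothetical helper)."""

    def __init__(self, get_fn, free_fn, max_pending=100, max_delay_s=10.0):
        self.get_fn = get_fn            # e.g. ray.get in a real setup
        self.free_fn = free_fn          # version-dependent free call in a real setup
        self.max_pending = max_pending
        self.max_delay_s = max_delay_s
        self._pending = []
        self._last_free = time.monotonic()

    def get_and_free(self, object_ids):
        result = self.get_fn(object_ids)
        # Accept a single id or a list of ids, as in the documented signature.
        if isinstance(object_ids, list):
            self._pending.extend(object_ids)
        else:
            self._pending.append(object_ids)
        now = time.monotonic()
        # Free in batches: once enough ids are queued or enough time has passed.
        if len(self._pending) > self.max_pending or now - self._last_free > self.max_delay_s:
            self.free_fn(self._pending)
            self._pending = []
            self._last_free = now
        return result


# Stand-in object store so the sketch runs without Ray.
store = {"a": 1, "b": 2, "c": 3}
freed = []


def fake_get(ids):
    return [store[i] for i in ids] if isinstance(ids, list) else store[ids]


queue = DeferredFreeQueue(fake_get, freed.extend, max_pending=2, max_delay_s=3600.0)
first = queue.get_and_free(["a", "b"])  # two ids queued, nothing freed yet
second = queue.get_and_free("c")        # three ids queued, batch is freed
```

Batching the free calls keeps the per-fetch overhead low. As the description above notes, ids shared among multiple readers should not go through this path, since freeing them would invalidate the other readers' references.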
- pytorchrl.scheme.utils.unpack(data)
From https://github.com/ray-project/ray/blob/master/rllib/utils/compression.py
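Neither pack nor unpack is described beyond its origin, so the following is only a sketch of the general round trip such helpers perform: serialize, compress, and encode on one side, then reverse the steps on the other. The referenced RLlib module relies on LZ4 for compression; `zlib` is swapped in here so the example needs only the standard library. `pack_sketch` and `unpack_sketch` are hypothetical names, and whether pytorchrl keeps the exact RLlib scheme is an assumption.

```python
import base64
import pickle
import zlib


def pack_sketch(data):
    """Serialize, compress and base64-encode a Python object (illustrative only)."""
    serialized = pickle.dumps(data)
    compressed = zlib.compress(serialized)  # stand-in for LZ4 compression
    return base64.b64encode(compressed).decode("ascii")


def unpack_sketch(packed):
    """Reverse pack_sketch: decode, decompress and deserialize."""
    compressed = base64.b64decode(packed)
    return pickle.loads(zlib.decompress(compressed))


# A small rollout-like payload; the keys are arbitrary placeholders.
rollout = {"obs": [[0.0] * 8 for _ in range(64)], "rew": [1.0] * 64}
packed = pack_sketch(rollout)
restored = unpack_sketch(packed)
```

As with any pickle-based scheme, unpacking should only be applied to data from trusted workers, since `pickle.loads` can execute arbitrary code.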