pytorchrl.scheme package

Subpackages

Submodules

pytorchrl.scheme.scheme module

class pytorchrl.scheme.scheme.Scheme(algo_factory, actor_factory, storage_factory, train_envs_factory, test_envs_factory=<function Scheme.<lambda>>, num_col_workers=1, col_compress_data=False, col_workers_communication='synchronous', col_workers_resources={'num_cpus': 1, 'num_gpus': 0.5}, col_preemption_thresholds={'fraction_samples': 1.0, 'fraction_workers': 1.0}, num_grad_workers=1, grad_compress_data=False, grad_workers_communication='synchronous', grad_workers_resources={'num_cpus': 1, 'num_gpus': 0.5}, local_device=None, decentralized_update_execution=False)

Bases: object

Defines a training scheme and handles the creation and operation of its workers.

Parameters
  • algo_factory (func) – A function that creates an algorithm class.

  • actor_factory (func) – A function that creates a policy.

  • storage_factory (func) – A function that creates a rollout storage.

  • train_envs_factory (func) – A function to create train environments.

  • test_envs_factory (func) – A function to create test environments.

  • num_col_workers (int) – Number of data collection workers per gradient worker.

  • col_workers_communication (str) – Communication coordination pattern for data collection.

  • col_workers_resources (dict) – Ray resource specs for collection remote workers.

  • col_preemption_thresholds (dict) – Minimum fraction of samples (fraction_samples, in [0, 1.0]) and minimum fraction of workers (fraction_workers, in [0, 1.0]) required in synchronous data collection.

  • num_grad_workers (int) – Number of gradient workers.

  • grad_workers_communication (str) – Communication coordination pattern for gradient computation workers.

  • grad_workers_resources (dict) – Ray resource specs for gradient remote workers.

  • local_device (str) – “cpu” or specific GPU “cuda:number” to use for computation.

  • decentralized_update_execution (bool) – Whether the gradients are applied in the update workers (central update) or broadcast to all gradient workers for a decentralized update.
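As a minimal sketch of how the keyword arguments above fit together, the following collects the defaults shown in the signature in a plain dictionary and checks that the preemption thresholds lie in [0, 1]. The helper `check_preemption_thresholds` and the name `scheme_kwargs` are hypothetical and not part of pytorchrl. The factory arguments are omitted because they depend on the chosen algorithm, actor, storage, and environments.

```python
def check_preemption_thresholds(thresholds):
    """Hypothetical helper (not part of pytorchrl): validate threshold fractions."""
    for key in ("fraction_samples", "fraction_workers"):
        value = thresholds[key]
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{key} must be in [0, 1], got {value}")
    return thresholds


# Keyword arguments mirroring the defaults shown in the Scheme signature.
scheme_kwargs = {
    "num_col_workers": 1,
    "col_workers_communication": "synchronous",
    "col_workers_resources": {"num_cpus": 1, "num_gpus": 0.5},
    "col_preemption_thresholds": check_preemption_thresholds(
        {"fraction_samples": 1.0, "fraction_workers": 1.0}
    ),
    "num_grad_workers": 1,
    "grad_workers_communication": "synchronous",
    "grad_workers_resources": {"num_cpus": 1, "num_gpus": 0.5},
    "local_device": None,
    "decentralized_update_execution": False,
}
```

Once the four required factories exist, the dictionary could be passed as `Scheme(algo_factory, actor_factory, storage_factory, train_envs_factory, **scheme_kwargs)`.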

get_agent_components()

Returns class names for each agent component.

update_worker()

Returns the local worker.

pytorchrl.scheme.utils module

pytorchrl.scheme.utils.average_gradients(grads_list)

Averages gradients coming from distributed workers.

Parameters

grads_list (list of lists of tensors) – List of actor gradients from different workers.

Returns

avg_grads – Averaged actor gradients.

Return type

list of tensors
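The averaging described above can be sketched as an element-wise mean over workers, one parameter at a time. This is an illustration only, not the library's implementation: NumPy arrays stand in for tensors, the name `average_gradients_sketch` is hypothetical, and the sketch assumes every worker returns a gradient of the same shape for every parameter (no missing entries).

```python
import numpy as np


def average_gradients_sketch(grads_list):
    """Element-wise mean of per-worker gradient lists (illustrative only).

    grads_list[w][p] is the gradient of parameter p computed by worker w.
    """
    # zip(*grads_list) groups the gradients of the same parameter across workers.
    return [np.mean(np.stack(per_param), axis=0) for per_param in zip(*grads_list)]


# Two workers, each reporting gradients for the same two parameters.
worker_a = [np.array([1.0, 2.0]), np.array([[0.0]])]
worker_b = [np.array([3.0, 4.0]), np.array([[2.0]])]
avg_grads = average_gradients_sketch([worker_a, worker_b])
```

The result has one entry per parameter, matching the documented return type of a flat list.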

pytorchrl.scheme.utils.broadcast_message(key, message)

pytorchrl.scheme.utils.check_message(key)

pytorchrl.scheme.utils.pack(data)

From https://github.com/ray-project/ray/blob/master/rllib/utils/compression.py

pytorchrl.scheme.utils.ray_get_and_free(object_ids)

Call ray.get and then queue the object ids for deletion. This function should be used whenever possible in RLlib, to optimize memory usage. The only exception is when an object_id is shared among multiple readers.

Adapted from https://github.com/ray-project/ray/blob/master/rllib/utils/memory.py

Parameters

object_ids (ObjectID|List[ObjectID]) – Object ids to fetch and free.

Returns

result – The result of ray.get(object_ids).

Return type

python objects
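The fetch-then-queue-for-deletion pattern described above can be sketched without Ray by injecting the fetch and free operations. Everything here is hypothetical: the class `GetAndFreeQueue`, its parameters, and the default thresholds are assumptions made for illustration. With Ray, `get_fn` would be `ray.get`, and `free_fn` would be whatever call releases objects in the installed Ray version.

```python
import time


class GetAndFreeQueue:
    """Hypothetical illustration of the fetch-then-queue-for-deletion pattern."""

    def __init__(self, get_fn, free_fn, max_queue_size=100, free_delay_s=10.0):
        self._get = get_fn
        self._free = free_fn
        self._max_queue_size = max_queue_size  # assumed threshold
        self._free_delay_s = free_delay_s      # assumed threshold
        self._to_free = []
        self._last_free_time = time.time()

    def get_and_free(self, object_ids):
        result = self._get(object_ids)
        # Accept a single id or a list of ids.
        ids = object_ids if isinstance(object_ids, list) else [object_ids]
        self._to_free.extend(ids)
        now = time.time()
        # Free in batches: when the queue is long enough or enough time has passed.
        if (len(self._to_free) > self._max_queue_size
                or now - self._last_free_time > self._free_delay_s):
            self._free(self._to_free)
            self._to_free = []
            self._last_free_time = now
        return result


# A dictionary stands in for the object store.
store = {"a": 1, "b": 2}
freed = []
queue = GetAndFreeQueue(
    get_fn=lambda ids: [store[i] for i in ids] if isinstance(ids, list) else store[ids],
    free_fn=freed.extend,
    max_queue_size=1,
)
first = queue.get_and_free("a")     # one id queued, nothing freed yet
second = queue.get_and_free(["b"])  # two ids queued, which exceeds 1, so both are freed
```

Batching the frees keeps the cost of deletion off the hot path. The exception noted above still applies: an id that other readers still need should not go through this queue.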

pytorchrl.scheme.utils.unpack(data)

From https://github.com/ray-project/ray/blob/master/rllib/utils/compression.py
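Neither pack nor unpack is described here beyond its origin in a compression utility. Assuming the two form a serialize-and-compress round trip, the idea can be sketched with the standard library alone. The names `pack_sketch` and `unpack_sketch` are hypothetical, and `zlib` is a stand-in chosen because it ships with Python; it is not claimed to be the codec the library uses.

```python
import base64
import pickle
import zlib


def pack_sketch(data):
    """Serialize, compress, and base64-encode a Python object (illustrative)."""
    return base64.b64encode(zlib.compress(pickle.dumps(data))).decode("ascii")


def unpack_sketch(packed):
    """Invert pack_sketch. Only unpickle data from trusted sources."""
    return pickle.loads(zlib.decompress(base64.b64decode(packed)))


# Round trip on a small stand-in for collected rollout data.
rollout = {"obs": [0.1, 0.2, 0.3], "rew": [1.0, 0.0, 1.0]}
packed = pack_sketch(rollout)
restored = unpack_sketch(packed)
```

Encoding the compressed bytes as ASCII text makes the payload safe to embed in messages that expect strings, at the cost of roughly a third more bytes than the raw compressed form.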

Module contents