pytorchrl.scheme.gradients package
Submodules
pytorchrl.scheme.gradients.g_worker module
- class pytorchrl.scheme.gradients.g_worker.CollectorThread(index_worker, local_worker, remote_workers, col_fraction_workers=1.0, col_communication='synchronous', col_execution='Central', broadcast_interval=1)[source]
Bases:
threading.ThreadThis class receives data samples from the data collection workers and queues them into the data input_queue.
- Parameters
index_worker (int) – Index assigned to this worker.
input_queue (queue.Queue) – Queue to store the data dicts received from data collection workers.
local_worker (Worker) – Local worker that acts as a parameter server.
remote_workers (list of Workers) – Set of workers collecting and sending rollouts.
col_fraction_workers (float) – Minimum fraction of samples required to stop if collection is synchronously coordinated and most workers have finished their collection task.
col_communication (str) – Communication coordination pattern for data collection.
col_execution (str) – Execution patterns for data collection.
broadcast_interval (int) – After how many central updates, model weights should be broadcasted to remote collection workers.
- stopped
Whether or not the thread in running.
- Type
bool
- queue
Queue to store the data dicts received from data collection workers.
- Type
queue.Queue
- index_worker
Index assigned to this worker.
- Type
int
- remote_workers
col_workers remote data collection workers.
- Type
List of CWorker’s
- num_workers
Number of collection remote workers.
- Type
int
- broadcast_interval
After how many collection step model weights should be broadcasted to remote collection workers.
- Type
int
- num_sent_since_broadcast
Number of data dicts received since last model weights were broadcasted.
- Type
int
- run()[source]
Method representing the thread’s activity.
You may override this method in a subclass. The standard run() method invokes the callable object passed to the object’s constructor as the target argument, if any, with sequential and keyword arguments taken from the args and kwargs arguments, respectively.
- class pytorchrl.scheme.gradients.g_worker.GWorker(index_worker, col_workers_factory, col_communication='synchronous', compress_grads_to_send=False, col_execution='Central', col_fraction_workers=1.0, initial_weights=None, device=None)[source]
Bases:
pytorchrl.scheme.base.worker.WorkerWorker class handling gradient computation.
This class wraps an actor instance, a storage class instance and a worker set of remote data collection workers. It receives data from the collection workers and computes gradients following a logic defined in function self.step(), which will be called from the Learner class.
- Parameters
index_worker (int) – Worker index.
col_workers_factory (func) – A function that creates a set of data collection workers.
col_communication (str) – Communication coordination pattern for data collection.
compress_grads_to_send (bool) – Whether or not to compress gradients before sending then to the update worker.
col_execution (str) – Execution patterns for data collection.
col_fraction_workers (float) – Minimum fraction of samples required to stop if collection is synchronously coordinated and most workers have finished their collection task.
device (str) – “cpu” or specific GPU “cuda:number`” to use for computation.
initial_weights (ray object ID) – Initial model weights.
- index_worker
Index assigned to this worker.
- Type
int
- iter
Number of times gradients have been computed and sent.
- Type
int
- col_communication
Communication coordination pattern for data collection.
- Type
str
- compress_grads_to_send
Whether or not to compress gradients before sending then to the update worker.
- Type
bool
- col_workers
A CWorkerSet class instance.
- Type
- remote_workers
col_workers remote data collection workers.
- Type
List of CWorker’s
- algo
An algorithm class instance.
- Type
Algo
- inqueue
Input queue where incoming collected samples are placed.
- Type
queue.Queue
- collector
Class handling data collection via col_workers and placing incoming rollouts into the input queue inqueue.
- Type
- property actor_version
Number of times Actor has been updated.
- compute_gradients(batch, distribute_gradients)[source]
Calculate actor gradients and update networks.
- Parameters
batch (dict) – data batch containing all required tensors to compute algo loss.
distribute_gradients (bool) – If True, gradients will be directly shared across remote workers and optimization steps will executed in a decentralised way.
- Returns
grads (list of tensors) – List of actor gradients.
info (dict) – Summary dict with relevant gradient-related information.
- get_grads(distribute_gradients=False)[source]
Perform a gradient computation step.
- Parameters
distribute_gradients (bool) – If True, gradients will be directly shared across remote workers and optimization steps will executed in a decentralised way.
- Returns
grads (list of tensors) – List of actor gradients.
info (dict) – Summary dict of relevant gradient operation information.
- save_model(fname)[source]
Save current version of actor as a torch loadable checkpoint.
- Parameters
fname (str) – Filename given to the checkpoint.
- Returns
save_name – Path to saved file.
- Return type
str
- set_weights(actor_weights)[source]
Update the worker actor version with provided weights.
- weightsdict of tensors
Dict containing actor weights to be set.
- step(distribute_gradients=False)[source]
Pulls data from self.collector.queue, then perform a gradient computation step.
- Parameters
distribute_gradients (bool) – If True, gradients will be directly shared across remote workers and optimization steps will executed in a decentralised way.
- Returns
grads (list of tensors) – List of actor gradients.
info (dict) – Summary dict of relevant gradient operation information.
pytorchrl.scheme.gradients.g_worker_set module
- class pytorchrl.scheme.gradients.g_worker_set.GWorkerSet(num_workers, index_parent, local_device, col_execution, col_communication, col_workers_factory, col_fraction_workers, grad_worker_resources, compress_grads_to_send=False)[source]
Bases:
pytorchrl.scheme.base.worker_set.WorkerSetClass to better handle the operations of ensembles of GWorkers.
- Parameters
num_workers (int) – Number of remote workers in the worker set.
index_parent (int) – Worker index of parent gradient worker.
local_device (str) – “cpu” or specific GPU “cuda:number`” to use for computation.
col_execution (str) – Execution patterns for data collection.
col_communication (str) – Communication coordination pattern for data collection.
col_workers_factory (func) – A function that creates a set of data collection workers.
compress_grads_to_send (bool) – Whether or not to compress gradients before sending then to the update worker.
col_fraction_workers (float) – Minimum fraction of samples required to stop if collection is synchronously coordinated and most workers have finished their collection task.
grad_worker_resources (dict) – Ray resource specs for the remote workers.
initial_weights (ray object ID) – Initial model weights.
- worker_class
Worker class to be instantiated to create Ray remote actors.
- Type
python class
- remote_config
Ray resource specs for the remote workers.
- Type
dict
- worker_params
Keyword arguments of the worker_class.
- Type
dict
- num_workers
Number of remote workers in the worker set.
- Type
int
- classmethod create_factory(num_workers, col_workers_factory, col_fraction_workers=1.0, col_execution='Central', col_communication='synchronous', compress_grads_to_send=False, grad_worker_resources={'memory': 5368709120, 'num_cpus': 1, 'num_gpus': 0.2, 'object_store_memory': 2147483648})[source]
Returns a function to create new CWorkerSet instances.
- Parameters
num_workers (int) – Number of remote workers in the worker set.
col_execution (str) – Execution patterns for data collection.
col_communication (str) – Communication coordination pattern for data collection.
col_workers_factory (func) – A function that creates a set of data collection workers.
col_fraction_workers (float) – Minimum fraction of samples required to stop if collection is synchronously coordinated and most workers have finished their collection task.
compress_grads_to_send (bool) – Whether or not to compress gradients before sending then to the update worker.
grad_worker_resources (dict) – Ray resource specs for the remote workers.
- Returns
grad_worker_set_factory – creates a new GWorkerSet class instance.
- Return type
func