pytorchrl.scheme.gradients package

Submodules

pytorchrl.scheme.gradients.g_worker module

class pytorchrl.scheme.gradients.g_worker.CollectorThread(index_worker, local_worker, remote_workers, col_fraction_workers=1.0, col_communication='synchronous', col_execution='Central', broadcast_interval=1)[source]

Bases: threading.Thread

This class receives data samples from the data collection workers and queues them into the data input_queue.

Parameters
  • index_worker (int) – Index assigned to this worker.

  • input_queue (queue.Queue) – Queue to store the data dicts received from data collection workers.

  • local_worker (Worker) – Local worker that acts as a parameter server.

  • remote_workers (list of Workers) – Set of workers collecting and sending rollouts.

  • col_fraction_workers (float) – Minimum fraction of samples required to stop if collection is synchronously coordinated and most workers have finished their collection task.

  • col_communication (str) – Communication coordination pattern for data collection.

  • col_execution (str) – Execution patterns for data collection.

  • broadcast_interval (int) – After how many central updates, model weights should be broadcasted to remote collection workers.

stopped

Whether or not the thread in running.

Type

bool

queue

Queue to store the data dicts received from data collection workers.

Type

queue.Queue

index_worker

Index assigned to this worker.

Type

int

local_worker

col_workers local worker.

Type

CWorker

remote_workers

col_workers remote data collection workers.

Type

List of CWorker’s

num_workers

Number of collection remote workers.

Type

int

broadcast_interval

After how many collection step model weights should be broadcasted to remote collection workers.

Type

int

num_sent_since_broadcast

Number of data dicts received since last model weights were broadcasted.

Type

int

broadcast_new_weights()[source]

Broadcast a new set of weights from the local worker.

run()[source]

Method representing the thread’s activity.

You may override this method in a subclass. The standard run() method invokes the callable object passed to the object’s constructor as the target argument, if any, with sequential and keyword arguments taken from the args and kwargs arguments, respectively.

should_broadcast()[source]

Returns whether broadcast() should be called to update weights.

step()[source]

Collects data from remote workers and puts it in the GWorker queue.

class pytorchrl.scheme.gradients.g_worker.GWorker(index_worker, col_workers_factory, col_communication='synchronous', compress_grads_to_send=False, col_execution='Central', col_fraction_workers=1.0, initial_weights=None, device=None)[source]

Bases: pytorchrl.scheme.base.worker.Worker

Worker class handling gradient computation.

This class wraps an actor instance, a storage class instance and a worker set of remote data collection workers. It receives data from the collection workers and computes gradients following a logic defined in function self.step(), which will be called from the Learner class.

Parameters
  • index_worker (int) – Worker index.

  • col_workers_factory (func) – A function that creates a set of data collection workers.

  • col_communication (str) – Communication coordination pattern for data collection.

  • compress_grads_to_send (bool) – Whether or not to compress gradients before sending then to the update worker.

  • col_execution (str) – Execution patterns for data collection.

  • col_fraction_workers (float) – Minimum fraction of samples required to stop if collection is synchronously coordinated and most workers have finished their collection task.

  • device (str) – “cpu” or specific GPU “cuda:number`” to use for computation.

  • initial_weights (ray object ID) – Initial model weights.

index_worker

Index assigned to this worker.

Type

int

iter

Number of times gradients have been computed and sent.

Type

int

col_communication

Communication coordination pattern for data collection.

Type

str

compress_grads_to_send

Whether or not to compress gradients before sending then to the update worker.

Type

bool

col_workers

A CWorkerSet class instance.

Type

CWorkerSet

local_worker

col_workers local worker.

Type

CWorker

remote_workers

col_workers remote data collection workers.

Type

List of CWorker’s

actor

An actor class instance.

Type

Actor

algo

An algorithm class instance.

Type

Algo

storage

A Storage class instance.

Type

Storage

inqueue

Input queue where incoming collected samples are placed.

Type

queue.Queue

collector

Class handling data collection via col_workers and placing incoming rollouts into the input queue inqueue.

Type

CollectorThread

property actor_version

Number of times Actor has been updated.

apply_gradients(gradients=None)[source]

Update Actor Critic model

compute_gradients(batch, distribute_gradients)[source]

Calculate actor gradients and update networks.

Parameters
  • batch (dict) – data batch containing all required tensors to compute algo loss.

  • distribute_gradients (bool) – If True, gradients will be directly shared across remote workers and optimization steps will executed in a decentralised way.

Returns

  • grads (list of tensors) – List of actor gradients.

  • info (dict) – Summary dict with relevant gradient-related information.

get_data()[source]

Pulls data from self.collector.queue and prepares batches to compute gradients.

get_grads(distribute_gradients=False)[source]

Perform a gradient computation step.

Parameters

distribute_gradients (bool) – If True, gradients will be directly shared across remote workers and optimization steps will executed in a decentralised way.

Returns

  • grads (list of tensors) – List of actor gradients.

  • info (dict) – Summary dict of relevant gradient operation information.

save_model(fname)[source]

Save current version of actor as a torch loadable checkpoint.

Parameters

fname (str) – Filename given to the checkpoint.

Returns

save_name – Path to saved file.

Return type

str

set_weights(actor_weights)[source]

Update the worker actor version with provided weights.

weightsdict of tensors

Dict containing actor weights to be set.

step(distribute_gradients=False)[source]

Pulls data from self.collector.queue, then perform a gradient computation step.

Parameters

distribute_gradients (bool) – If True, gradients will be directly shared across remote workers and optimization steps will executed in a decentralised way.

Returns

  • grads (list of tensors) – List of actor gradients.

  • info (dict) – Summary dict of relevant gradient operation information.

stop()[source]

Stop collecting data.

update_algorithm_parameter(parameter_name, new_parameter_value)[source]

If parameter_name is an attribute of Worker.algo, change its value to new_parameter_value value.

Parameters

parameter_name (str) – Algorithm attribute name

pytorchrl.scheme.gradients.g_worker_set module

class pytorchrl.scheme.gradients.g_worker_set.GWorkerSet(num_workers, index_parent, local_device, col_execution, col_communication, col_workers_factory, col_fraction_workers, grad_worker_resources, compress_grads_to_send=False)[source]

Bases: pytorchrl.scheme.base.worker_set.WorkerSet

Class to better handle the operations of ensembles of GWorkers.

Parameters
  • num_workers (int) – Number of remote workers in the worker set.

  • index_parent (int) – Worker index of parent gradient worker.

  • local_device (str) – “cpu” or specific GPU “cuda:number`” to use for computation.

  • col_execution (str) – Execution patterns for data collection.

  • col_communication (str) – Communication coordination pattern for data collection.

  • col_workers_factory (func) – A function that creates a set of data collection workers.

  • compress_grads_to_send (bool) – Whether or not to compress gradients before sending then to the update worker.

  • col_fraction_workers (float) – Minimum fraction of samples required to stop if collection is synchronously coordinated and most workers have finished their collection task.

  • grad_worker_resources (dict) – Ray resource specs for the remote workers.

  • initial_weights (ray object ID) – Initial model weights.

worker_class

Worker class to be instantiated to create Ray remote actors.

Type

python class

remote_config

Ray resource specs for the remote workers.

Type

dict

worker_params

Keyword arguments of the worker_class.

Type

dict

num_workers

Number of remote workers in the worker set.

Type

int

classmethod create_factory(num_workers, col_workers_factory, col_fraction_workers=1.0, col_execution='Central', col_communication='synchronous', compress_grads_to_send=False, grad_worker_resources={'memory': 5368709120, 'num_cpus': 1, 'num_gpus': 0.2, 'object_store_memory': 2147483648})[source]

Returns a function to create new CWorkerSet instances.

Parameters
  • num_workers (int) – Number of remote workers in the worker set.

  • col_execution (str) – Execution patterns for data collection.

  • col_communication (str) – Communication coordination pattern for data collection.

  • col_workers_factory (func) – A function that creates a set of data collection workers.

  • col_fraction_workers (float) – Minimum fraction of samples required to stop if collection is synchronously coordinated and most workers have finished their collection task.

  • compress_grads_to_send (bool) – Whether or not to compress gradients before sending then to the update worker.

  • grad_worker_resources (dict) – Ray resource specs for the remote workers.

Returns

grad_worker_set_factory – creates a new GWorkerSet class instance.

Return type

func

Module contents