pytorchrl.scheme.gradients package

Submodules

pytorchrl.scheme.gradients.g_worker module

class pytorchrl.scheme.gradients.g_worker.CollectorThread(index_worker, local_worker, remote_workers, col_fraction_workers=1.0, col_communication='synchronous', col_execution='Central', broadcast_interval=1)[source]

Bases: threading.Thread

This class receives data samples from the data collection workers and queues them into the data input_queue.

Parameters

index_worker (int) – Index assigned to this worker.
input_queue (queue.Queue) – Queue to store the data dicts received from data collection workers.
local_worker (Worker) – Local worker that acts as a parameter server.
remote_workers (list of Workers) – Set of workers collecting and sending rollouts.
col_fraction_workers (float) – Minimum fraction of samples required to stop if collection is synchronously coordinated and most workers have finished their collection task.
col_communication (str) – Communication coordination pattern for data collection.
col_execution (str) – Execution patterns for data collection.
broadcast_interval (int) – After how many central updates, model weights should be broadcasted to remote collection workers.

stopped

Whether or not the thread in running.

Type: bool

queue

Queue to store the data dicts received from data collection workers.

Type: queue.Queue

index_worker

Index assigned to this worker.

Type: int

local_worker

col_workers local worker.

Type: CWorker

remote_workers

col_workers remote data collection workers.

Type: List of CWorker’s

num_workers

Number of collection remote workers.

Type: int

broadcast_interval

After how many collection step model weights should be broadcasted to remote collection workers.

Type: int

num_sent_since_broadcast

Number of data dicts received since last model weights were broadcasted.

Type: int

broadcast_new_weights()[source]: Broadcast a new set of weights from the local worker.

run()[source]

Method representing the thread’s activity.

You may override this method in a subclass. The standard run() method invokes the callable object passed to the object’s constructor as the target argument, if any, with sequential and keyword arguments taken from the args and kwargs arguments, respectively.

should_broadcast()[source]: Returns whether broadcast() should be called to update weights.

step()[source]: Collects data from remote workers and puts it in the GWorker queue.

class pytorchrl.scheme.gradients.g_worker.GWorker(index_worker, col_workers_factory, col_communication='synchronous', compress_grads_to_send=False, col_execution='Central', col_fraction_workers=1.0, initial_weights=None, device=None)[source]

Bases: pytorchrl.scheme.base.worker.Worker

Worker class handling gradient computation.

This class wraps an actor instance, a storage class instance and a worker set of remote data collection workers. It receives data from the collection workers and computes gradients following a logic defined in function self.step(), which will be called from the Learner class.

Parameters

index_worker (int) – Worker index.
col_workers_factory (func) – A function that creates a set of data collection workers.
col_communication (str) – Communication coordination pattern for data collection.
compress_grads_to_send (bool) – Whether or not to compress gradients before sending then to the update worker.
col_execution (str) – Execution patterns for data collection.
col_fraction_workers (float) – Minimum fraction of samples required to stop if collection is synchronously coordinated and most workers have finished their collection task.
device (str) – “cpu” or specific GPU “cuda:number`” to use for computation.
initial_weights (ray object ID) – Initial model weights.

index_worker

Index assigned to this worker.

Type: int

iter

Number of times gradients have been computed and sent.

Type: int

col_communication

Communication coordination pattern for data collection.

Type: str

compress_grads_to_send

Whether or not to compress gradients before sending then to the update worker.

Type: bool

col_workers

A CWorkerSet class instance.

Type: CWorkerSet

local_worker

col_workers local worker.

Type: CWorker

remote_workers

col_workers remote data collection workers.

Type: List of CWorker’s

actor

An actor class instance.

Type: Actor

algo

An algorithm class instance.

Type: Algo

storage

A Storage class instance.

Type: Storage

inqueue

Input queue where incoming collected samples are placed.

Type: queue.Queue

collector

Class handling data collection via col_workers and placing incoming rollouts into the input queue inqueue.

Type: CollectorThread

property actor_version: Number of times Actor has been updated.

apply_gradients(gradients=None)[source]: Update Actor Critic model

compute_gradients(batch, distribute_gradients)[source]

Calculate actor gradients and update networks.

Parameters

batch (dict) – data batch containing all required tensors to compute algo loss.
distribute_gradients (bool) – If True, gradients will be directly shared across remote workers and optimization steps will executed in a decentralised way.

Returns

grads (list of tensors) – List of actor gradients.
info (dict) – Summary dict with relevant gradient-related information.

get_data()[source]: Pulls data from self.collector.queue and prepares batches to compute gradients.

get_grads(distribute_gradients=False)[source]

Perform a gradient computation step.

Parameters

distribute_gradients (bool) – If True, gradients will be directly shared across remote workers and optimization steps will executed in a decentralised way.

Returns

grads (list of tensors) – List of actor gradients.
info (dict) – Summary dict of relevant gradient operation information.

save_model(fname)[source]

Save current version of actor as a torch loadable checkpoint.

Parameters: fname (str) – Filename given to the checkpoint.
Returns: save_name – Path to saved file.
Return type: str

set_weights(actor_weights)[source]

Update the worker actor version with provided weights.

weightsdict of tensors: Dict containing actor weights to be set.

step(distribute_gradients=False)[source]

Pulls data from self.collector.queue, then perform a gradient computation step.

Parameters

distribute_gradients (bool) – If True, gradients will be directly shared across remote workers and optimization steps will executed in a decentralised way.

Returns

grads (list of tensors) – List of actor gradients.
info (dict) – Summary dict of relevant gradient operation information.

stop()[source]: Stop collecting data.

update_algorithm_parameter(parameter_name, new_parameter_value)[source]

If parameter_name is an attribute of Worker.algo, change its value to new_parameter_value value.

Parameters: parameter_name (str) – Algorithm attribute name

pytorchrl.scheme.gradients.g_worker_set module

class pytorchrl.scheme.gradients.g_worker_set.GWorkerSet(num_workers, index_parent, local_device, col_execution, col_communication, col_workers_factory, col_fraction_workers, grad_worker_resources, compress_grads_to_send=False)[source]

Bases: pytorchrl.scheme.base.worker_set.WorkerSet

Class to better handle the operations of ensembles of GWorkers.

Parameters

num_workers (int) – Number of remote workers in the worker set.
index_parent (int) – Worker index of parent gradient worker.
local_device (str) – “cpu” or specific GPU “cuda:number`” to use for computation.
col_execution (str) – Execution patterns for data collection.
col_communication (str) – Communication coordination pattern for data collection.
col_workers_factory (func) – A function that creates a set of data collection workers.
compress_grads_to_send (bool) – Whether or not to compress gradients before sending then to the update worker.
col_fraction_workers (float) – Minimum fraction of samples required to stop if collection is synchronously coordinated and most workers have finished their collection task.
grad_worker_resources (dict) – Ray resource specs for the remote workers.
initial_weights (ray object ID) – Initial model weights.

worker_class

Worker class to be instantiated to create Ray remote actors.

Type: python class

remote_config

Ray resource specs for the remote workers.

Type: dict

worker_params

Keyword arguments of the worker_class.

Type: dict

num_workers

Number of remote workers in the worker set.

Type: int

classmethod create_factory(num_workers, col_workers_factory, col_fraction_workers=1.0, col_execution='Central', col_communication='synchronous', compress_grads_to_send=False, grad_worker_resources={'memory': 5368709120, 'num_cpus': 1, 'num_gpus': 0.2, 'object_store_memory': 2147483648})[source]

Returns a function to create new CWorkerSet instances.

Parameters

num_workers (int) – Number of remote workers in the worker set.
col_execution (str) – Execution patterns for data collection.
col_communication (str) – Communication coordination pattern for data collection.
col_workers_factory (func) – A function that creates a set of data collection workers.
col_fraction_workers (float) – Minimum fraction of samples required to stop if collection is synchronously coordinated and most workers have finished their collection task.
compress_grads_to_send (bool) – Whether or not to compress gradients before sending then to the update worker.
grad_worker_resources (dict) – Ray resource specs for the remote workers.

Returns

grad_worker_set_factory – creates a new GWorkerSet class instance.

Return type

func

pytorchrl.scheme.gradients package

Submodules

pytorchrl.scheme.gradients.g_worker module

pytorchrl.scheme.gradients.g_worker_set module

Module contents