pytorchrl.scheme.base package

Submodules

pytorchrl.scheme.base.utils module

pytorchrl.scheme.base.utils.find_free_port()

Returns a free port on the current node.

Adapted from https://github.com/ray-project/ray/blob/master/python/ray/util/sgd/utils.py
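A common way to obtain a free port is to bind a socket to port 0 and let the operating system choose one. The sketch below illustrates that general technique with the standard library only. It is not necessarily the exact body of find_free_port(), and the name find_free_port_sketch is hypothetical.

```python
import socket
from contextlib import closing


def find_free_port_sketch():
    """Ask the OS for an unused TCP port and return its number."""
    with closing(socket.socket(socket.AF_INET, socket.SOCK_STREAM)) as s:
        # Binding to port 0 lets the operating system pick a free port.
        s.bind(("", 0))
        # For AF_INET sockets, getsockname() returns a (host, port) pair.
        return s.getsockname()[1]


port = find_free_port_sketch()
print(port)
```

The port is released as soon as the socket is closed, so another process could in principle claim it before the caller binds to it. Callers that need a guarantee should bind promptly or retry on failure.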

pytorchrl.scheme.base.worker module

class pytorchrl.scheme.base.worker.Worker(index_worker)

Bases: object

Class containing common worker functionality.

Parameters

index_worker (int) – Worker index.

index_worker

Index assigned to this worker.

Type

int

actor

An actor class instance.

Type

nn.Module

classmethod as_remote(num_cpus=None, num_gpus=None, memory=None, object_store_memory=None, resources=None)

Creates a remote Ray actor version of the Worker class.

Parameters
  • num_cpus (int) – The quantity of CPU cores to reserve for this Worker class.

  • num_gpus (float) – The quantity of GPUs to reserve for this Worker class.

  • memory (int) – The heap memory quota for this actor (in bytes).

  • object_store_memory (int) – The object store memory quota for this actor (in bytes).

  • resources (Dict[str, float]) – The default resources required by the actor creation task.

Returns

W – A ray remote actor Worker class.

Return type

Worker

find_free_port()

Returns a free port on the current node.

static get_host()

Returns the name of the node on which this Worker is running.

get_node_ip()

Returns the IP address of the current node.
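For orientation, the kind of information returned by get_host() and get_node_ip() can be obtained with the standard library as sketched below. This is an illustration under assumptions, not the Worker implementation: the function names are hypothetical, and the UDP "connect" trick is only one common way to discover the outbound address.

```python
import socket


def get_host_sketch():
    """Return the hostname of the machine running this process."""
    return socket.gethostname()


def get_node_ip_sketch():
    """Best-effort lookup of this machine's outbound IPv4 address."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # connect() on a UDP socket sends no packets; it only selects a
        # route, which reveals the local address that would be used.
        s.connect(("8.8.8.8", 80))
        return s.getsockname()[0]
    except OSError:
        # No usable route (for example, an offline machine): use loopback.
        return "127.0.0.1"
    finally:
        s.close()


host = get_host_sketch()
ip = get_node_ip_sketch()
print(host, ip)
```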

get_weights()

Returns the current weights of the actor, as given by actor.state_dict().

print_worker_info()

Prints information about this worker, including its index and assigned resources.

setup_torch_data_parallel(url, rank, world_size, backend)

Join a torch process group for distributed SGD.

Parameters
  • url (str) – URL specifying how to initialize the process group.

  • rank (int) – Rank of the current process.

  • world_size (int) – Number of processes participating in the job.

  • backend (str) – The PyTorch distributed backend to use. Valid values include mpi, gloo, and nccl.
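The parameters above correspond to the arguments of torch.distributed.init_process_group. The sketch below shows how such a call might be assembled, assuming a TCP-style init URL. The helper names make_init_url and join_process_group are hypothetical, and the URL scheme actually used by this library is not stated in this reference.

```python
def make_init_url(ip, port):
    """Build a TCP init URL of the form accepted by torch.distributed."""
    return "tcp://{}:{}".format(ip, port)


def join_process_group(url, rank, world_size, backend="gloo"):
    """Join a torch process group; requires PyTorch to be installed."""
    # Imported lazily so the URL helper above works without PyTorch.
    import torch.distributed as dist

    dist.init_process_group(
        backend=backend, init_method=url, rank=rank, world_size=world_size
    )


# Example address and port, chosen for illustration only.
url = make_init_url("127.0.0.1", 29500)
print(url)
```

Every process in the group must call the join function with the same URL and world_size but a distinct rank in the range 0 to world_size - 1.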

terminate_worker()

Terminates this Ray actor.

pytorchrl.scheme.base.worker_set module

class pytorchrl.scheme.base.worker_set.WorkerSet(worker, worker_params, index_parent_worker, worker_remote_config={'memory': 5368709120, 'num_cpus': 1, 'num_gpus': 0.2, 'object_store_memory': 2147483648}, num_workers=1, local_device=None, initial_weights=None, add_local_worker=True, total_parent_workers=None)

Bases: object

Class to better handle the operations of ensembles of Workers. Contains common functionality across all worker sets.

Parameters
  • worker (func) – A function that creates a worker class.

  • worker_params (dict) – Worker class kwargs.

  • worker_remote_config (dict) – Ray resource specs for the remote workers.

  • num_workers (int) – Number of worker replicas in the worker set.

  • add_local_worker (bool) – Whether to include a non-remote (local) worker in the worker set.
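The default worker_remote_config in the signature expresses its memory quotas in bytes. The sketch below rebuilds the same dictionary from more readable units and derives a variant. The names default_remote_config and cpu_only_config are hypothetical, and the assumption that these keys are forwarded to Worker.as_remote is inferred from the matching parameter names rather than stated in this reference.

```python
GIB = 1024 ** 3  # bytes in one gibibyte

# Same values as the documented default for worker_remote_config.
default_remote_config = {
    "num_cpus": 1,
    "num_gpus": 0.2,
    "memory": 5 * GIB,               # 5368709120 bytes
    "object_store_memory": 2 * GIB,  # 2147483648 bytes
}

# A hypothetical CPU-only variant with a smaller heap quota.
cpu_only_config = dict(default_remote_config, num_gpus=0, memory=2 * GIB)

print(default_remote_config["memory"])
print(cpu_only_config)
```

Ray accepts fractional GPU requests, so a value of 0.2 allows up to five such workers to be scheduled on one GPU.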

worker_class

Worker class to be instantiated to create Ray remote actors.

Type

python class

remote_config

Ray resource specs for the remote workers.

Type

dict

worker_params

Keyword arguments of the worker_class.

Type

dict

num_workers

Number of remote workers in the worker set.

Type

int

add_workers(num_workers)

Create and add a number of remote workers to this worker set.

Parameters

num_workers (int) – Number of remote workers to create.

local_worker()

Returns the local worker.

remote_workers()

Returns the list of remote workers.

stop()

Stops all remote workers.

Module contents