pytorchrl.agent.env package

Subpackages

pytorchrl.agent.env.openai_baselines_dependencies package

Submodules

pytorchrl.agent.env.env_wrappers module

class pytorchrl.agent.env.env_wrappers.TransposeImagesIfRequired(env=None, op=[2, 0, 1])

Bases: gym.core.ObservationWrapper

When environment observations are images, this wrapper transposes their axes. It is useful when the images have shape (W,H,C), as they can be transposed “on the fly” to (C,W,H), the channels-first layout expected by PyTorch convolutions.

Parameters

env (gym.Env) – Original Gym environment, before the wrapper is applied.
op (list) – New axis ordering.

observation(ob): Transpose observation
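
The effect of the default op=[2, 0, 1] can be illustrated without Gym or NumPy. The following is a minimal sketch, not the wrapper's implementation: the helper names transposed_shape and transpose_channels_first are hypothetical, and it assumes op follows the usual axis-permutation convention in which output axis i is taken from input axis op[i] (as in numpy.transpose).

```python
def transposed_shape(shape, op=(2, 0, 1)):
    """Shape after permuting axes: output axis i comes from input axis op[i]."""
    return tuple(shape[axis] for axis in op)


def transpose_channels_first(image):
    """Permute a nested-list image from (A, B, C) to (C, A, B), i.e. op=[2, 0, 1]."""
    dim_a, dim_b, dim_c = len(image), len(image[0]), len(image[0][0])
    return [
        [[image[a][b][c] for b in range(dim_b)] for a in range(dim_a)]
        for c in range(dim_c)
    ]


# A tiny 2 x 3 image with 4 channels; each value encodes its own index.
image = [[[(a, b, c) for c in range(4)] for b in range(3)] for a in range(2)]
channels_first = transpose_channels_first(image)

# The channel axis is now the leading axis.
print(transposed_shape((2, 3, 4)))
```

With this convention, an observation space of shape (84, 84, 3) would become (3, 84, 84) after the wrapper is applied.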

pytorchrl.agent.env.make_env module

pytorchrl.agent.env.make_env.make_env(env_fn, env_kwargs, index_col_worker, index_grad_worker, index_env, log_dir=None, info_keywords=(), mode='train')

Returns a function that handles the creation of a single environment, so it can be executed in an independent thread.

Parameters

env_fn (func) – Function to create the environment.
env_kwargs (dict) – keyword arguments of env_fn.
log_dir (str) – Target path for bench.Monitor logger values.
info_keywords (tuple) – Information keywords to be logged and stored by bench.Monitor.
index_col_worker (int) – Index of the data collection worker running this environment.
index_grad_worker (int) – Index of the gradient worker running this environment.
index_env (int) – Index of this environment within the vector of environments.
mode (str) – “train” or “test”

Returns

_thunk – A function to create and return the environment. It also sets up the Monitor logging and uses a TransposeImage wrapper if environment observations are images.

Return type

func
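
The deferred-creation ("thunk") pattern described above can be sketched in plain Python. This is an illustration under stated assumptions, not the library's code: make_env_sketch and toy_env_fn are hypothetical names, the environment is a plain dictionary, and the Monitor logging and image transposition steps are omitted.

```python
def make_env_sketch(env_fn, env_kwargs, index_env, mode="train"):
    """Return a zero-argument function that builds one environment when called."""

    def _thunk():
        # Nothing is created until the thunk is called, so the thunk can be
        # handed to another thread or process and executed there.
        env = env_fn(**env_kwargs)
        # Hypothetical bookkeeping standing in for the real setup steps
        # (Monitor logging, image transposition), which this sketch omits.
        env["index_env"] = index_env
        env["mode"] = mode
        return env

    return _thunk


def toy_env_fn(name="toy"):
    """Hypothetical stand-in for an environment constructor."""
    return {"name": name}


# Build three thunks first, then create the environments on demand.
thunks = [make_env_sketch(toy_env_fn, {"name": "toy"}, index_env=i) for i in range(3)]
envs = [thunk() for thunk in thunks]
```

Returning a function rather than an environment keeps construction cheap for the caller and lets each worker build its own independent instance.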

pytorchrl.agent.env.vec_env module

class pytorchrl.agent.env.vec_env.VecEnv

Bases: object

Class to handle creation of environment vectors

classmethod create_factory(env_fn, env_kwargs={}, vec_env_size=1, log_dir=None, info_keywords=())

Returns a function to create a vector of environments of size vec_env_size, so it can be executed by any worker, remote or not.

Parameters

env_fn (func) – Function to create the environment.
env_kwargs (dict) – keyword arguments of env_fn.
vec_env_size (int) – size of the vector of environments.
log_dir (str) – Target path for envs to log information through bench.Monitor class.
info_keywords (tuple) – Information keywords to be logged and stored by bench.Monitor class.

Returns

make_vec_env (func) – Function to create a vector of environments.
dummy_env.action_space (gym.Space) – Action space of the environments.
dummy_env.observation_space (gym.Space) – Observation space of the environments.
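
The three return values can be pictured with a small stand-alone sketch. It is a guess at the general shape suggested by the names above (a deferred constructor plus the spaces read from a throwaway environment), not the actual implementation: create_factory_sketch and ToyEnv are hypothetical, the spaces are plain tuples rather than gym.Space objects, and logging is omitted.

```python
class ToyEnv:
    """Hypothetical stand-in for a Gym environment."""

    def __init__(self, size=4):
        self.action_space = ("Discrete", 2)
        self.observation_space = ("Box", (size,))


def create_factory_sketch(env_fn, env_kwargs=None, vec_env_size=1):
    """Return (make_vec_env, action_space, observation_space)."""
    env_kwargs = env_kwargs or {}

    def make_vec_env():
        # Deferred: the worker that calls this builds its own environments.
        return [env_fn(**env_kwargs) for _ in range(vec_env_size)]

    # A throwaway environment is built once, only to read the spaces.
    dummy_env = env_fn(**env_kwargs)
    return make_vec_env, dummy_env.action_space, dummy_env.observation_space


make_vec_env, action_space, observation_space = create_factory_sketch(
    ToyEnv, {"size": 8}, vec_env_size=3
)
vec_env = make_vec_env()
```

The sketch uses None rather than {} as the default for env_kwargs so that a single dictionary is not shared across calls.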

pytorchrl.agent.env.vector_wrappers module

class pytorchrl.agent.env.vector_wrappers.VecPyTorch(venv, device)

Bases: pytorchrl.agent.env.openai_baselines_dependencies.vec_envs.vec_env_base.VecEnvWrapper

This wrapper converts observations, rewards and dones from NumPy arrays to PyTorch tensors and places them on the specified device, facilitating interaction between the environment and the actor-critic function approximators (NNs).

Parameters

venv (VecEnv) – Original vector environment, before the wrapper is applied.
device (torch.device) – CPU or specific GPU where observations, rewards and dones are placed after being converted to PyTorch tensors.

device

CPU or specific GPU where observations, rewards and dones are placed after being converted to PyTorch tensors.

Type: torch.device

num_envs

Size of vector environment.

Type: int

reset(): New vec env reset function

reset_single_env(env_id): Reset only one environment of the vector.

step_async(actions): New vec env step_async function

step_wait(): New vec env step_wait function
