pytorchrl.agent.env package

Subpackages

Submodules

pytorchrl.agent.env.env_wrappers module

class pytorchrl.agent.env.env_wrappers.TransposeImagesIfRequired(env=None, op=[2, 0, 1])[source]

Bases: gym.core.ObservationWrapper

When environment observations are images, this wrapper transposes their axes. It is useful when the images have shape (W,H,C), as they can be transposed “on the fly” to (C,W,H) so that PyTorch convolutions can be applied.

Parameters
  • env (gym.Env) – Original Gym environment, before the wrapper is applied.

  • op (list) – New axis ordering.

observation(ob)[source]

Transpose observation
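
The effect of the default op=[2, 0, 1] can be illustrated with a minimal, dependency-free sketch. It assumes op follows the same convention as numpy.transpose (output axis n is taken from input axis op[n]); the helper transpose_3d and the nested-list image are hypothetical and are not part of pytorchrl, whose wrapper presumably works on array observations.

```python
def transpose_3d(image, op=(2, 0, 1)):
    """Reorder the axes of a 3-D nested list (hypothetical helper).

    Assumes the numpy.transpose convention: output axis n is input axis op[n].
    """
    shape = (len(image), len(image[0]), len(image[0][0]))
    new_shape = tuple(shape[axis] for axis in op)
    out = [[[None] * new_shape[2] for _ in range(new_shape[1])]
           for _ in range(new_shape[0])]
    for i in range(shape[0]):
        for j in range(shape[1]):
            for k in range(shape[2]):
                src = (i, j, k)
                dst = tuple(src[axis] for axis in op)
                out[dst[0]][dst[1]][dst[2]] = image[i][j][k]
    return out


# Toy "image" with W=2, H=3, C=4; each entry records its own (w, h, c) index.
W, H, C = 2, 3, 4
image = [[[(w, h, c) for c in range(C)] for h in range(H)] for w in range(W)]

chw = transpose_3d(image)  # (W, H, C) -> (C, W, H)
chw_shape = (len(chw), len(chw[0]), len(chw[0][0]))
```

After the call, the channel axis comes first, so the entry at chw[c][w][h] is the one originally stored at image[w][h][c].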

pytorchrl.agent.env.make_env module

pytorchrl.agent.env.make_env.make_env(env_fn, env_kwargs, index_col_worker, index_grad_worker, index_env, log_dir=None, info_keywords=(), mode='train')[source]

Returns a function that handles the creation of a single environment, so that it can be executed in an independent thread.

Parameters
  • env_fn (func) – Function to create the environment.

  • env_kwargs (dict) – Keyword arguments of env_fn.

  • log_dir (str) – Target path for bench.Monitor logger values.

  • info_keywords (tuple) – Information keywords to be logged by bench.Monitor.

  • index_col_worker (int) – Index of the data collection worker running this environment.

  • index_grad_worker (int) – Index of the gradient worker running this environment.

  • index_env (int) – Index of this environment within the vector of environments.

  • mode (str) – “train” or “test”

Returns

_thunk – A function to create and return the environment. It also sets up the Monitor logging and applies the TransposeImagesIfRequired wrapper if environment observations are images.

Return type

func
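
The "return a function that creates the environment" pattern described above can be sketched as follows. This is only an illustration of deferred construction under stated assumptions: ToyEnv and make_env_sketch are hypothetical names, the real make_env takes more arguments, and the Monitor logging and image transposition are deliberately left out.

```python
class ToyEnv:
    """Hypothetical stand-in for a Gym environment."""

    def __init__(self, size=4):
        self.size = size


def make_env_sketch(env_fn, env_kwargs, index_env):
    """Return a zero-argument function that builds one environment.

    Hypothetical simplification of the thunk pattern; not pytorchrl code.
    """

    def _thunk():
        # The environment is only built when the thunk is called,
        # e.g. inside the thread or process that will own it.
        env = env_fn(**env_kwargs)
        env.index_env = index_env  # attached here for illustration only
        return env

    return _thunk


thunk = make_env_sketch(ToyEnv, {"size": 8}, index_env=0)
env = thunk()        # first environment instance
other_env = thunk()  # each call builds a fresh, independent instance
```

Because nothing is constructed until the thunk runs, the function itself can be handed to another thread or worker, which then creates its own copy of the environment.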

pytorchrl.agent.env.vec_env module

class pytorchrl.agent.env.vec_env.VecEnv[source]

Bases: object

Class to handle the creation of environment vectors.

classmethod create_factory(env_fn, env_kwargs={}, vec_env_size=1, log_dir=None, info_keywords=())[source]

Returns a function to create a vector of environments of size vec_env_size, so that it can be executed by any worker, remote or not.

Parameters
  • env_fn (func) – Function to create the environment.

  • env_kwargs (dict) – Keyword arguments of env_fn.

  • vec_env_size (int) – Size of the vector of environments.

  • log_dir (str) – Target path for envs to log information through bench.Monitor class.

  • info_keywords (tuple) – Information keywords to be logged by the bench.Monitor class.

Returns

  • make_vec_env (func) – Function to create a vector of environments.

  • dummy_env.action_space (gym.Space) – The environment's action space.

  • dummy_env.observation_space (gym.Space) – The environment's observation space.
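
The three return values listed above suggest that a throwaway environment is built once to read its spaces, while the vector itself is created later by the returned function. A minimal sketch of that idea follows; ToyEnv, create_factory_sketch, the tuple-valued spaces, and the plain list used as a "vector" are all hypothetical simplifications, and the real make_vec_env may accept arguments not shown here.

```python
class ToyEnv:
    """Hypothetical stand-in for a Gym environment with two spaces."""

    def __init__(self, n_actions=2, obs_dim=3):
        # Tuples stand in for gym.Space objects in this sketch.
        self.action_space = ("discrete", n_actions)
        self.observation_space = ("box", obs_dim)


def create_factory_sketch(env_fn, env_kwargs=None, vec_env_size=1):
    """Return (make_vec_env, action_space, observation_space).

    Hypothetical simplification of the factory pattern; not pytorchrl code.
    """
    env_kwargs = env_kwargs or {}

    def make_vec_env():
        # A plain list stands in for the real vectorized environment.
        return [env_fn(**env_kwargs) for _ in range(vec_env_size)]

    # A dummy environment is created only to inspect its spaces.
    dummy_env = env_fn(**env_kwargs)
    return make_vec_env, dummy_env.action_space, dummy_env.observation_space


make_vec_env, action_space, observation_space = create_factory_sketch(
    ToyEnv, {"n_actions": 4}, vec_env_size=3
)
envs = make_vec_env()
```

Returning the spaces alongside the factory lets a caller size its networks before any worker has actually instantiated the vector of environments.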

pytorchrl.agent.env.vector_wrappers module

class pytorchrl.agent.env.vector_wrappers.VecPyTorch(venv, device)[source]

Bases: pytorchrl.agent.env.openai_baselines_dependencies.vec_envs.vec_env_base.VecEnvWrapper

This wrapper converts observations, rewards and dones from NumPy arrays to PyTorch tensors and places them on the specified device, facilitating interaction between the environment and the actor-critic function approximators (NNs).

Parameters
  • venv (VecEnv) – Original vector environment, before the wrapper is applied.

  • device (torch.device) – CPU or specific GPU where observations, rewards and dones are placed after being converted to PyTorch tensors.

device

CPU or specific GPU where observations, rewards and dones are placed after being converted to PyTorch tensors.

Type

torch.device

num_envs

Size of vector environment.

Type

int

reset()[source]

New vec env reset function

reset_single_env(env_id)[source]

Reset only one environment of the vector.

step_async(actions)[source]

New vec env step_async function

step_wait()[source]

New vec env step_wait function

Module contents