pytorchrl.agent.env.openai_baselines_dependencies.vec_envs package

Submodules

pytorchrl.agent.env.openai_baselines_dependencies.vec_envs.dummy_vec_env module

class pytorchrl.agent.env.openai_baselines_dependencies.vec_envs.dummy_vec_env.DummyVecEnv(env_fns)[source]

Bases: pytorchrl.agent.env.openai_baselines_dependencies.vec_envs.vec_env_base.VecEnvBase

VecEnv that runs multiple environments sequentially, that is, the step and reset commands are sent to one environment at a time. Useful when debugging and when num_envs == 1 (in the latter case, it avoids communication overhead).
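The sequential behaviour described above can be sketched without the library. This is a minimal illustration, not the pytorchrl implementation: ToyEnv and SequentialVecEnv are hypothetical names, and details of the real class (for example, how finished episodes are handled) are deliberately left out.

```python
class ToyEnv:
    """Hypothetical stand-in environment whose observation is a step counter."""

    def __init__(self, horizon):
        self.horizon = horizon
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t

    def step(self, action):
        self.t += 1
        reward = float(action)
        done = self.t >= self.horizon
        return self.t, reward, done, {}


class SequentialVecEnv:
    """Hypothetical sketch: steps each environment in turn, in one process."""

    def __init__(self, env_fns):
        # Each entry of env_fns is a zero-argument callable that builds an env.
        self.envs = [fn() for fn in env_fns]
        self.actions = None

    def reset(self):
        return [env.reset() for env in self.envs]

    def step_async(self, actions):
        # Nothing runs yet; the actions are only stored.
        self.actions = actions

    def step_wait(self):
        # Visit one environment at a time, then regroup the results by field.
        results = [env.step(a) for env, a in zip(self.envs, self.actions)]
        obs, rews, dones, infos = zip(*results)
        return list(obs), list(rews), list(dones), list(infos)


venv = SequentialVecEnv([lambda: ToyEnv(horizon=2) for _ in range(3)])
first_obs = venv.reset()
venv.step_async([0, 1, 2])
obs, rews, dones, infos = venv.step_wait()
```

Because everything runs in the calling process, a breakpoint or stack trace inside an environment is directly visible, which is what makes this style convenient for debugging.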

get_images()[source]

Return RGB images from each environment

render(mode='human')[source]
reset()[source]

Reset all the environments and return an array of observations, or a dict of observation arrays. If step_async is still doing work, that work will be cancelled and step_wait() should not be called until step_async() is invoked again.

reset_single_env(num_env)[source]
step_async(actions)[source]

Tell all the environments to start taking a step with the given actions. Call step_wait() to get the results of the step. You should not call this if a step_async run is already pending.

step_wait()[source]

Wait for the step taken with step_async(). Returns (obs, rews, dones, infos):

  • obs: an array of observations, or a dict of arrays of observations.

  • rews: an array of rewards

  • dones: an array of “episode done” booleans

  • infos: a sequence of info objects

pytorchrl.agent.env.openai_baselines_dependencies.vec_envs.subproc_vec_env module

class pytorchrl.agent.env.openai_baselines_dependencies.vec_envs.subproc_vec_env.SubprocVecEnv(env_fns, spaces=None, context='spawn', in_series=1)[source]

Bases: pytorchrl.agent.env.openai_baselines_dependencies.vec_envs.vec_env_base.VecEnvBase

VecEnv that runs multiple environments in parallel in subprocesses and communicates with them via pipes. Recommended when num_envs > 1 and step() can be a bottleneck.
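The pipe-based command loop behind this design can be illustrated as follows. This is a sketch under stated assumptions, not the library's worker: toy_worker and the command names ("step", "close") are hypothetical, and a thread stands in for the subprocess so the example runs anywhere without multiprocessing start-method concerns.

```python
import threading
from multiprocessing import Pipe


def toy_worker(remote, start):
    """Hypothetical worker loop that owns one counter 'environment'."""
    state = start
    while True:
        cmd, data = remote.recv()
        if cmd == "step":
            state += data
            # Reply with an (obs, reward, done, info) tuple.
            remote.send((state, float(data), False, {}))
        elif cmd == "close":
            remote.close()
            break


# One pipe per worker; the parent keeps one end, the worker gets the other.
pipes = [Pipe() for _ in range(2)]
workers = [
    threading.Thread(target=toy_worker, args=(child, 10 * i), daemon=True)
    for i, (_, child) in enumerate(pipes)
]
for w in workers:
    w.start()
remotes = [parent for parent, _ in pipes]

# Send every action first so the workers can run concurrently, then collect.
for remote, action in zip(remotes, [1, 2]):
    remote.send(("step", action))
results = [remote.recv() for remote in remotes]
obs, rews, dones, infos = zip(*results)

for remote in remotes:
    remote.send(("close", None))
for w in workers:
    w.join()
```

Splitting the work into a send phase and a receive phase is what lets slow environments overlap, and it mirrors the step_async() / step_wait() pair in the interface.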

close_extras()[source]

Clean up the extra resources, beyond what’s in this base class. Only runs when not self.closed.

get_images()[source]

Return RGB images from each environment

reset()[source]

Reset all the environments and return an array of observations, or a dict of observation arrays. If step_async is still doing work, that work will be cancelled and step_wait() should not be called until step_async() is invoked again.

reset_single_env(num_env)[source]
step_async(actions)[source]

Tell all the environments to start taking a step with the given actions. Call step_wait() to get the results of the step. You should not call this if a step_async run is already pending.

step_wait()[source]

Wait for the step taken with step_async(). Returns (obs, rews, dones, infos):

  • obs: an array of observations, or a dict of arrays of observations.

  • rews: an array of rewards

  • dones: an array of “episode done” booleans

  • infos: a sequence of info objects

pytorchrl.agent.env.openai_baselines_dependencies.vec_envs.subproc_vec_env.worker(remote, parent_remote, env_fn_wrappers)[source]

pytorchrl.agent.env.openai_baselines_dependencies.vec_envs.util module

Helpers for dealing with vectorized environments.

pytorchrl.agent.env.openai_baselines_dependencies.vec_envs.util.copy_obs_dict(obs)[source]

Deep-copy an observation dict.

pytorchrl.agent.env.openai_baselines_dependencies.vec_envs.util.dict_to_obs(obs_dict)[source]

Convert an observation dict into a raw array if the original observation space was not a Dict space.

pytorchrl.agent.env.openai_baselines_dependencies.vec_envs.util.obs_space_info(obs_space)[source]

Get dict-structured information about a gym.Space.

Returns

  • keys: a list of dict keys.

  • shapes: a dict mapping keys to shapes.

  • dtypes: a dict mapping keys to dtypes.

Return type

A tuple (keys, shapes, dtypes)

pytorchrl.agent.env.openai_baselines_dependencies.vec_envs.util.obs_to_dict(obs)[source]

Convert an observation into a dict.
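The two conversion helpers, obs_to_dict and dict_to_obs, can be read as a round trip. The sketch below shows one plausible way to implement it, assuming a non-dict observation is stored under a single placeholder key (None here). The placeholder key and the _sketch function names are assumptions made for illustration.

```python
def obs_to_dict_sketch(obs):
    """Wrap a raw observation in a dict; leave dict observations untouched."""
    if isinstance(obs, dict):
        return obs
    return {None: obs}


def dict_to_obs_sketch(obs_dict):
    """Undo the wrapping when the dict only holds the placeholder key."""
    if set(obs_dict.keys()) == {None}:
        return obs_dict[None]
    return obs_dict


raw = [0.1, 0.2]
wrapped = obs_to_dict_sketch(raw)
unwrapped = dict_to_obs_sketch(wrapped)

structured = {"image": [1, 2], "vector": [3]}
round_trip = dict_to_obs_sketch(obs_to_dict_sketch(structured))
```

Treating every observation as a dict internally lets buffering code handle Dict and non-Dict spaces with a single code path.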

pytorchrl.agent.env.openai_baselines_dependencies.vec_envs.util.tile_images(img_nhwc)[source]

Tile N images into one big PxQ image. (P, Q) are chosen to be as close as possible, and if N is a perfect square, then P=Q.

Parameters

img_nhwc: list or array of images, ndim=4 once turned into an array (n = batch index, h = height, w = width, c = channel)

Returns

bigim_HWc, ndarray with ndim=3
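The choice of grid dimensions can be illustrated with a little arithmetic. The sketch below shows one common way to pick a near-square grid; grid_shape is a hypothetical helper, and the exact rounding used by tile_images is an assumption.

```python
import math


def grid_shape(n):
    """Return (rows, cols) of a near-square grid with at least n cells."""
    rows = math.ceil(math.sqrt(n))
    cols = math.ceil(n / rows)
    return rows, cols


shapes = {n: grid_shape(n) for n in (1, 4, 5, 9, 10)}
```

With this rule a perfect square such as N = 9 gives a 3x3 grid, while N = 5 gives 3x2 with one unused cell.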

pytorchrl.agent.env.openai_baselines_dependencies.vec_envs.vec_env_base module

class pytorchrl.agent.env.openai_baselines_dependencies.vec_envs.vec_env_base.CloudpickleWrapper(x)[source]

Bases: object

Uses cloudpickle to serialize contents (otherwise multiprocessing tries to use pickle)
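The reason for the wrapper becomes clearer when looking at what the standard pickle module cannot serialize. Environment constructors are often lambdas or closures, and pickle serializes functions by name only. The sketch below uses only the standard library; can_pickle and make_env_fn are hypothetical helpers.

```python
import pickle


def can_pickle(obj):
    """Report whether the standard pickle module can serialize obj."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


def make_env_fn(env_id):
    """Build a closure of the kind commonly passed as an env_fn."""
    def _thunk():
        return {"env_id": env_id}
    return _thunk


closure_ok = can_pickle(make_env_fn("toy-v0"))  # local function: not picklable
builtin_ok = can_pickle(len)                    # importable by name: picklable
```

The third-party cloudpickle package serializes such functions by value, which is why a wrapper of this kind is needed before handing them to multiprocessing.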

class pytorchrl.agent.env.openai_baselines_dependencies.vec_envs.vec_env_base.VecEnvBase(num_envs, observation_space, action_space)[source]

Bases: abc.ABC

An abstract asynchronous, vectorized environment. Used to batch data from multiple copies of an environment, so that each observation becomes a batch of observations, and the expected action is a batch of actions to be applied per-environment.

close()[source]
close_extras()[source]

Clean up the extra resources, beyond what’s in this base class. Only runs when not self.closed.

closed = False
get_images()[source]

Return RGB images from each environment

get_viewer()[source]
metadata = {'render.modes': ['human', 'rgb_array']}
render(mode='human')[source]
abstract reset()[source]

Reset all the environments and return an array of observations, or a dict of observation arrays. If step_async is still doing work, that work will be cancelled and step_wait() should not be called until step_async() is invoked again.

step(actions)[source]

Step the environments synchronously. This is available for backwards compatibility.
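A synchronous step is presumably just the asynchronous pair called back to back. The sketch below illustrates that relationship with a hypothetical base class, AsyncVecEnvSketch, and a toy subclass, Doubler. It is not the library's code.

```python
from abc import ABC, abstractmethod


class AsyncVecEnvSketch(ABC):
    """Hypothetical base class exposing the async pair plus a sync helper."""

    @abstractmethod
    def step_async(self, actions):
        """Start a step with the given batch of actions."""

    @abstractmethod
    def step_wait(self):
        """Return (obs, rews, dones, infos) for the pending step."""

    def step(self, actions):
        # Synchronous convenience: start the step, then block on the result.
        self.step_async(actions)
        return self.step_wait()


class Doubler(AsyncVecEnvSketch):
    """Toy subclass whose observation is twice the action."""

    def step_async(self, actions):
        self._pending = list(actions)

    def step_wait(self):
        obs = [2 * a for a in self._pending]
        rews = [0.0] * len(obs)
        dones = [False] * len(obs)
        infos = [{} for _ in obs]
        return obs, rews, dones, infos


obs, rews, dones, infos = Doubler().step([1, 2, 3])
```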

abstract step_async(actions)[source]

Tell all the environments to start taking a step with the given actions. Call step_wait() to get the results of the step. You should not call this if a step_async run is already pending.

abstract step_wait()[source]

Wait for the step taken with step_async(). Returns (obs, rews, dones, infos):

  • obs: an array of observations, or a dict of arrays of observations.

  • rews: an array of rewards

  • dones: an array of “episode done” booleans

  • infos: a sequence of info objects

property unwrapped
viewer = None
class pytorchrl.agent.env.openai_baselines_dependencies.vec_envs.vec_env_base.VecEnvObservationWrapper(venv, observation_space=None, action_space=None)[source]

Bases: pytorchrl.agent.env.openai_baselines_dependencies.vec_envs.vec_env_base.VecEnvWrapper

abstract process(obs)[source]
reset()[source]

Reset all the environments and return an array of observations, or a dict of observation arrays. If step_async is still doing work, that work will be cancelled and step_wait() should not be called until step_async() is invoked again.

step_wait()[source]

Wait for the step taken with step_async(). Returns (obs, rews, dones, infos):

  • obs: an array of observations, or a dict of arrays of observations.

  • rews: an array of rewards

  • dones: an array of “episode done” booleans

  • infos: a sequence of info objects
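An observation wrapper of this kind typically routes the observations returned by reset() and step_wait() through process() and passes everything else along unchanged. The sketch below assumes that behaviour. InnerVecEnv, ObservationWrapperSketch and Negate are hypothetical names.

```python
class InnerVecEnv:
    """Hypothetical wrapped environment with two sub-environments."""

    def reset(self):
        return [1, 2]

    def step_async(self, actions):
        self._actions = actions

    def step_wait(self):
        obs = [a + 1 for a in self._actions]
        n = len(obs)
        return obs, [0.0] * n, [False] * n, [{} for _ in range(n)]


class ObservationWrapperSketch:
    """Hypothetical wrapper that applies process() to every observation batch."""

    def __init__(self, venv):
        self.venv = venv

    def process(self, obs):
        raise NotImplementedError

    def reset(self):
        return self.process(self.venv.reset())

    def step_async(self, actions):
        self.venv.step_async(actions)

    def step_wait(self):
        obs, rews, dones, infos = self.venv.step_wait()
        # Only the observations are transformed.
        return self.process(obs), rews, dones, infos


class Negate(ObservationWrapperSketch):
    def process(self, obs):
        return [-o for o in obs]


wrapped = Negate(InnerVecEnv())
reset_obs = wrapped.reset()
wrapped.step_async([10, 20])
step_obs, rews, dones, infos = wrapped.step_wait()
```

A subclass then only has to implement process(), which keeps normalization or reshaping logic in one place.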

class pytorchrl.agent.env.openai_baselines_dependencies.vec_envs.vec_env_base.VecEnvWrapper(venv, observation_space=None, action_space=None)[source]

Bases: pytorchrl.agent.env.openai_baselines_dependencies.vec_envs.vec_env_base.VecEnvBase

An environment wrapper that applies to an entire batch of environments at once.

close()[source]
get_images()[source]

Return RGB images from each environment

render(mode='human')[source]
abstract reset()[source]

Reset all the environments and return an array of observations, or a dict of observation arrays. If step_async is still doing work, that work will be cancelled and step_wait() should not be called until step_async() is invoked again.

step_async(actions)[source]

Tell all the environments to start taking a step with the given actions. Call step_wait() to get the results of the step. You should not call this if a step_async run is already pending.

abstract step_wait()[source]

Wait for the step taken with step_async(). Returns (obs, rews, dones, infos):

  • obs: an array of observations, or a dict of arrays of observations.

  • rews: an array of rewards

  • dones: an array of “episode done” booleans

  • infos: a sequence of info objects

pytorchrl.agent.env.openai_baselines_dependencies.vec_envs.vec_env_base.clear_mpi_env_vars()[source]

Running from mpi4py import MPI calls MPI_Init by default. If the child process has MPI environment variables, MPI will think that the child process is an MPI process just like the parent and do bad things such as hang. This context manager is a hacky way to clear those environment variables temporarily, for example when starting multiprocessing Processes.
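A context manager of this kind can be sketched with the standard library alone. The version below is illustrative only: clear_env_vars_sketch is a hypothetical name, and the variable prefixes (OMPI_ and PMI_) are assumptions about which variables count as MPI-related.

```python
import contextlib
import os


@contextlib.contextmanager
def clear_env_vars_sketch(prefixes=("OMPI_", "PMI_")):
    """Temporarily remove environment variables that start with any prefix."""
    removed = {}
    for key in list(os.environ):
        if key.startswith(tuple(prefixes)):
            removed[key] = os.environ.pop(key)
    try:
        yield
    finally:
        # Restore the variables even if the body raised an exception.
        os.environ.update(removed)


os.environ["OMPI_DEMO_VAR"] = "1"  # demo variable, set only for this example
with clear_env_vars_sketch():
    # Child processes started here would not inherit the variable.
    inside = "OMPI_DEMO_VAR" in os.environ
after = os.environ.get("OMPI_DEMO_VAR")
```

Restoring the variables in a finally clause keeps the parent process unchanged once the block exits.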

Module contents