cambrian.envs.maze_env
Defines the MjCambrianMazeEnv class.
Classes
- MjCambrianMapEntity: Enum representing different states in a grid.
- MjCambrianMazeConfig: Defines a map config. Used for type hinting.
- MjCambrianMazeEnvConfig: Config for the maze environment. Holds the maze configs (mazes) and the maze selection function (maze_selection_fn).
- MjCambrianMazeEnv: A MjCambrianEnv defines a gymnasium environment that's based off mujoco.
- MjCambrianMaze: The maze class. Generates a maze from a given map and provides utility functions for working with the maze.
- MjCambrianMazeStore: This is a simple class to store a collection of mazes.
Module Contents
- class MjCambrianMapEntity(*args, **kwds)
Bases:
enum.Enum
Enum representing different states in a grid.
- Variables:
RESET (str) – Initial reset position of the agent. Can include agent IDs in the format “R:<agent id>”.
WALL (str) – Represents a wall in the grid. Can include texture IDs in the format “1:<texture id>”.
EMPTY (str) – Represents an empty space in the grid.
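The "<symbol>:<id>" convention described above can be illustrated with a small parser. This is a minimal sketch, not the library's implementation: `parse_cell` is a hypothetical helper, and it only assumes that a cell is a symbol optionally followed by a colon and an identifier.

```python
from typing import Optional, Tuple


def parse_cell(cell: str) -> Tuple[str, Optional[str]]:
    """Split a map cell such as "R:O" into (symbol, identifier).

    Hypothetical helper: the identifier is None when the cell has no
    ":<id>" suffix, e.g. a plain "R" or "1".
    """
    symbol, sep, ident = cell.partition(":")
    return symbol, (ident if sep else None)


print(parse_cell("R:O"))      # reset position for agent id "O"
print(parse_cell("1:brick"))  # wall with texture id "brick"
print(parse_cell("1"))        # wall with no explicit texture id
```

A cell without a suffix yields `None` as its identifier, which a caller could then map to the "default" entry of the relevant id map.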
- class MjCambrianMazeConfig
Bases:
hydra_config.HydraContainerConfig
Defines a map config. Used for type hinting.
- Variables:
xml (MjCambrianXML) – The xml for the maze. This is the xml that will be used to create the maze.
map (str) – The map to use for the maze. It’s a 2D array where each element is a string and corresponds to a “pixel” in the map. See maze.py for info on what different strings mean. This is actually a List[List[str]], but we keep it as a string for readability when dumping the config to a file. Will convert to list when creating the maze.
scale (float) – The maze scaling for the continuous coordinates in the MuJoCo simulation.
height (float) – The height of the walls in the MuJoCo simulation.
hflip (bool) – Whether to flip the maze horizontally. If True, the maze will be flipped along the x-axis.
vflip (bool) – Whether to flip the maze vertically. If True, the maze will be flipped along the y-axis.
rotation (float) – The rotation of the maze in degrees. The rotation is applied after the flip.
wall_texture_map (Dict[str, List[str]]) – The mapping from texture id to texture names. Textures in the list are chosen at random. If the list is of length 1, only one texture will be used. A length >= 1 is required. The keyword “default” is required for walls denoted simply as 1 or W. Other walls are specified as 1/W:<texture id>.
agent_id_map (Dict[str, List[str]]) – The mapping from agent id to agent names. Agents in the list are chosen at random. If the list is of length 1, only one agent will be used. A length >= 1 is required for each agent name. Effectively, this means you can set a reset position as R:<agent id> in the map and this map is used to assign to a group of agents. For instance, R:O in the map and including O: [agent1, agent2] in the agent_id_map will assign the reset position to either agent1 or agent2 at random. “default” is required for agents denoted simply as R.
enabled (bool) – Whether the maze is enabled or not.
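The lookup rules for wall_texture_map and agent_id_map can be sketched as follows. This is an illustration under stated assumptions rather than the actual code: `pick_name` is a hypothetical helper, and the agent names are the ones used in the example above plus a made-up "agent0" for the default group.

```python
import random
from typing import Dict, List, Optional


def pick_name(
    id_map: Dict[str, List[str]], ident: Optional[str], rng: random.Random
) -> str:
    """Pick one name for an id; a missing id falls back to "default".

    Hypothetical helper mirroring the documented rules: every list must
    contain at least one entry, and entries are chosen at random.
    """
    key = "default" if ident is None else ident
    candidates = id_map[key]
    if len(candidates) < 1:
        raise ValueError(f"id {key!r} needs at least one name")
    return rng.choice(candidates)


# "agent0" is an illustrative name, not taken from the library.
agent_id_map = {"default": ["agent0"], "O": ["agent1", "agent2"]}
rng = random.Random(0)

print(pick_name(agent_id_map, "O", rng))   # agent1 or agent2
print(pick_name(agent_id_map, None, rng))  # plain "R" uses "default"
```

The same lookup would apply to wall textures, with the texture id taken from a cell such as 1:<texture id>.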
- class MjCambrianMazeEnvConfig
Bases:
cambrian.envs.env.MjCambrianEnvConfig
- Variables:
mazes (Dict[str, MjCambrianMazeEnvConfig]) – The configs for the mazes. Each maze will be loaded into the scene and the agent will be placed in a maze at each reset.
maze_selection_fn (MjCambrianMazeSelectionFn) – The function used to select the maze. It is called at each reset to select the maze to use. See MjCambrianMazeSelectionFn and maze.py for more info.
- class MjCambrianMazeEnv(config, **kwargs)
Bases:
cambrian.envs.env.MjCambrianEnv
A MjCambrianEnv defines a gymnasium environment that’s based off mujoco.
Notes: This is an overridden version of the MujocoEnv class. The two main differences are that it supports (and resets) multiple agents and that it uses its own custom renderer. It also reduces the need to create temporary xml files, which MujocoEnv had to load. It is essentially a copy of MujocoEnv with these two major changes.
- Parameters:
config (MjCambrianEnvConfig) – The config object.
name (Optional[str]) – The name of the environment. This is added as an overlay to the renderer.
- reset(*, seed=None, options=None)
Reset the environment.
Will reset all underlying components (the maze, the agents, etc.). The simulation will then be stepped once to ensure that the observations are up-to-date.
- Returns:
Tuple[ObsType, InfoType] – The observations for each agent and the info dict for each agent.
- property maze: MjCambrianMaze
Returns the current maze.
- property maze_store: MjCambrianMazeStore
Returns the maze store.
- class MjCambrianMaze(config, name)
The maze class. Generates a maze from a given map and provides utility functions for working with the maze.
- reset(spec, *, reset_occupied=True)
Resets the maze. Will reset the wall textures and reset the occupied locations, if desired.
- compute_optimal_path(start, target, *, obstacles=[])
Computes the optimal path from the start position to the target.
Uses a BFS to find the shortest path.
- Keyword Arguments:
obstacles (List[Tuple[int, int]]) – Positions to avoid when computing the path. Each obstacle is a tuple of (row, col). Defaults to [].
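For reference, a breadth-first search over a grid map with extra obstacles might look like the sketch below. It is not the library's code: `bfs_shortest_path` is a hypothetical function, it works purely on (row, col) indices, and the use of "0" for empty cells in the sample grid is an assumption.

```python
from collections import deque
from typing import Dict, Iterable, List, Optional, Tuple

Cell = Tuple[int, int]


def bfs_shortest_path(
    grid: List[List[str]],
    start: Cell,
    target: Cell,
    obstacles: Iterable[Cell] = (),
) -> Optional[List[Cell]]:
    """Return the shortest 4-connected path as (row, col) cells, or None."""
    blocked = set(obstacles)
    rows, cols = len(grid), len(grid[0])
    parents: Dict[Cell, Optional[Cell]] = {start: None}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == target:
            # Walk the parent links back to the start, then reverse.
            path = []
            while cell is not None:
                path.append(cell)
                cell = parents[cell]
            return path[::-1]
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if not (0 <= nr < rows and 0 <= nc < cols):
                continue
            # Walls are "1" or "W", optionally followed by ":<texture id>".
            is_wall = grid[nr][nc].split(":")[0] in ("1", "W")
            if is_wall or nxt in blocked or nxt in parents:
                continue
            parents[nxt] = cell
            queue.append(nxt)
    return None


# Assumption: "0" marks an empty cell in this illustrative grid.
grid = [
    ["R", "0", "0"],
    ["1", "1", "0"],
    ["0", "0", "0"],
]
path = bfs_shortest_path(grid, (0, 0), (2, 0))
blocked_path = bfs_shortest_path(grid, (0, 0), (2, 0), obstacles=[(1, 2)])
print(path)
print(blocked_path)
```

Because BFS expands cells in order of distance from the start, the first time the target is dequeued the reconstructed path is a shortest one. Using an immutable default for the obstacles argument in the sketch avoids sharing one list between calls.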
- generate_reset_pos(agent, *, add_as_occupied=True)
Generates a random reset position for an agent.
- Keyword Arguments:
add_as_occupied (bool) – Whether to add the chosen location to the occupied locations. Defaults to True.
- Returns:
np.ndarray – The chosen position, with shape (2,).
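The role of add_as_occupied can be sketched with a simplified sampler. Everything here is an assumption made for illustration: `sample_reset_pos` is a hypothetical function, it returns a plain tuple instead of an np.ndarray, and the conversion from a grid cell to continuous coordinates (cell centers multiplied by the scale) is a guess at the convention, not the documented one.

```python
import random
from typing import List, Set, Tuple

Cell = Tuple[int, int]


def sample_reset_pos(
    reset_cells: List[Cell],
    occupied: Set[Cell],
    scale: float,
    rng: random.Random,
    *,
    add_as_occupied: bool = True,
) -> Tuple[float, float]:
    """Pick a random free reset cell and return an (x, y) position."""
    free = [cell for cell in reset_cells if cell not in occupied]
    if not free:
        raise RuntimeError("no free reset cells left")
    row, col = rng.choice(free)
    if add_as_occupied:
        # Later calls will skip this cell until the occupied set is cleared.
        occupied.add((row, col))
    # Assumed convention: x follows columns, y follows rows, cell centers.
    return ((col + 0.5) * scale, (row + 0.5) * scale)


rng = random.Random(0)
occupied: Set[Cell] = set()
reset_cells = [(0, 0), (2, 1)]
first = sample_reset_pos(reset_cells, occupied, 2.0, rng)
second = sample_reset_pos(reset_cells, occupied, 2.0, rng)
print(first, second)
```

With add_as_occupied left at True, two agents sampled in a row cannot land on the same cell. Clearing the occupied set corresponds to what reset(spec, reset_occupied=True) is described as doing.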
- property config: MjCambrianMazeEnvConfig
Returns the config.
- class MjCambrianMazeStore(maze_configs, maze_selection_fn)
This is a simple class to store a collection of mazes.
- property current_maze: MjCambrianMaze
Returns the current maze.
- property maze_list: List[MjCambrianMaze]
Returns the list of mazes.
- select_maze_schedule(env, *, schedule='linear', total_timesteps, n_envs, lam_0=-2.0, lam_n=2.0)
Selects a maze based on a schedule. The scheduled selections are based on the order of the mazes in the list.
- Keyword Arguments:
schedule (Optional[str]) – The schedule to use. One of “linear”, “exponential”, or “logistic”. Defaults to “linear”.
total_timesteps (int) – The total number of timesteps in the training schedule. Unused if schedule is None. Required otherwise.
n_envs (int) – The number of environments. Unused if schedule is None. Required otherwise.
lam_0 (Optional[float]) – The lambda value at the start of the schedule. Unused if schedule is None.
lam_n (Optional[float]) – The lambda value at the end of the schedule. Unused if schedule is None.
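One plausible reading of a schedule over an ordered maze list is sketched below. The formulas are assumptions, not the library's: `scheduled_maze_index` is a hypothetical function, training progress is estimated as per-environment steps times n_envs divided by total_timesteps, lambda is interpolated from lam_0 to lam_n, and the "exponential" and "logistic" curves are normalized to the range [0, 1] before being mapped to an index.

```python
import math


def scheduled_maze_index(
    steps_per_env: int,
    *,
    n_mazes: int,
    schedule: str = "linear",
    total_timesteps: int,
    n_envs: int,
    lam_0: float = -2.0,
    lam_n: float = 2.0,
) -> int:
    """Map training progress to an index into an ordered maze list."""
    # Assumption: all environments advance in lockstep.
    progress = steps_per_env * n_envs / total_timesteps
    progress = min(max(progress, 0.0), 1.0)
    lam = lam_0 + (lam_n - lam_0) * progress

    def normalize(f):
        # Rescale f so it runs from 0 at lam_0 to 1 at lam_n
        # (requires lam_0 != lam_n).
        return (f(lam) - f(lam_0)) / (f(lam_n) - f(lam_0))

    if schedule == "linear":
        fraction = progress
    elif schedule == "exponential":
        fraction = normalize(math.exp)
    elif schedule == "logistic":
        fraction = normalize(lambda x: 1.0 / (1.0 + math.exp(-x)))
    else:
        raise ValueError(f"unknown schedule: {schedule!r}")
    return min(int(fraction * n_mazes), n_mazes - 1)


for name in ("linear", "exponential", "logistic"):
    indices = [
        scheduled_maze_index(
            steps, n_mazes=4, schedule=name, total_timesteps=1000, n_envs=4
        )
        for steps in range(0, 251, 50)
    ]
    print(name, indices)
```

Under these assumptions, every schedule starts at the first maze and ends at the last one. They differ only in how quickly the index advances in between, which is why the order of the mazes in the list matters.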