Skip to content

MinigridSafe

MinigridSafe is a fork of Farama's Minigrid extended for safe RL research. It adds stochastic tile mechanics, adversarial NPCs, an ASCII-based environment builder, and automatic export to PRISM — making it directly usable as a benchmark suite for Tempest's shield synthesis pipeline.

All environments retain full Gymnasium API compatibility. Two agent modes are supported:

  • Cardinal — absolute N/S/E/W movement. No facing direction. Designed for stochastic environments and PRISM export.
  • Omnidirectional — egocentric rotation + forward movement. Maintains a facing direction.

Stochastic features and PRISM export are primarily designed around cardinal mode.

pip install git+https://github.com/isec-tugraz/MinigridSafe

Stochastic Behaviour: Slippery Tiles

Slippery tiles introduce probabilistic movement. When an agent steps onto a slippery tile, the resulting position is sampled from a distribution rather than being deterministic. Blocked outcomes (walls, lava, out-of-bounds) are folded back into a stay-in-place probability and the distribution is renormalised.

There are five tile types:

Type ASCII char Bias direction Description
SlipperyNorth u North Tilts movement probability toward north
SlipperySouth d South Tilts movement probability toward south
SlipperyEast r East Tilts movement probability toward east
SlipperyWest l West Tilts movement probability toward west
SlipperyTile S No directional bias; all four directions equally available

Configuration

Slip probabilities are set at environment construction and apply uniformly to all slippery tiles in the layout:

Parameter Type Description
slip_forward float Probability of moving in the intended direction
slip_left float Probability of slipping 90° left of intended direction
slip_right float Probability of slipping 90° right of intended direction
slip_backward float Probability of moving opposite to intended direction
action_stuck_prob float Probability of not moving at all
tilt_bonus float Extra probability mass added in the tile's tilt direction (directional tiles only)

Probabilities are automatically normalised so they always sum to 1, accounting for any blocked moves.


Adversaries

Adversaries are NPC agents that share the grid with the player. Each step, every adversary executes one action according to its assigned behaviour. If an adversary occupies the same cell as the agent, the agent is caught and the episode ends.

Behaviours

Behaviour Description
DoNothing Stationary; never moves
DoRandom Random cardinal movement each step
GoTo(target_pos) Pathfinds to a fixed target cell using A*
FollowPlayer Chases the agent using A*

Behaviours can be composed into a task list executed sequentially. When repeating=True the adversary loops through the task list indefinitely:

from minigrid.core.adversary import Adversary, GoTo, DoNothing

adv = Adversary(
    adversary_pos=(6, 6),
    adversary_dir=None,          # None for cardinal mode
    color="blue",
    tasks=[
        GoTo((2, 6)),
        DoNothing(2),            # wait 2 steps
        GoTo((6, 2)),
        DoNothing(2),
    ],
    repeating=True,
)
env.add_adversary(adv)

In cardinal mode adversaries have no facing direction and move axis-aligned. In omnidirectional mode they maintain a facing direction (0–3) and turn before moving.


Building Environments with ASCII

AsciiEnv lets you define a grid layout as a list of equal-length strings. Each character maps directly to a grid object:

Character Object
# Wall
. or Empty floor (valid spawn point if random_start=True)
G Goal
X Explicit floor tile
V Lava (terminates episode on entry)
S Generic slippery tile (no directional bias)
u SlipperyNorth
d SlipperySouth
l SlipperyWest
r SlipperyEast

Example

The Cliff0 environment defines a corridor with a lava strip and a slippery region below it:

from minigrid.envs.ascii_env import AsciiEnv

class MyEnv(AsciiEnv):
    def layout(self):
        return [
            "#########",
            "#A.VVV.G#",
            "#..VVV..#",
            "#..SSS..#",
            "#..SSS..#",
            "#..SSS..#",
            "#########",
        ]

Instantiate it with slip parameters:

env = MyEnv(
    agent_mode="cardinal",
    slip_forward=0.70,
    slip_left=0.15,
    slip_right=0.15,
)
obs, _ = env.reset()

Subclassing vs. inline construction

You can also pass a layout directly to AsciiEnv without subclassing:

env = AsciiEnv(
    layout=[
        "#########",
        "#A.....G#",
        "#..ddd..#",
        "#########",
    ],
    agent_mode="cardinal",
    slip_forward=0.8,
    slip_left=0.1,
    slip_right=0.1,
)

Automatic PRISM Translation

Any MinigridSafe environment can be exported to a PRISM model with a single call. The exporter inspects the grid, reads each tile's slip distribution, and emits a probabilistic model that Tempest can consume directly.

# Export as a string
prism_str = env.to_prism(model_type="mdp")

# Write directly to a file
env.write_prism("model.prism", model_type="mdp")

Model types

model_type When to use
"mdp" Single agent, no adversaries. Slippery tile transitions become probabilistic choices.
"smg" Adversaries present and modelled as independent players (Stochastic Multiplayer Game).

Adversary modes

When adversaries are present, the adversary_mode parameter controls how they are encoded:

adversary_mode Description
"none" Adversaries are ignored; produces a single-agent MDP.
"strategy" Adversaries follow their baked-in A* policy; produces a turn-based MDP.
"smg" Adversaries are free second/third players with their own action choices; produces an SMG.

What the exporter generates

  • FormulasAgentCannotMoveNorth/East/South/West (wall and boundary restrictions), AgentCaught (collision with any adversary)
  • Agent module — agent position variable and one probabilistic command per action, encoding slip distributions for each reachable slippery tile
  • Adversary modules — one module per adversary, with position variable and movement commands
  • Player blocks (SMG only) — assigns the agent's action labels to player 1 and each adversary's labels to players 2, 3, …

Full workflow with Tempest

import gymnasium as gym
import tempestpy.shielding as shielding
from minigrid.envs.ascii_env import AsciiEnv

# 1 — build and export the environment
env = AsciiEnv(
    layout=[...],
    agent_mode="cardinal",
    slip_forward=0.7,
    slip_left=0.15,
    slip_right=0.15,
)
env.reset()
env.write_prism("model.prism", model_type="mdp")

# 2 — synthesise a shield with Tempest
factory = shielding.ShieldFactory(
    model_path="model.prism",
    property_str="Pmin=? [F \"goal\"]",
)
shield = factory.compute(shielding.ShieldConfig(threshold=0.05))

# 3 — wrap the environment with the shield
shielded_env = shielding.PreShieldWrapper(env, shield)

Pre-built Environments

MinigridSafe ships with a set of registered environments that cover common benchmark scenarios:

Environment ID Description
Cliff-v0 Cliff-edge corridor; slip_forward=0.70, slip_left/right=0.15
Cliff1-v0 Cliff with tilt_bonus=0.10 on directional tiles
Cliff2-v0 Bidirectional cliff (north and south lava strips)
MiniGrid-LavaLakeS17-v0 Open 17×17 arena with central lava lake, cardinal mode
MiniGrid-LavaLakeS34-v0 Larger 34×34 variant
MiniGrid-LavaTrackS11-v0 Narrow lava-bordered track with slippery edges
MiniGrid-LavaPyramidS11-v0 Pyramid-shaped lava arrangement
MiniGrid-FourRoomsHeist-v0 Four-room layout, action_stuck_prob=0.20
AdversaryEnv-v0 9×9 grid with a single patrolling adversary
CircleAdversariesEnv-v0 Ring of adversaries surrounding the goal
MiniGrid-AdvTrack-19x19-v0 Adversary on a slippery track, agent_stuck_prob=0.20

All pre-built environments support both agent modes unless otherwise noted. Use gymnasium.make to instantiate them:

import gymnasium as gym

env = gym.make("Cliff-v0", agent_mode="cardinal")
obs, _ = env.reset()