MinigridSafe
MinigridSafe is a fork of Farama's Minigrid extended for safe RL research. It adds stochastic tile mechanics, adversarial NPCs, an ASCII-based environment builder, and automatic export to PRISM — making it directly usable as a benchmark suite for Tempest's shield synthesis pipeline.
All environments retain full Gymnasium API compatibility. Two agent modes are supported:
- Cardinal — absolute N/S/E/W movement. No facing direction. Designed for stochastic environments and PRISM export.
- Omnidirectional — egocentric rotation + forward movement. Maintains a facing direction.
Stochastic features and PRISM export are primarily designed around cardinal mode.
pip install git+https://github.com/isec-tugraz/MinigridSafe
Stochastic Behaviour: Slippery Tiles
Slippery tiles introduce probabilistic movement. When an agent steps onto a slippery tile, the resulting position is sampled from a distribution rather than being deterministic. Blocked outcomes (walls, lava, out-of-bounds) are folded back into a stay-in-place probability and the distribution is renormalised.
There are five tile types:
| Type | ASCII char | Bias direction | Description |
|---|---|---|---|
SlipperyNorth |
u |
North | Tilts movement probability toward north |
SlipperySouth |
d |
South | Tilts movement probability toward south |
SlipperyEast |
r |
East | Tilts movement probability toward east |
SlipperyWest |
l |
West | Tilts movement probability toward west |
SlipperyTile |
S |
— | No directional bias; all four directions equally available |
Configuration
Slip probabilities are set at environment construction and apply uniformly to all slippery tiles in the layout:
| Parameter | Type | Description |
|---|---|---|
slip_forward |
float |
Probability of moving in the intended direction |
slip_left |
float |
Probability of slipping 90° left of intended direction |
slip_right |
float |
Probability of slipping 90° right of intended direction |
slip_backward |
float |
Probability of moving opposite to intended direction |
action_stuck_prob |
float |
Probability of not moving at all |
tilt_bonus |
float |
Extra probability mass added in the tile's tilt direction (directional tiles only) |
Probabilities are automatically normalised so they always sum to 1, accounting for any blocked moves.
Adversaries
Adversaries are NPC agents that share the grid with the player. Each step, every adversary executes one action according to its assigned behaviour. If an adversary occupies the same cell as the agent, the agent is caught and the episode ends.
Behaviours
| Behaviour | Description |
|---|---|
DoNothing |
Stationary; never moves |
DoRandom |
Random cardinal movement each step |
GoTo(target_pos) |
Pathfinds to a fixed target cell using A* |
FollowPlayer |
Chases the agent using A* |
Behaviours can be composed into a task list executed sequentially. When repeating=True the adversary loops through the task list indefinitely:
from minigrid.core.adversary import Adversary, GoTo, DoNothing
adv = Adversary(
adversary_pos=(6, 6),
adversary_dir=None, # None for cardinal mode
color="blue",
tasks=[
GoTo((2, 6)),
DoNothing(2), # wait 2 steps
GoTo((6, 2)),
DoNothing(2),
],
repeating=True,
)
env.add_adversary(adv)
In cardinal mode adversaries have no facing direction and move axis-aligned. In omnidirectional mode they maintain a facing direction (0–3) and turn before moving.
Building Environments with ASCII
AsciiEnv lets you define a grid layout as a list of equal-length strings. Each character maps directly to a grid object:
| Character | Object |
|---|---|
# |
Wall |
. or |
Empty floor (valid spawn point if random_start=True) |
G |
Goal |
X |
Explicit floor tile |
V |
Lava (terminates episode on entry) |
S |
Generic slippery tile (no directional bias) |
u |
SlipperyNorth |
d |
SlipperySouth |
l |
SlipperyWest |
r |
SlipperyEast |
Example
The Cliff0 environment defines a corridor with a lava strip and a slippery region below it:
from minigrid.envs.ascii_env import AsciiEnv
class MyEnv(AsciiEnv):
def layout(self):
return [
"#########",
"#A.VVV.G#",
"#..VVV..#",
"#..SSS..#",
"#..SSS..#",
"#..SSS..#",
"#########",
]
Instantiate it with slip parameters:
env = MyEnv(
agent_mode="cardinal",
slip_forward=0.70,
slip_left=0.15,
slip_right=0.15,
)
obs, _ = env.reset()
Subclassing vs. inline construction
You can also pass a layout directly to AsciiEnv without subclassing:
env = AsciiEnv(
layout=[
"#########",
"#A.....G#",
"#..ddd..#",
"#########",
],
agent_mode="cardinal",
slip_forward=0.8,
slip_left=0.1,
slip_right=0.1,
)
Automatic PRISM Translation
Any MinigridSafe environment can be exported to a PRISM model with a single call. The exporter inspects the grid, reads each tile's slip distribution, and emits a probabilistic model that Tempest can consume directly.
# Export as a string
prism_str = env.to_prism(model_type="mdp")
# Write directly to a file
env.write_prism("model.prism", model_type="mdp")
Model types
model_type |
When to use |
|---|---|
"mdp" |
Single agent, no adversaries. Slippery tile transitions become probabilistic choices. |
"smg" |
Adversaries present and modelled as independent players (Stochastic Multiplayer Game). |
Adversary modes
When adversaries are present, the adversary_mode parameter controls how they are encoded:
adversary_mode |
Description |
|---|---|
"none" |
Adversaries are ignored; produces a single-agent MDP. |
"strategy" |
Adversaries follow their baked-in A* policy; produces a turn-based MDP. |
"smg" |
Adversaries are free second/third players with their own action choices; produces an SMG. |
What the exporter generates
- Formulas —
AgentCannotMoveNorth/East/South/West(wall and boundary restrictions),AgentCaught(collision with any adversary) - Agent module — agent position variable and one probabilistic command per action, encoding slip distributions for each reachable slippery tile
- Adversary modules — one module per adversary, with position variable and movement commands
- Player blocks (SMG only) — assigns the agent's action labels to player 1 and each adversary's labels to players 2, 3, …
Full workflow with Tempest
import gymnasium as gym
import tempestpy.shielding as shielding
from minigrid.envs.ascii_env import AsciiEnv
# 1 — build and export the environment
env = AsciiEnv(
layout=[...],
agent_mode="cardinal",
slip_forward=0.7,
slip_left=0.15,
slip_right=0.15,
)
env.reset()
env.write_prism("model.prism", model_type="mdp")
# 2 — synthesise a shield with Tempest
factory = shielding.ShieldFactory(
model_path="model.prism",
property_str="Pmin=? [F \"goal\"]",
)
shield = factory.compute(shielding.ShieldConfig(threshold=0.05))
# 3 — wrap the environment with the shield
shielded_env = shielding.PreShieldWrapper(env, shield)
Pre-built Environments
MinigridSafe ships with a set of registered environments that cover common benchmark scenarios:
| Environment ID | Description |
|---|---|
Cliff-v0 |
Cliff-edge corridor; slip_forward=0.70, slip_left/right=0.15 |
Cliff1-v0 |
Cliff with tilt_bonus=0.10 on directional tiles |
Cliff2-v0 |
Bidirectional cliff (north and south lava strips) |
MiniGrid-LavaLakeS17-v0 |
Open 17×17 arena with central lava lake, cardinal mode |
MiniGrid-LavaLakeS34-v0 |
Larger 34×34 variant |
MiniGrid-LavaTrackS11-v0 |
Narrow lava-bordered track with slippery edges |
MiniGrid-LavaPyramidS11-v0 |
Pyramid-shaped lava arrangement |
MiniGrid-FourRoomsHeist-v0 |
Four-room layout, action_stuck_prob=0.20 |
AdversaryEnv-v0 |
9×9 grid with a single patrolling adversary |
CircleAdversariesEnv-v0 |
Ring of adversaries surrounding the goal |
MiniGrid-AdvTrack-19x19-v0 |
Adversary on a slippery track, agent_stuck_prob=0.20 |
All pre-built environments support both agent modes unless otherwise noted. Use gymnasium.make to instantiate them:
import gymnasium as gym
env = gym.make("Cliff-v0", agent_mode="cardinal")
obs, _ = env.reset()