Shield Synthesis for Safe RL
Welcome to the Home of Tempest!
This is the project page of Tempest, our shield synthesis tools for probabilistic environments, and tempestpy, aimed at making the integration of shielding into your favorite reinforcement learning frameworks as easy as possible. You can visit the respective pages for Tempest and tempestpy for guides on how to install it on your machine.
The aim of this project is to foster the research of shielded reinforcement learning. Alongside the synthesis tools, we present MinigridSafe and Minigrid2PRISM. MinigridSafe is an extension of the popular Minigrid library, adding features that allow the modelling of uncertainty and additional, potentially adversarial, actors into grid world environments. Visit MinigridSafe for more information.
Tempest uses a discrete, stochastic model of an environment to synthesize a shield. In order to automate the integration of shields into the training of agents in MinigridSafe environments, we provide Minigrid2PRISM. Minigrid2PRISM maps the internal dynamics of any MinigridSafe into a symbolic model in the PRISM modelling language. Using these models, shielded training can be fully automated using tempestpy and Minigrid2PRISM.
If you want to try out how shielding can be used to enable safe reinforcement learning, we highly recommend you to try Tempest in action. This project packs all the relevant source code and uses docker to build a virtualized environment in order for you to try out the framework. The container starts a jupyter notebook server to provide a nice frontend with visualizations of the synthesized shields.
Tempest and tempestpy are maintained by the Trusted AI group at Graz University of Technology. You can reach out to us via mail.