Optimal Shielding in Urban Traffic Control

[Avni et al.] synthesize optimal shields that overwrite the commands of a traffic-light controller modeled in the traffic simulator SUMO. See Figure 1 for an example traffic junction. The optimal shield needs to balance the number of waiting cars per incoming road against the cost of interfering with the traffic-light controller. The example is parametrised with a cut-off parameter lmax that defines the maximal modelled number of waiting cars per road.

A junction modeled in the traffic simulator SUMO depicting incoming cars.
Figure 1: A traffic junction modeled in the simulator SUMO.

Model Specification

We discuss the model given in Figure 2 in detail and describe the different sections of the code. The property used to synthesise the optimal shield is given in Figure 3.

The keyword smg defines the model to be a stochastic multiplayer game.

First, we define the actions controllable by each player with the player...endplayer construct. Square brackets indicate that an action is controlled by the described player. Plain module names add all unlabeled actions of the given module to the player.

The global variable move models the turn-based character of the coalition game. It cycles between the controller, the shield and the environment such that they move in a round-robin manner.

We then declare an undefined constant lmax which serves as the cut-off parameter for the variables N, E, S and W.

smg

player controller
  cont
endplayer

player shield
  [reduceNS_Shield], [reduceEW_Shield], [reduceNS_ShieldDev], [reduceEW_ShieldDev]
endplayer

player environment
  env
endplayer

global move : [0..2];

const int lmax;

global N : [0..lmax];
global E : [0..lmax];
global S : [0..lmax];
global W : [0..lmax];

module env
  [] move=0 -> 1/8 : (N'=min(lmax,N+1)) & (move'=1) +
               1/8 : (S'=min(lmax,S+1)) & (move'=1) +
               1/8 : (E'=min(lmax,E+1)) & (move'=1) +
               1/8 : (W'=min(lmax,W+1)) & (move'=1) +

               7/120 : (N'=min(lmax,N+1)) & (S'=min(lmax,S+1)) & (move'=1) +
               7/120 : (S'=min(lmax,S+1)) & (W'=min(lmax,W+1)) & (move'=1) +
               7/120 : (E'=min(lmax,E+1)) & (N'=min(lmax,N+1)) & (move'=1) +
               7/120 : (W'=min(lmax,W+1)) & (N'=min(lmax,N+1)) & (move'=1) +
               7/120 : (E'=min(lmax,E+1)) & (S'=min(lmax,S+1)) & (move'=1) +
               7/120 : (W'=min(lmax,W+1)) & (S'=min(lmax,S+1)) & (move'=1) +

               3/120 : (E'=min(lmax,E+1)) & (N'=min(lmax,N+1)) & (W'=min(lmax,W+1)) & (move'=1) +
               3/120 : (N'=min(lmax,N+1)) & (S'=min(lmax,S+1)) & (E'=min(lmax,E+1)) & (move'=1) +
               3/120 : (E'=min(lmax,E+1)) & (S'=min(lmax,S+1)) & (W'=min(lmax,W+1)) & (move'=1) +
               3/120 : (W'=min(lmax,W+1)) & (S'=min(lmax,S+1)) & (N'=min(lmax,N+1)) & (move'=1) +

               6/120 : (W'=min(lmax,W+1)) & (S'=min(lmax,S+1)) & (E'=min(lmax,E+1)) & (N'=min(lmax,N+1)) & (move'=1);
endmodule

module cont
  action : [0..1];

  [] move=1 -> (action'=0) & (move'=2);
  [] move=1 -> (action'=1) & (move'=2);
endmodule

module sh
  [reduceNS_Shield] move=2 & action=0    -> (N'=max(0,N-1)) & (S'=max(0,S-1)) & (move'=0);
  [reduceEW_Shield] move=2 & action=1    -> (W'=max(0,W-1)) & (E'=max(0,E-1)) & (move'=0);

  [reduceNS_ShieldDev] move=2 & action=1 -> (N'=max(0,N-1)) & (S'=max(0,S-1)) & (move'=0);
  [reduceEW_ShieldDev] move=2 & action=0 -> (W'=max(0,W-1)) & (E'=max(0,E-1)) & (move'=0);
endmodule

formula diff = pow(pow((N+S)-(E+W),2),0.5);
rewards "difference"
  true : diff;
endrewards

const double lambda = 0.8;
const double interference = 2 * lmax;

rewards "differenceWithInterferenceCost"
  [reduceNS_Shield] true : lambda * diff;
  [reduceEW_Shield] true : lambda * diff;
  [reduceNS_ShieldDev] true : lambda * diff + (1 - lambda) * interference;
  [reduceEW_ShieldDev] true : lambda * diff + (1 - lambda) * interference;
endrewards

Figure 2: PRISM input file for the modeled junction and traffic light system. [raw]

Using the module constructs we define the environment as well as the behaviour of the controller and the shield. In each time step, between one and four cars may enter the simulation on the incoming roads.
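As a quick sanity check, the branch probabilities of the env module must sum to one. The following Python sketch reproduces the probabilities from Figure 2 and also computes the expected number of arriving cars per environment step:

```python
from fractions import Fraction

# Branch probabilities of the env module: four single-car updates,
# six two-car updates, four three-car updates, one four-car update.
branches = (
    [Fraction(1, 8)] * 4      # one car arrives (on N, S, E or W)
    + [Fraction(7, 120)] * 6  # two cars arrive
    + [Fraction(3, 120)] * 4  # three cars arrive
    + [Fraction(6, 120)]      # four cars arrive
)

total = sum(branches)

# Expected number of cars entering the junction per step.
cars_per_branch = [1] * 4 + [2] * 6 + [3] * 4 + [4]
expected_cars = sum(p * n for p, n in zip(branches, cars_per_branch))

print(total)          # 1
print(expected_cars)  # 17/10
```

So on average 1.7 cars enter the junction in every environment move, which the light phases (releasing at most two cars) can only keep up with if the shield balances the queues well.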

The controller may then choose to give green lights to the lanes incoming from either north and south or east and west, modeled by the action variable. The shield may then choose whether or not to alter this decision. Depending on the shield's decision, the variables counting the number of cars per incoming road are decremented.
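A single light phase can be sketched as a pure function in Python (apply_green is a hypothetical helper for illustration, not part of TEMPEST or the model):

```python
def apply_green(N, S, E, W, green_ns):
    """One light phase: the pair of roads that receives a green light
    each releases one car; queues never drop below zero, mirroring the
    max(0, .-1) updates in the sh module."""
    if green_ns:
        return max(0, N - 1), max(0, S - 1), E, W
    return N, S, max(0, E - 1), max(0, W - 1)

print(apply_green(3, 0, 2, 2, green_ns=True))   # (2, 0, 2, 2)
print(apply_green(3, 0, 2, 2, green_ns=False))  # (3, 0, 1, 1)
```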

The shield's goal is to minimize the difference between the numbers of waiting cars for the two possible traffic-light settings. This is captured by the formula diff.
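The pow(pow(x,2),0.5) idiom in Figure 2 computes the absolute value |x|; a minimal Python mirror of the formula makes this explicit:

```python
def diff(N, S, E, W):
    """Imbalance between the two green-light directions:
    |(N + S) - (E + W)|, written as sqrt(x^2) in the PRISM model."""
    x = (N + S) - (E + W)
    return (x ** 2) ** 0.5

print(diff(3, 2, 1, 0))  # 4.0
print(diff(0, 1, 2, 3))  # 4.0
```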

Following the approach of Avni et al. we define two constants, interference and lambda. These are used in the differenceWithInterferenceCost reward structure. Each action the shield decides on is assigned a reward: lambda * diff whenever the shield does not alter the controller's chosen action, and lambda * diff + (1 - lambda) * interference if it does.
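The trade-off between queue imbalance and interference cost can be illustrated in Python (shield_reward is a hypothetical helper; lmax=10 is an illustrative value for the undefined constant):

```python
def shield_reward(N, S, E, W, shield_deviates, lmax=10, lam=0.8):
    """Cost of one shield action, mirroring the
    differenceWithInterferenceCost reward structure."""
    diff = abs((N + S) - (E + W))
    interference = 2 * lmax
    cost = lam * diff
    if shield_deviates:  # shield overwrote the controller's action
        cost += (1 - lam) * interference
    return cost

# With lam=0.8 and lmax=10, deviating adds a flat penalty of
# (1 - lambda) * 2 * lmax = 4 on top of the weighted imbalance.
print(shield_reward(3, 2, 1, 0, shield_deviates=False))
print(shield_reward(3, 2, 1, 0, shield_deviates=True))
```

The shield therefore only pays off when overwriting the controller reduces the long-run imbalance by more than the fixed interference penalty.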

The property in Figure 3 captures the shielding objective to minimize the reward in the long-run behaviour, using the Rmin and LRA operators for the player shield. We also ask TEMPEST to create the optimal shield and save it to tlsShield.shield.

<tlsShield, Optimal> <<shield>> R{"differenceWithInterferenceCost"}min=? [ LRA ]

Figure 3: TEMPEST shielding specification for the optimal shield. [raw]

Results

The comparison of the synthesis times of Avni et al.'s implementation and TEMPEST is shown in Figure 4, revealing a difference of orders of magnitude in favor of TEMPEST.

The plot depicting the synthesis time comparison between the approach from Avni et al. and TEMPEST
Figure 4: Runtime comparison between the approach from Avni et al. and TEMPEST.

References

[Avni et al.] G. Avni, R. Bloem, K. Chatterjee, T. A. Henzinger, B. Könighofer, and S. Pranger. Run-time optimization for learned controllers through quantitative games. In Computer Aided Verification - 31st Int. Conference, CAV 2019, New York City, NY, USA, July 15-18, 2019, Lecture Notes in Computer Science, pages 630-649. Springer, 2019.