SAAC: Safe Reinforcement Learning as an Adversarial Game of Actor-Critics


Although Reinforcement Learning (RL) is effective for sequential decision-making problems under uncertainty, it still struggles in real-world systems where risk or safety is a binding constraint. In this paper, we formulate the RL problem with safety constraints as a non-zero-sum game. This formulation leads to an adversarially guided actor-critic framework (SAAC), in which an adversary tries to break the safety constraint while the RL agent tries to maximize the constrained value function given the adversary’s policy. First, we provide a minimax convergence analysis of our framework for the case of softmax policies. Second, unlike previous approaches, SAAC can address different safety criteria, such as safe exploration, mean-variance risk sensitivity, and CVaR-like coherent risk sensitivity. For each of these variations, we show that the agent learns to differentiate itself from the adversary’s unsafe actions in addition to learning to solve the task. Finally, we demonstrate the effectiveness of SAAC on challenging continuous control tasks, where it learns policies that satisfy the various safety constraints.
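To make two of the safety criteria named above concrete, the following minimal sketch computes a mean-variance risk measure and an empirical CVaR from sampled episode costs. It uses standard textbook definitions only; the function names, the trade-off weight `lam`, the tail level `alpha`, and the use of per-episode cost samples are assumptions made for illustration and are not taken from the SAAC formulation, which may define these quantities differently.

```python
import numpy as np


def mean_variance_risk(costs, lam=1.0):
    """Mean cost plus lam times the (population) variance of the cost.

    `lam` is a hypothetical trade-off weight chosen for illustration.
    """
    costs = np.asarray(costs, dtype=float)
    return costs.mean() + lam * costs.var()


def empirical_cvar(costs, alpha=0.9):
    """Empirical CVaR at level alpha: mean of the costs at or above the
    alpha-quantile (the empirical Value-at-Risk) of the sample.
    """
    costs = np.asarray(costs, dtype=float)
    value_at_risk = np.quantile(costs, alpha)
    tail = costs[costs >= value_at_risk]
    return tail.mean()


if __name__ == "__main__":
    # Synthetic per-episode costs, used only to exercise the functions.
    rng = np.random.default_rng(0)
    sampled_costs = rng.exponential(scale=1.0, size=1000)
    print("mean-variance risk:", mean_variance_risk(sampled_costs, lam=0.5))
    print("CVaR at 0.9:", empirical_cvar(sampled_costs, alpha=0.9))
```

A constraint of the form "risk measure at most some threshold" can then be checked by comparing either value against a chosen budget; CVaR focuses on the worst tail of outcomes, whereas the mean-variance measure penalizes spread in both directions.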

Under Review