Published: Sept. 17, 2019

Ashutosh Trivedi

Department of Computer Science, University of Colorado Boulder

Reinforcement Learning and Formal Requirements

Reinforcement learning is an approach to controller synthesis where agents rely on reward signals to choose actions in order to satisfy the requirements implicit in reward signals. Oftentimes non-experts have to come up with the requirements and their translation to rewards under significant time pressure, even though manual translation is time-consuming and error-prone. For safety-critical applications of reinforcement learning, a rigorous design methodology is needed and, in particular, a principled approach to requirement specification and to the translation of objectives into the form required by reinforcement learning algorithms.

Formal logic provides a foundation for the rigorous and unambiguous requirement specification of learning objectives. However, reinforcement learning algorithms require requirements to be expressed as scalar reward signals. We discuss a recent technique, called limit-reachability, that bridges this gap by faithfully translating logic-based requirements into the scalar reward form needed in model-free reinforcement learning. This technique enables the synthesis of controllers that maximize the probability to satisfy given logical requirements using off-the-shelf, model-free reinforcement learning algorithms.

 

Speaker Bio: Ashutosh Trivedi is an assistant professor in the Department of Computer Science at the University of Colorado Boulder. He is affiliated with the Programming Languages and Verification Group and the Theory Group at the University of Colorado Boulder.His research focuses on applying rigorous mathematical reasoning techniques to design and analyze safe and secure cyber-physical systems(CPS) with guaranteed performance. He investigates foundational issues (decidability and complexity) related to modeling and analysis of CPS as well as practically focused tools that can be used by practitioners to analyze large systems at scale.