REVEL | Assured Autonomy Tools Portal

REVEL

Revel is a partially neural reinforcement learning (RL) framework for provably safe exploration in continuous state and action spaces. A key challenge for provably safe deep RL is that repeatedly verifying neural networks within a learning loop is computationally infeasible. We address this challenge using two policy classes: a general, neurosymbolic class with approximate gradients and a more restricted class of symbolic policies that allows efficient verification. Our learning algorithm is a mirror descent over policies: in each iteration, it safely lifts a symbolic policy into the neurosymbolic space, performs safe gradient updates to the resulting policy, and projects the updated policy into the safe symbolic subset, all without requiring explicit verification of neural networks. Our empirical results show that Revel enforces safe exploration in many scenarios in which Constrained Policy Optimization does not, and that it can discover policies that outperform those learned through prior approaches to verified exploration.
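
The lift, update, project loop described above can be illustrated with a deliberately tiny sketch. This is not Revel's implementation: it assumes a hypothetical one-dimensional system, replaces the neural network with a single tanh term, uses a finite-difference gradient as a stand-in for approximate gradients, and treats an interval of linear gains as the "already verified" symbolic class. All names and constants (X_MAX, K_SAFE, lift, update, project, train) are made up for illustration.

```python
import math

X_MAX = 1.0           # assumed safe region: |x| <= X_MAX
K_SAFE = (0.1, 1.9)   # assumed pre-verified gains: |1 - k| < 1, so |x| never grows
STARTS = [-0.9, -0.5, 0.5, 0.9]
SAMPLES = [i / 10 for i in range(-10, 11) if i != 0]


def step(x, a):
    """Toy dynamics: the action is added to the state."""
    return x + a


def symbolic(k):
    """Symbolic policy class: a linear controller a = -k * x."""
    return lambda x: -k * x


def lift(k, w):
    """Lift: wrap a learned proposal in a shield that falls back to the symbolic policy."""
    g = symbolic(k)

    def h(x):
        a = g(x) + w * math.tanh(x)  # stand-in for a neural network's output
        # Use the proposal only if the next state stays in the safe region.
        return a if abs(step(x, a)) <= X_MAX else g(x)

    return h


def rollout_cost(policy, horizon=20):
    """Return (sum of squared states, largest |x| visited) over all start states."""
    total, worst = 0.0, 0.0
    for x in STARTS:
        for _ in range(horizon):
            x = step(x, policy(x))
            total += x * x
            worst = max(worst, abs(x))
    return total, worst


def update(k, lr=0.02, eps=1e-3):
    """Update: one finite-difference gradient step on w, starting from w = 0."""
    c_hi, worst_hi = rollout_cost(lift(k, eps))
    c_lo, worst_lo = rollout_cost(lift(k, -eps))
    w = -lr * (c_hi - c_lo) / (2 * eps)
    return w, max(worst_hi, worst_lo)


def project(h):
    """Project: least-squares fit of a = -k * x to h, clipped to the safe gains."""
    k = sum(-h(x) * x for x in SAMPLES) / sum(x * x for x in SAMPLES)
    return min(max(k, K_SAFE[0]), K_SAFE[1])


def train(k=0.2, iters=10):
    """Repeat lift -> update -> project; also report the largest |x| seen in rollouts."""
    worst = 0.0
    for _ in range(iters):
        w, seen = update(k)
        worst = max(worst, seen)
        k = project(lift(k, w))
    return k, worst
```

Calling `train()` returns the final gain and the largest state magnitude seen in any rollout. In this toy, every rollout runs through the shielded policy, so exploration stays inside the assumed safe region without checking the learned term itself; the shield only evaluates a one-step prediction and otherwise defers to the symbolic controller.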

Keywords: Neurosymbolic RL, verification, safe exploration

Acknowledgements

This work is supported in part by the DARPA Assured Autonomy program.

Contacts

Swarat Chaudhuri

Organization

University of Texas, Austin, TX, USA

Contributors

Swarat Chaudhuri

References

Anderson, G., Verma, A., Dillig, I., & Chaudhuri, S. (2020). Neurosymbolic Reinforcement Learning with Formally Verified Exploration. In Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, NeurIPS 2020, December 6-12, 2020, virtual. Retrieved from https://proceedings.neurips.cc/paper/2020/hash/448d5eda79895153938a8431919f4c9f-Abstract.html