FITRE | Assured Autonomy Tools Portal

FITRE

We propose an efficient method for training neural networks in reinforcement learning tasks. The method combines the advantages of trust-region and natural gradient methods by using the natural gradient direction to approximately solve the trust-region subproblems. On the F16 model, it compares favorably with other well-tuned reinforcement learning methods. In particular, it reaches higher reward in fewer iterations, and it stands out even when best-case results are compared.
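The idea of using a natural gradient direction inside a trust-region step can be illustrated with a minimal sketch. This is not the FITRE implementation. It assumes a quadratic model whose curvature is a Fisher-like matrix, a plain Euclidean trust region, and a toy deterministic quadratic loss in place of a reinforcement learning objective. The names `natural_gradient_tr_step`, `minimize`, `fisher_fn`, `damping`, and `eta`, and the radius update constants, are all hypothetical.

```python
import numpy as np


def natural_gradient_tr_step(grad, fisher, radius, damping=1e-8):
    """Approximately solve  min_p  g.p + 0.5 p.F.p  subject to ||p|| <= radius
    by searching only along the natural gradient direction -F^{-1} g.
    (Illustrative assumption: F serves as both metric and model curvature.)"""
    n = grad.shape[0]
    direction = -np.linalg.solve(fisher + damping * np.eye(n), grad)
    norm = np.linalg.norm(direction)
    if norm == 0.0:
        return direction, 0.0
    slope = grad @ direction                    # model slope along the direction
    curvature = direction @ fisher @ direction  # model curvature along the direction
    # Unconstrained minimizer along the direction, then clip to the trust region.
    alpha = -slope / curvature if curvature > 0 else np.inf
    alpha = min(alpha, radius / norm)
    step = alpha * direction
    predicted = -(grad @ step + 0.5 * step @ fisher @ step)
    return step, predicted


def minimize(loss, grad_fn, fisher_fn, theta, radius=1.0, iters=50, eta=0.1):
    """Basic trust-region loop: accept a step when the actual reduction agrees
    well enough with the predicted one, and adapt the radius accordingly."""
    for _ in range(iters):
        g = grad_fn(theta)
        step, predicted = natural_gradient_tr_step(g, fisher_fn(theta), radius)
        if predicted <= 1e-16:  # model predicts no further progress
            break
        rho = (loss(theta) - loss(theta + step)) / predicted
        if rho > eta:
            theta = theta + step
        if rho < 0.25:
            radius *= 0.5
        elif rho > 0.75:
            radius *= 2.0
    return theta


# Toy usage: a convex quadratic, with its Hessian standing in for the Fisher matrix.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 1.0])
theta_star = minimize(
    loss=lambda t: 0.5 * t @ A @ t - b @ t,
    grad_fn=lambda t: A @ t - b,
    fisher_fn=lambda t: A,
    theta=np.zeros(2),
)
```

Restricting the subproblem to a single search direction keeps each step at the cost of one linear solve, while the ratio test still guards against steps where the quadratic model is a poor predictor of the loss.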

Keywords: higher-order optimization, reinforcement learning

Acknowledgements

This work is supported in part by the DARPA Assured Autonomy program.

Contacts

Sudhir Kylasa (Purdue University)

Bing Yuan (Purdue University)

Organization

Purdue University, West Lafayette, Indiana, USA

References

Kylasa, S., Roosta-Khorasani, F., Mahoney, M., & Grama, A. (2018). GPU Accelerated Sub-Sampled Newton’s Method. ArXiv E-Prints, arXiv:1802.09113.

Kylasa, S. B. (2019). Higher Order Optimization Techniques for Machine Learning. Purdue University. http://doi.org/10.25394/PGS.11328545.v1