J. Ichnowski, P. Jain, B. Stellato, G. Banjac, M. Luo, J. Gonzalez, I. Stoica, F. Borrelli and K. Goldberg
in Advances in Neural Information Processing Systems (NeurIPS), December 2021.

@inproceedings{Ichnowski2021:NeurIPS,
  author    = {J. Ichnowski and P. Jain and B. Stellato and G. Banjac and M. Luo and J. Gonzalez and I. Stoica and F. Borrelli and K. Goldberg},
  title     = {Accelerating quadratic optimization with reinforcement learning},
  booktitle = {Advances in Neural Information Processing Systems (NeurIPS)},
  year      = {2021},
  url       = {https://papers.nips.cc/paper/2021/file/afdec7005cc9f14302cd0474fd0f3c96-Paper.pdf}
}
First-order methods for quadratic optimization, such as OSQP, are widely used in large-scale machine learning and embedded optimal control, where many related problems must be solved rapidly. These methods face two persistent challenges: manual hyperparameter tuning and slow convergence to high-accuracy solutions. To address both, we explore how reinforcement learning (RL) can learn a policy that tunes the solver's internal parameters to accelerate convergence. In experiments on well-known QP benchmarks, we find that our RL policy, RLQP, significantly outperforms state-of-the-art QP solvers, solving problems up to 3x faster. RLQP generalizes surprisingly well to previously unseen problems of varying dimension and structure from different applications, including the QPLIB, Netlib LP, and Maros–Mészáros problem sets.
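The adaptation loop underlying this idea can be sketched with OSQP's Python interface: run the solver for a bounded number of ADMM iterations, feed the resulting primal and dual residuals to a policy that proposes a new step-size rho, update the setting, and warm-start from the current iterate. This is a minimal illustrative sketch, not the paper's implementation: the policy_rho function is a hypothetical stand-in (a residual-balancing heuristic in the spirit of OSQP's built-in adaptive-rho rule) for the learned RLQP policy, and the problem data and constants are made up for the example.

```python
import numpy as np
import scipy.sparse as sparse
import osqp

# Tiny random QP: minimize 0.5 x'Px + q'x  subject to  l <= Ax <= u
rng = np.random.default_rng(0)
n, m = 10, 20
M = sparse.random(n, n, density=0.5, random_state=0)
P = sparse.csc_matrix(M @ M.T + sparse.eye(n))  # positive semidefinite
q = rng.standard_normal(n)
A = sparse.csc_matrix(sparse.random(m, n, density=0.5, random_state=1))
l = -rng.random(m)
u = rng.random(m)

def policy_rho(rho, pri_res, dua_res):
    """Hypothetical stand-in for the learned policy: rescale rho to
    balance the primal and dual residuals, clamped to a safe range."""
    new_rho = rho * np.sqrt(pri_res / max(dua_res, 1e-12))
    return float(np.clip(new_rho, 1e-6, 1e6))

rho = 0.1
prob = osqp.OSQP()
# Disable OSQP's internal rho adaptation so the external policy is in control,
# and cap the iterations per solve so the policy acts between chunks.
prob.setup(P, q, A, l, u, rho=rho, adaptive_rho=False,
           max_iter=25, verbose=False)

for step in range(20):                       # outer adaptation loop
    res = prob.solve()                       # run one bounded ADMM chunk
    if res.info.status == 'solved':
        break
    rho = policy_rho(rho, res.info.pri_res, res.info.dua_res)
    prob.update_settings(rho=rho)            # apply the policy's new rho
    prob.warm_start(x=res.x, y=res.y)        # resume from the current iterate

print(step, res.info.status, rho)
```

In RLQP, the hand-coded heuristic above is replaced by a policy trained with RL to map solver state (residuals and iterate information) to rho updates, which is where the reported speedups come from.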