Accelerating quadratic optimization with reinforcement learning

J. Ichnowski, P. Jain, B. Stellato, G. Banjac, M. Luo, J. Gonzalez, I. Stoica, F. Borrelli and K. Goldberg

in Advances in Neural Information Processing Systems (NeurIPS), December 2021.

  author = {J. Ichnowski and P. Jain and B. Stellato and G. Banjac and M. Luo and J. Gonzalez and I. Stoica and F. Borrelli and K. Goldberg},
  title = {Accelerating quadratic optimization with reinforcement learning},
  booktitle = {Advances in Neural Information Processing Systems (NeurIPS)},
  year = {2021},
  url = {}

First-order methods for quadratic optimization, such as OSQP, are widely used for large-scale machine learning and embedded optimal control, where many related problems must be solved rapidly. These methods face two persistent challenges: manual hyperparameter tuning and long convergence times to high-accuracy solutions. To address these, we explore how Reinforcement Learning (RL) can learn a policy that tunes solver parameters to accelerate convergence. In experiments with well-known QP benchmarks, we find that our RL policy, RLQP, significantly outperforms state-of-the-art QP solvers by up to 3x. RLQP generalizes surprisingly well to previously unseen problems with varying dimension and structure from different applications, including the QPLIB, Netlib LP, and Maros-Mészáros problems.
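To illustrate where a parameter-tuning policy plugs into a first-order QP solver, here is a minimal sketch in plain Python. It is not the RLQP implementation and does not call OSQP. It assumes the tuned parameter is the ADMM step size `rho`, and it uses a toy box-constrained QP with a diagonal cost so that every update has a closed form. The names `admm_box_qp`, `rho_policy`, and `balance_residuals` are hypothetical, and the hand-written residual-balancing rule merely stands in for a learned policy.

```python
def admm_box_qp(p_diag, q, lower, upper, rho_policy, rho=1.0,
                tol=1e-8, max_iter=10000):
    """Minimize 0.5*sum(p_i*x_i^2) + sum(q_i*x_i) s.t. lower <= x <= upper.

    ADMM on the splitting x = z, with an unscaled dual variable y so that
    rho can change between iterations without rescaling y.
    rho_policy(rho, primal_res, dual_res) returns the next step size.
    """
    n = len(q)
    z = [0.0] * n
    y = [0.0] * n
    for k in range(1, max_iter + 1):
        # x-update: closed form because the cost matrix is diagonal (toy assumption).
        x = [(rho * z[i] - y[i] - q[i]) / (p_diag[i] + rho) for i in range(n)]
        # z-update: projection onto the box constraints.
        z_old = z
        z = [min(max(x[i] + y[i] / rho, lower[i]), upper[i]) for i in range(n)]
        # Dual update.
        y = [y[i] + rho * (x[i] - z[i]) for i in range(n)]
        # Infinity-norm primal and dual residuals.
        primal_res = max(abs(x[i] - z[i]) for i in range(n))
        dual_res = rho * max(abs(z[i] - z_old[i]) for i in range(n))
        if primal_res <= tol and dual_res <= tol:
            return z, k
        # The policy observes the solver state and picks the next rho.
        rho = rho_policy(rho, primal_res, dual_res)
    return z, max_iter


def fixed_rho(rho, primal_res, dual_res):
    """Baseline: never change the step size."""
    return rho


def balance_residuals(rho, primal_res, dual_res):
    """Hand-written stand-in for a learned policy (an assumption, not RLQP):
    raise rho when the primal residual dominates, lower it when the dual does."""
    if primal_res > 10.0 * dual_res:
        return min(rho * 2.0, 1e6)
    if dual_res > 10.0 * primal_res:
        return max(rho / 2.0, 1e-6)
    return rho


# Toy problem; the exact solution is the clipped unconstrained minimizer [1, 0].
problem = dict(p_diag=[1.0, 4.0], q=[-2.0, 4.0], lower=[0.0, 0.0], upper=[1.0, 1.0])
z_fixed, iters_fixed = admm_box_qp(rho_policy=fixed_rho, **problem)
z_adapt, iters_adapt = admm_box_qp(rho_policy=balance_residuals, **problem)
print("fixed rho:", z_fixed, iters_fixed, "iterations")
print("adaptive rho:", z_adapt, iters_adapt, "iterations")
```

On this toy problem, adapting `rho` reaches the same solution in fewer iterations than keeping it fixed. A learned policy would replace `balance_residuals` with a function trained to choose `rho` from the solver's internal state.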