On the asymptotic behavior of the Douglas-Rachford and proximal-point algorithms for convex optimization

G. Banjac and J. Lygeros

Optimization Letters, 2021.

@article{BL:2021,
  author = {G. Banjac and J. Lygeros},
  title = {On the asymptotic behavior of the {D}ouglas-{R}achford and proximal-point algorithms for convex optimization},
  journal = {Optimization Letters},
  year = {2021},
  url = {https://doi.org/10.1007/s11590-021-01706-3},
  doi = {10.1007/s11590-021-01706-3}
}

The authors in (Banjac et al., 2019) recently showed that the Douglas-Rachford algorithm provides certificates of infeasibility for a class of convex optimization problems. In particular, they showed that the difference between consecutive iterates generated by the algorithm converges to certificates of primal and dual strong infeasibility. Their result was established in a finite-dimensional Euclidean setting and for a particular structure of the constraint set. In this paper, we extend the result to Hilbert spaces and a general nonempty closed convex set. Moreover, we show that the proximal-point algorithm applied to the set of optimality conditions of the problem generates similar infeasibility certificates.
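
As a rough illustration of the behavior described above, the following sketch (not taken from the paper) runs Douglas-Rachford splitting on a strongly infeasible feasibility problem: finding a point in the intersection of two disjoint Euclidean balls. The sets, dimensions, and iteration count are assumptions chosen purely for illustration; the point is that the difference between consecutive iterates settles at the gap vector between the two sets, which plays the role of an infeasibility certificate.

# Illustrative sketch (assumed setup, not the paper's): Douglas-Rachford
# splitting for "find x in C1 ∩ C2" with two disjoint Euclidean balls,
# so the problem is strongly infeasible. The difference of consecutive
# iterates z^{k+1} - z^k approaches the minimal displacement vector.
import numpy as np

def proj_ball(x, center, radius):
    """Euclidean projection onto the ball {x : ||x - center|| <= radius}."""
    d = x - center
    dist = np.linalg.norm(d)
    return x if dist <= radius else center + radius * d / dist

# Two disjoint balls: the distance between centers (4) exceeds the sum of radii (2).
c1, r1 = np.array([0.0, 0.0]), 1.0
c2, r2 = np.array([4.0, 0.0]), 1.0

z = np.zeros(2)
for k in range(500):
    x = proj_ball(z, c1, r1)          # prox of the indicator of C1
    y = proj_ball(2 * x - z, c2, r2)  # prox of the indicator of C2 at the reflected point
    z_next = z + y - x                # Douglas-Rachford update
    delta = z_next - z                # difference of consecutive iterates
    z = z_next

print(delta)  # approx. [2, 0]: the gap vector between the balls

In this toy problem the difference of consecutive iterates converges to (2, 0), the shortest vector separating the two balls, mirroring the certificate of strong infeasibility described in the abstract.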

Efficient performance bounds for primal-dual reinforcement learning from demonstrations

A. Kamoutsi, G. Banjac and J. Lygeros

in International Conference on Machine Learning (ICML), July 2021.

@inproceedings{KBL:2021,
  author = {A. Kamoutsi and G. Banjac and J. Lygeros},
  title = {Efficient performance bounds for primal-dual reinforcement learning from demonstrations},
  booktitle = {International Conference on Machine Learning (ICML)},
  year = {2021}
}

We consider large-scale Markov decision processes with an unknown cost function and address the problem of learning a policy from a finite set of expert demonstrations. We assume that the learner is not allowed to interact with the expert and has no access to a reinforcement signal of any kind. Existing inverse reinforcement learning methods come with strong theoretical guarantees but are computationally expensive, while state-of-the-art policy optimization algorithms achieve significant empirical success but are hampered by limited theoretical understanding. To bridge the gap between theory and practice, we introduce a novel bilinear saddle-point framework using Lagrangian duality. The proposed primal-dual viewpoint allows us to develop a model-free, provably efficient algorithm through the lens of stochastic convex optimization. The method enjoys simplicity of implementation, low memory requirements, and computational and sample complexities that are independent of the number of states.
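
To make the primal-dual viewpoint concrete, the sketch below (not the paper's algorithm) applies stochastic mirror descent-ascent to a generic bilinear saddle-point problem over probability simplices. The matrix A, the noise model, the step size, and the horizon are assumptions made only for illustration; they stand in for the Lagrangian saddle point considered in the paper.

# Illustrative sketch (assumed setup, not the paper's algorithm):
# stochastic primal-dual mirror descent-ascent for
#     min_{x in simplex} max_{y in simplex} x^T A y,
# using only noisy observations of A, loosely in the spirit of a
# model-free primal-dual method.
import numpy as np

rng = np.random.default_rng(0)
n, m = 50, 30
A = rng.normal(size=(n, m))   # illustrative payoff matrix

x = np.ones(n) / n            # primal iterate on the probability simplex
y = np.ones(m) / m            # dual iterate on the probability simplex
x_avg, y_avg = np.zeros(n), np.zeros(m)
eta = 0.05                    # step size (illustrative choice)
T = 5000

for t in range(T):
    A_hat = A + rng.normal(scale=0.1, size=(n, m))  # unbiased noisy estimate of A
    g_x = A_hat @ y           # stochastic gradient of x^T A y w.r.t. x
    g_y = A_hat.T @ x         # stochastic gradient w.r.t. y
    # Exponentiated-gradient (mirror) updates keep the iterates on the simplex.
    x = x * np.exp(-eta * g_x)  # descent (minimization) step
    x /= x.sum()                # renormalize onto the simplex
    y = y * np.exp(eta * g_y)   # ascent (maximization) step
    y /= y.sum()
    x_avg += x / T
    y_avg += y / T

# Duality gap of the averaged iterates; it shrinks as T grows.
gap = (A.T @ x_avg).max() - (A @ y_avg).min()
print(f"approximate duality gap: {gap:.3f}")

Averaging the iterates and reporting the duality gap mirrors how such stochastic primal-dual methods are typically analyzed; for problems of this type the gap of the averaged iterates decreases at the usual O(1/sqrt(T)) rate.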