ARCqK published in Mathematical Programming

3 minute read

Published: December 15, 2023

I am thrilled to share that the article Scalable adaptive cubic regularization methods has been published in the journal Mathematical Programming, Series A. It has been an exciting journey with my co-authors Jean-Pierre Dussault and Dominique Orban, and I hope this work will help explore the numerical possibilities of ARC methods. The proposed implementation is well suited to large-scale applications: it solves the subproblem inexactly and only requires Hessian-vector products, so there is no need to evaluate or store the Hessian matrix. As usual, the code is written in Julia and is available in the folder paper of the GitHub repository AdaptiveRegularization.jl. The full text of the published version is available from here, enjoy!

Unconstrained Optimization

We consider unconstrained optimization problems of the form $\underset{x \in \mathbb{R}^n}{\text{minimize}} \ f(x)$, where $f$ is twice continuously differentiable and its Hessian is Lipschitz continuous. The aim is to study iterative algorithms that converge to first- or second-order stationary points.
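
For reference, these assumptions and targets can be written out as follows. The symbols $x^*$ and $L$ are notation chosen here for illustration and may differ from the article. The Hessian is Lipschitz continuous if there is a constant $L > 0$ such that

$$\|\nabla^2 f(x) - \nabla^2 f(y)\| \le L \|x - y\| \quad \text{for all } x, y \in \mathbb{R}^n.$$

A point $x^*$ is first-order stationary if $\nabla f(x^*) = 0$, and second-order stationary if, in addition, $\nabla^2 f(x^*) \succeq 0$ (the Hessian is positive semidefinite).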

A classical approach in this context is to use a trust-region method, which is very robust. The Julia package JSOSolvers.jl implements two such methods, TRON and TRUNK. The book Trust Region Methods by Conn, Gould, and Toint is a must-read.

Adaptive Cubic Regularization (ARC)

ARC algorithms, recently explored by Cartis, Gould, and Toint (2011a) and Cartis, Gould, and Toint (2011b), are closely related to trust-region (TR) methods in that steps are computed by solving a sequence of regularized subproblems. A major theoretical appeal of ARC over TR methods is its optimal worst-case complexity. The number of function evaluations required to reach a point $x$ for which $\|\nabla f(x)\| \le \epsilon$ is $O(\epsilon^{-2})$ for TR, which is no better than steepest descent, whereas that number is $O(\epsilon^{-3/2})$ for ARC.
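
To make the relationship concrete, here are the two subproblems in the usual notation of the Cartis, Gould, and Toint papers. The symbols $\Delta_k$ (trust-region radius) and $\sigma_k$ (regularization parameter) are assumptions of this sketch, and the article may parametrize them differently. At iterate $x_k$, a trust-region method computes a step $s$ from

$$\underset{s \in \mathbb{R}^n}{\text{minimize}} \ f(x_k) + \nabla f(x_k)^T s + \tfrac{1}{2} s^T \nabla^2 f(x_k) s \quad \text{subject to} \quad \|s\| \le \Delta_k,$$

whereas ARC replaces the constraint by a cubic penalty:

$$\underset{s \in \mathbb{R}^n}{\text{minimize}} \ f(x_k) + \nabla f(x_k)^T s + \tfrac{1}{2} s^T \nabla^2 f(x_k) s + \tfrac{\sigma_k}{3} \|s\|^3.$$

As an order-of-magnitude illustration of the complexity bounds, ignoring constants: for $\epsilon = 10^{-4}$, $\epsilon^{-2} = 10^{8}$ while $\epsilon^{-3/2} = 10^{6}$.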

How Do We Do It?

The standard approach consists in performing an iterative search for the shift, akin to solving the secular equation in trust-region methods. Such a search requires computing the Cholesky factorization of a tentative shifted Hessian at each iteration, which limits the size of problems that can reasonably be considered. In this article, we propose a scalable implementation of ARC named ARCqK in which we solve a set of shifted systems concurrently by way of an appropriate modification of the Lanczos formulation of the conjugate gradient (CG) method. At each iteration of ARCqK to solve a problem with $n$ variables, a range of $m \ll n$ shift parameters is selected. The CG variant only requires one Hessian-vector product and one dot product per iteration, independently of $m$. Solves corresponding to inadequate shift parameters are interrupted early. All shifted systems are solved inexactly. Such modest cost makes our implementation scalable and appropriate for large-scale problems.
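
To illustrate why the number of Hessian-vector products can be independent of $m$, here is a minimal Python sketch. It is not the Julia code from AdaptiveRegularization.jl and not the CG-Lanczos variant of the article: it uses a plain Lanczos process, builds one Krylov basis, and then solves every shifted system $(H + \lambda I) s = -g$ through a small tridiagonal system. The function names, the diagonal test Hessian, the value of `sigma`, and the rule that picks the shift closest to satisfying $\lambda = \sigma \|s\|$ (the standard optimality condition of the cubic model) are all assumptions made for illustration.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def lanczos(hessvec, g, k):
    """Run at most k Lanczos steps started from -g.

    Returns the basis V, the diagonal and off-diagonal of the tridiagonal
    matrix T, and ||g||. One Hessian-vector product per step."""
    beta0 = math.sqrt(dot(g, g))
    v = [-gi / beta0 for gi in g]
    v_prev = [0.0] * len(g)
    V, alphas, betas = [], [], []
    beta = 0.0
    for _ in range(k):
        V.append(v)
        w = hessvec(v)  # the only access to the Hessian
        alpha = dot(w, v)
        w = [wi - alpha * vi - beta * pi for wi, vi, pi in zip(w, v, v_prev)]
        alphas.append(alpha)
        beta = math.sqrt(dot(w, w))
        if beta < 1e-12:  # Krylov space is exhausted
            break
        betas.append(beta)
        v_prev, v = v, [wi / beta for wi in w]
    return V, alphas, betas[:len(alphas) - 1], beta0

def solve_shifted(V, alphas, betas, beta0, shift):
    """Solve (T + shift*I) y = beta0*e1 by tridiagonal elimination, return V y."""
    k = len(alphas)
    diag = [a + shift for a in alphas]
    rhs = [beta0] + [0.0] * (k - 1)
    for i in range(1, k):  # forward elimination
        m = betas[i - 1] / diag[i - 1]
        diag[i] -= m * betas[i - 1]
        rhs[i] -= m * rhs[i - 1]
    y = [0.0] * k
    y[-1] = rhs[-1] / diag[-1]
    for i in range(k - 2, -1, -1):  # back substitution
        y[i] = (rhs[i] - betas[i] * y[i + 1]) / diag[i]
    return [sum(y[j] * V[j][i] for j in range(k)) for i in range(len(V[0]))]

# Hypothetical test problem: a diagonal, positive definite Hessian.
diag_H = [1.0, 2.0, 3.0, 4.0]
g = [1.0, 1.0, 1.0, 1.0]
calls = {"n": 0}

def hessvec(v):
    calls["n"] += 1  # count Hessian-vector products
    return [d * vi for d, vi in zip(diag_H, v)]

V, alphas, betas, beta0 = lanczos(hessvec, g, k=4)

shifts = [0.1, 1.0, 10.0]  # the m candidate shifts (assumed values)
steps = {lam: solve_shifted(V, alphas, betas, beta0, lam) for lam in shifts}

# Assumed selection rule: keep the shift closest to lambda = sigma * ||s||.
sigma = 1.0
best_shift = min(shifts, key=lambda lam: abs(lam - sigma * math.sqrt(dot(steps[lam], steps[lam]))))
print("Hessian-vector products:", calls["n"], "selected shift:", best_shift)
```

In this sketch the Krylov basis is built once, so adding more shifts only adds cheap tridiagonal solves. Unlike ARCqK as described above, it stores the whole basis, solves each small system exactly, and does not interrupt solves for inadequate shifts.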

Numerical Results

The numerical results on CUTEst test problems are presented as Dolan and Moré performance profiles with respect to elapsed time. They compare:

ARCqK with the implementation of ARC done in Fortran in the library GALAHAD;
ARCqK with a classical Steihaug-Toint trust-region method implemented in Julia with the same scheme, on problems with $\geq 1000$ variables.

More details are given in the article, but overall the results show good prospects for ARCqK, as the algorithm is both fast and robust.
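
For readers unfamiliar with Dolan and Moré performance profiles, the following Python sketch shows how one is computed. For each problem, the cost of every solver is divided by the best cost on that problem, and the profile of a solver at $\tau$ is the fraction of problems whose ratio is at most $\tau$. The function name and the timings are made up for illustration and are not results from the article.

```python
import math

def performance_profile(times, taus):
    """times[p][s] is the positive cost of solver s on problem p
    (math.inf marks a failure). Returns profile[s][i], the fraction of
    problems that solver s solves within a factor taus[i] of the best."""
    n_problems = len(times)
    n_solvers = len(times[0])
    ratios = []
    for row in times:
        best = min(row)  # best cost on this problem
        ratios.append([t / best for t in row])
    return [
        [sum(1 for r in ratios if r[s] <= tau) / n_problems for tau in taus]
        for s in range(n_solvers)
    ]

# Illustrative elapsed times (seconds) for two hypothetical solvers A and B.
times = [
    [1.0, 2.0],
    [3.0, 1.5],
    [2.0, 2.0],
    [math.inf, 4.0],  # solver A fails on the last problem
]
taus = [1.0, 2.0, 4.0]
profile = performance_profile(times, taus)
for name, values in zip(["A", "B"], profile):
    print(name, values)
```

The value at $\tau = 1$ is the fraction of problems on which a solver is the fastest (ties included), and the value for large $\tau$ measures robustness, so a curve that is higher everywhere indicates a solver that is both faster and more robust on the test set.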

Our implementation is designed for large-scale problems, so we expect these results to carry over to even larger problems.


Presenting at JuMP-dev 2024 and Publishing in JuliaCon 2023 Proceedings

3 minute read

Published: August 30, 2024

I’m thrilled to share two major milestones in my recent work within the Julia ecosystem. First, I presented the latest developments in optimization solvers at JuMP-dev 2024, and second, my paper on JSOSuite.jl was accepted in The Proceedings of the JuliaCon Conferences.

These two achievements highlight both the ongoing evolution of JuliaSmoothOptimizers (JSO) and its growing impact on large-scale nonlinear optimization problems.

JuMP-dev 2024: Advancing Nonlinear Optimization with JuliaSmoothOptimizers

This year’s JuMP-dev workshop, held independently from JuliaCon for the first time in Montreal, offered a focused platform for deep dives into JuMP and its surrounding tools. In my presentation, I discussed the latest progress within the JuliaSmoothOptimizers (JSO) ecosystem; my slides and the replay are available.

At the core of my talk was an introduction to new solvers and packages like AdaptiveRegularization.jl, which address the unique challenges of large-scale optimization problems with Adaptive Regularization with Cubics. I emphasized the following key innovations:

Automatic Differentiation (AD) support and integration with JuMP for easier problem modeling.
Memory pre-allocation for in-place solvers, reducing runtime overhead.
Support for multi-precision solvers and GPU-based computations, essential for modern large-scale applications.
The value of factorization-free solvers, which excel in tackling large, complex problems, such as those in discretized PDE-constrained optimization.

For newcomers to JSO, JSOSuite.jl serves as a critical entry point, simplifying solver selection and benchmarking through automatic algorithm matching. This tool eliminates the complexity of choosing from multiple solvers by providing a user-friendly interface that adapts to the problem at hand. My talk also touched on the broader adoption and longevity of JSO, which now spans over 50 registered packages, making it one of the most comprehensive platforms for numerical optimization.

JuliaCon 2023: JSOSuite.jl – Simplifying Continuous Optimization

While JuMP-dev 2024 focused on recent developments, my publication in The Proceedings of the JuliaCon Conferences looks at the core philosophy and implementation behind JSOSuite.jl. Titled JSOSuite.jl: Solving Continuous Optimization Problems with JuliaSmoothOptimizers, the paper introduces JSOSuite.jl as a package designed to bring ease-of-use to complex optimization challenges.

JSOSuite.jl covers a range of problem types—from unconstrained to generally-constrained and least-squares problems—and eliminates the need for users to understand the intricate details of individual solvers. Instead, the package conducts a preliminary analysis of the problem and automatically selects the most appropriate solver, offering significant advantages to both experienced practitioners and newcomers alike.

This paper builds on the innovations within JSO, reinforcing its versatility and ease of use across various fields and applications. The package is a natural fit for researchers who need efficient, reliable solvers without the overhead of manually configuring them for different problem types.

Looking Forward

Both my presentation at JuMP-dev 2024 and the publication of the JSOSuite.jl paper reflect the significant strides made by the JuliaSmoothOptimizers organization over the past year. The JSO ecosystem is positioned to continue driving innovation in the field of numerical optimization.

I’m excited to see how these advancements will be applied across diverse optimization problems in the coming years and look forward to continuing this journey with the JSO community.

New Preprint on HAL: Exploring Projected Dynamical Systems in Geochemical Reactions

2 minute read

Published: July 05, 2024

This project holds a special place in my heart as it touches on the very applications in geochemistry that first drew me into research. Equilibrium reactions, particularly in slow processes like the water cycle in aquifers, have always fascinated me. Moreover, this paper represents an important milestone for one of the authors, Bastien, as it was part of his Ph.D. thesis. The use of projected dynamical systems, a model I am particularly fond of, adds an additional layer of personal significance to this work.

Performance Profile Benchmarking Tool

16 minute read

Published: June 25, 2024

The Dolan-Moré performance profile is a method for comparing the performance of algorithms.

Empowering Research: The Vital Role of Citing Research Software for Reproducibility and Innovation

11 minute read

Published: February 25, 2024

As I started my Ph.D. journey in numerical optimization back in 2014, I noticed something that really stood out to me: despite the abundance of scientific papers discussing algorithms and their numerical results, the availability of corresponding open-source codes lagged far behind.

Tangi Migot