2019/12/07-2019/12/09
On the convergence to stationary points of deterministic and randomized feasible descent directions methods
Amir Beck
Tel-Aviv University
We study the class of nonsmooth nonconvex problems in which the objective is to minimize the difference between a continuously differentiable function (possibly nonconvex) and a convex (possibly nonsmooth) function over a convex polytope. This general class contains many types of problems, including difference of convex functions (DC) problems, and as such, can be used to model a variety of applications. Our goal is to attain a point satisfying the stationarity necessary condition, defined as the lack of feasible descent directions. In this work we prove that stationarity in our model can be characterized by a finite number of directions that positively span the set of feasible directions. We then use this characterization to develop two methods, one deterministic and one randomized, whose accumulation points are proved to be stationary points. A supplementary discussion of positively spanning sets, and of corresponding methods for obtaining members of such sets, is also presented. Numerical experiments illustrate the appeal of obtaining a stationary point and the advantage of using the randomized method to do so.
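For orientation, one way to write the stationarity condition described above (a hedged sketch for a model instance of the class; the precise formulation is that of the talk): with $g$ continuously differentiable, $h$ convex, and $X$ a convex polytope, consider
\[
\min_{x \in X} \; \varphi(x) := g(x) - h(x).
\]
A point $x^* \in X$ is stationary (admits no feasible descent direction) when
\[
\varphi'(x^*; d) \;=\; \langle \nabla g(x^*), d \rangle - h'(x^*; d) \;\ge\; 0
\quad \text{for every } d \text{ in the cone of feasible directions of } X \text{ at } x^*,
\]
and the characterization in the talk shows that it suffices to test a finite set of directions that positively spans this cone.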
The generalized Bregman distance
Regina Burachik
University of South Australia
Recently, a new distance has been introduced for the graphs of two point-to-set operators, one of which is maximally monotone. When both operators are the subdifferential of a proper lower semicontinuous convex function, this distance specializes under modest assumptions to the classical Bregman distance. We name this new distance the generalized Bregman distance, and we shed light on it with examples that utilize the other two most natural representative functions: the Fitzpatrick function and its conjugate. We provide sufficient conditions for convexity, coercivity, and supercoercivity: properties which are essential for implementation in proximal point type algorithms. We establish these results for both the left and right variants of this new distance. We construct examples closely related to the Kullback–Leibler divergence, which was previously considered in the context of Bregman distances, and whose importance in information theory is well known. In so doing, we demonstrate how to compute a difficult Fitzpatrick conjugate function, and we discover natural occurrences of the Lambert W function, whose importance in optimization is of growing interest.
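For reference, the classical (left) Bregman distance generated by a differentiable convex function $f$, to which the new distance specializes, is
\[
D_f(x,y) \;=\; f(x) - f(y) - \langle \nabla f(y),\, x - y \rangle,
\]
and the negative entropy $f(x) = \sum_i x_i \log x_i$ recovers the Kullback–Leibler divergence
\[
D_f(x,y) \;=\; \sum_i \Big( x_i \log \tfrac{x_i}{y_i} - x_i + y_i \Big).
\]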
Quadratic convergence of SQP-like methods for convex-composite optimization
James Burke
University of Washington
We discuss the local convergence theory of Newton and quasi-Newton methods for convex-composite optimization: minimize f(x) = h(c(x)), where h is a closed proper convex function, possibly taking infinite values, and c is smooth. Our focus is on the case where h is a (possibly infinite-valued) convex piecewise linear-quadratic function. The optimality conditions are embedded into a generalized equation, and we show strong metric subregularity and strong metric regularity of the corresponding set-valued mappings. With these tools, we extend the classical convergence theory for Newton-type methods to this broader class of problems.
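As background, and only as the standard template rather than the precise scheme of the talk, Newton-type methods for this convex-composite model generate steps from the partially linearized subproblem
\[
x_{k+1} \;\in\; \operatorname*{arg\,min}_{x} \; h\big(c(x_k) + \nabla c(x_k)(x - x_k)\big) \;+\; \tfrac12 \langle H_k (x - x_k),\, x - x_k \rangle,
\]
where $H_k$ is either a positive multiple of the identity (prox-linear/Gauss–Newton type) or an approximation of the Hessian of the Lagrangian (Newton/quasi-Newton, SQP type).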
The subdifferential of measurable composite nonsmooth nonconvex functions and smoothing approximation
Xiaojun Chen
Hong Kong Polytechnic University
The subdifferential calculus for the expectation of nonsmooth nonconvex random integrands involves many fundamental and challenging problems in stochastic optimization. It is known that, for convex integrands, the Clarke subdifferential of the expectation equals the expectation of the Clarke subdifferential of the integrand. However, little is known about the calculation of Clarke subgradients for the expectation of nonsmooth nonconvex integrands. This talk presents smoothing approximations that make it possible to compute Clarke subgradients of the expectation of nonsmooth nonconvex random integrands. A framework for how to proceed along this path is developed and then applied to a class of measurable composite max integrands. This class contains integrands from stochastic complementarity problems as well as stochastic optimization problems arising in statistical learning.
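A minimal numerical sketch of this idea under assumptions of my own: a max-type integrand $f(x,\xi) = \max\{0, \langle \xi, x\rangle - 1\}$, the smoothing $\max\{0,t\} \approx \mu \log(1 + e^{t/\mu})$, and a sample-average approximation of the expectation; the smoothing functions and the subdifferential theory in the talk are considerably more general.

```python
import numpy as np
from scipy.special import expit  # numerically stable sigmoid

def smooth_max0(t, mu):
    """Smoothing of t -> max(0, t): mu*log(1 + exp(t/mu)), which tends to max(0, t) as mu -> 0."""
    return mu * np.logaddexp(0.0, t / mu)

def sample_average_gradient(x, xis, mu):
    """Monte Carlo estimate of the gradient of E[ smooth_max0(<xi, x> - 1, mu) ]."""
    t = xis @ x - 1.0
    # chain rule: the gradient of the smoothed integrand in x is sigmoid(t/mu) * xi
    return (expit(t / mu)[:, None] * xis).mean(axis=0)

rng = np.random.default_rng(0)
xis = rng.normal(size=(10_000, 5))   # samples of the random vector xi
x = np.ones(5)
print(sample_average_gradient(x, xis, mu=1e-2))  # approximates a Clarke subgradient of the expectation
```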
Critical points for Lipschitz functions
Aris Daniilidis
University of Chile
I will survey results concerning the size of Clarke critical values for special subclasses of Lipschitz continuous functions as opposed to the generic pathological situation of subdifferential saturation. At the end of the talk I will give examples showing that standard first-order methods may fail to detect critical points or to provide a descent scheme for the nonsmooth, nonconvex case.
On a semismooth* Newton method for generalized equations
Helmut Gfrerer
Johannes Kepler University Linz
In this talk we introduce a new notion of semismoothness*, which pertains to both sets and multifunctions. In the case of single-valued maps it is closely related to the standard notion of semismoothness introduced by Qi and Sun in 1993. Semismoothness* can be equivalently characterized in terms of regular, limiting and directional limiting coderivatives. We then present a semismooth* Newton method for solving inclusions and generalized equations (GEs), where the linearization concerns both the single-valued and the multi-valued part and is performed on the basis of the respective coderivatives. Two conceptual algorithms will be presented and shown to converge locally superlinearly under relatively weak assumptions.
The talk is based on joint work with J. Outrata.
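For context, the generalized equations in question have the form $0 \in f(x) + Q(x)$, with $f$ single-valued and smooth and $Q$ a multifunction. The classical Josephy–Newton scheme, shown here only for comparison, linearizes the single-valued part alone,
\[
0 \;\in\; f(x_k) + \nabla f(x_k)(x_{k+1} - x_k) + Q(x_{k+1}),
\]
whereas the semismooth* Newton method of the talk also approximates the multi-valued part, using the coderivatives mentioned above.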
Convergence of an inexact ADMM for separable convex optimization
William Hager
University of Florida
Inexact alternating direction multiplier methods (ADMMs) are developed for solving general separable convex optimization problems with a linear constraint and with an objective that is the sum of smooth and nonsmooth terms. The approach involves linearized subproblems, a back substitution step, and either gradient or accelerated gradient techniques. Global convergence is established and convergence rate bounds are obtained for both ergodic and non-ergodic iterates.
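For orientation, here is a minimal sketch of the classical exact two-block ADMM template for $\min f(x) + g(z)$ subject to $Ax + Bz = b$, in scaled dual form with a fixed penalty parameter rho absorbed into the subproblem oracles. The oracles themselves are assumptions of this sketch; the methods of the talk replace them with linearized or accelerated inexact steps plus a back substitution.

```python
import numpy as np

def admm(argmin_x, argmin_z, A, B, b, iters=200):
    """Classical two-block ADMM for  min f(x) + g(z)  s.t.  A x + B z = b.

    argmin_x(z, u): minimizer over x of f(x) + (rho/2)*||A x + B z - b + u||^2
    argmin_z(x, u): minimizer over z of g(z) + (rho/2)*||A x + B z - b + u||^2
    where u is the scaled dual variable (multiplier divided by rho).
    """
    x, z, u = np.zeros(A.shape[1]), np.zeros(B.shape[1]), np.zeros(b.shape[0])
    for _ in range(iters):
        x = argmin_x(z, u)            # x-subproblem
        z = argmin_z(x, u)            # z-subproblem
        u = u + A @ x + B @ z - b     # scaled multiplier update
    return x, z, u
```

For instance, with $f = \tfrac12\|Cx-d\|^2$, $g = \lambda\|z\|_1$, $A = I$, $B = -I$ and $b = 0$, the x-step is a linear solve and the z-step is soft-thresholding.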
Constraint splitting and projection methods for optimal control
Yalcin Kaya
University of South Australia
We consider a class of optimal control problems with a constrained control variable. We split the ODE constraint and the control constraint of the problem so as to obtain two optimal control subproblems, each of which admits a simply expressed solution. Employing these simpler solutions as projections, we find numerical solutions to the original problem by applying four different projection-type methods: (i) Dykstra’s algorithm, (ii) the Douglas–Rachford (DR) method, (iii) the Aragón Artacho–Campoy (AAC) algorithm and (iv) the fast iterative shrinkage-thresholding algorithm (FISTA). To the knowledge of the authors, these kinds of (projection) methods have not previously been applied to continuous-time optimal control problems, which are infinite-dimensional optimization problems; the problem we study is posed in infinite-dimensional Hilbert spaces. The behaviour of the DR and AAC algorithms is explored via numerical experiments with respect to their parameters. An error analysis is also carried out numerically for a particular instance of the problem for each of the algorithms.
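For reference, the Douglas–Rachford iteration for two sets $A$ and $B$ with projectors $P_A$, $P_B$ reads, in its basic relaxed form (the talk applies the corresponding splitting in an infinite-dimensional Hilbert-space setting),
\[
x_{k+1} \;=\; x_k + \lambda_k\big( P_B(2P_A x_k - x_k) - P_A x_k \big), \qquad \lambda_k \in (0,2),
\]
and in the convex, consistent case the shadow sequence $P_A x_k$ approaches $A \cap B$.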
Gibbs measures in bilevel optimization problems
Lina Mallozzi
University of Naples Federico II
An approximation scheme for a bilevel optimization problem is presented using Gibbs measures. The existence of solutions and Gamma-convergence results are proved for both the optimistic and the pessimistic bilevel problems. Some applications from economics are discussed.
Joint work with G. Carlier (Université Paris Dauphine)
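In hedged generic terms (the exact scheme, spaces and assumptions are those of the talk), the approximation replaces the lower-level best response with the Gibbs measure
\[
\pi_\varepsilon(\mathrm{d}y \mid x) \;=\; \frac{e^{-f(x,y)/\varepsilon}\,\mathrm{d}y}{\int e^{-f(x,y')/\varepsilon}\,\mathrm{d}y'},
\]
which concentrates on $\arg\min_y f(x,y)$ as $\varepsilon \downarrow 0$, so that the upper level optimizes against this smoothed response and Gamma-convergence provides the framework for passing to the limit.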
Superlinear convergence of the SQP method under parabolic regularity
Boris Mordukhovich
Wayne State University
This talk pursues a two-fold goal. Firstly, we aim to derive novel second-order characterizations of important robust stability properties of perturbed Karush-Kuhn-Tucker systems for a broad class of constrained optimization problems generated by parabolically regular sets. Secondly, the obtained characterizations are applied to establish well-posedness and superlinear convergence of the basic sequential quadratic programming method to solve parabolically regular constrained optimization problems.
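As a reminder of the method in question, stated here for a generic problem $\min\,\varphi(x)$ subject to $g(x) \in \Theta$ with Lagrangian $L(x,\lambda) = \varphi(x) + \langle \lambda, g(x)\rangle$ (the talk treats the broader parabolically regular setting), the basic SQP iteration computes the step $d_k$ and the new multiplier from
\[
\min_{d} \;\; \langle \nabla \varphi(x_k), d \rangle + \tfrac12 \big\langle \nabla^2_{xx} L(x_k,\lambda_k)\, d,\, d \big\rangle
\quad \text{subject to} \quad g(x_k) + \nabla g(x_k) d \in \Theta .
\]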
Regularizations for bilevel variational problems
Jacqueline Morgan
University of Naples Federico II
The increasing interest in bilevel problems, driven by the use of game theory in many economic and engineering applications, has led to the investigation of so-called bilevel variational problems, in which a variational problem (e.g. an optimization problem, a variational or a quasi-variational inequality) has to be solved at each level. Different solution concepts will be illustrated, and two types of regularization methods, useful also from a numerical point of view, will be presented. The first obviates the lack of existence and/or stability of solutions by using approximate solutions to the parametric variational problem of the lower level. The second overcomes the possible non-uniqueness of solutions to the parametric lower-level variational problem and leads to a constructive method for selecting among them.
A Variational Approach to Second-Order Optimality
Tyrrell Rockafellar
University of Washington
Conditions associated with local optimality, whether necessary or sufficient, have traditionally been approached through techniques of generalized differentiation. On the first-order level, this has been a long-standing success, although serious challenges remain for equilibrium constraints and the like. On the second-order level, a difficulty arises from the rather complex concepts of generalized second derivatives and the sometimes inadequate calculus for determining them. In fact, sufficient second-order conditions of a practical sort, which are the most important aids for numerical methodology, are largely lacking outside of classical frameworks like nonlinear programming.
In nonlinear programming, well-known conditions such as the so-called strong second-order optimality condition are tied to a convexity-concavity property of an augmented Lagrangian function. It turns out that this pattern can be developed in vastly larger territory, beyond nonlinear programming, by exploiting the recently introduced concept of variational convexity.
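For concreteness, written only for a standard nonlinear program $\min f(x)$ subject to $g_i(x) \le 0$ and $h_j(x) = 0$ (the talk's development goes well beyond this setting), one common form of the augmented Lagrangian in question is
\[
L_r(x,y,z) \;=\; f(x) + \sum_j \Big( z_j h_j(x) + \tfrac{r}{2} h_j(x)^2 \Big)
+ \frac{1}{2r} \sum_i \Big( \max\{0,\, y_i + r\, g_i(x)\}^2 - y_i^2 \Big),
\]
and the convexity-concavity property mentioned above refers to local convexity in $x$ and concavity in the multipliers $(y,z)$ near a solution-multiplier pair.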
Lagrangian-based methods for nonconvex optimization problems
Shoham Sabach
Israel Institute of Technology
Lagrangian-based methods have a long history, and recently there has been intense renewed interest in these methods, in particular within the Alternating Direction of Multipliers (ADM) scheme. The recent literature on Lagrangian-based methods in general, and ADM in particular, is voluminous in the setting of convex problems. However, the situation in the nonconvex setting is far from well understood, and analyses of Lagrangian-based methods in the nonconvex setting remain scarce and challenging. In this talk, some recent work on the analysis and applications of Lagrangian-based methods in the nonconvex setting will be presented.
First order methods with smooth adaptable functions: analysis and applications
Marc Teboulle
Tel-Aviv University
We consider composite minimization problems, where the differentiable part of the objective fails to satisfy the standard global Lipschitz gradient property, a common and restrictive assumption used in almost all first order methods (FOM). To better capture the geometry of the given problem, we introduce the class of smooth adaptable functions. It naturally translates into a non-Euclidean descent-like lemma and leads to non-Euclidean proximal schemes based on Bregman distances. This simple framework lays the ground for many new perspectives on the convergence analysis and applications of FOM, and this talk will present some of these recent developments and results.
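For reference, with $D_h$ the Bregman distance of a kernel $h$, the smooth adaptable (relative smoothness) condition requires $Lh - f$ to be convex, which yields the non-Euclidean descent lemma
\[
f(x) \;\le\; f(y) + \langle \nabla f(y),\, x - y \rangle + L\, D_h(x,y)
\]
on the relevant domain, and the associated Bregman proximal gradient step for a composite objective $f + g$ is
\[
x_{k+1} \;\in\; \operatorname*{arg\,min}_{x} \Big\{ g(x) + \langle \nabla f(x_k),\, x - x_k \rangle + \tfrac{1}{t}\, D_h(x, x_k) \Big\}, \qquad 0 < t \le 1/L .
\]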
Group sparse optimization via lower order regularization
Xiaoqi Yang
Hong Kong Polytechnic University
In this paper, we investigate a group sparse optimization problem via lower order regularization in three aspects: theory, algorithm and application. In the theoretical aspect, by introducing a notion of group restricted eigenvalue condition, we establish an oracle property and a global recovery bound for any point in a level set of the regularization problem, and by virtue of modern variational analysis techniques, we also provide a local analysis of the recovery bound for a path of local minima. In the algorithmic aspect, we apply the well-known proximal gradient method to solve the lower order regularization problems, either by analytically solving some specified lower order regularization subproblems or by using the Newton method to solve general lower order regularization subproblems. In particular, we establish a local linear convergence rate of the proximal gradient method for solving the regularization problem under some mild conditions, after first proving a second-order growth condition. Finally, in the aspect of application, we present some numerical results on both simulated data and real data from gene transcriptional regulation.
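A minimal sketch of the proximal gradient template referred to above, with the group-wise proximal map supplied as an oracle; for illustration the oracle shown is group soft-thresholding, i.e. the $\ell_{2,1}$ (group Lasso) case, and the closed-form proximal maps of the lower order ($q<1$) penalties studied in the talk would be substituted in its place. The data-fitting term and dimensions below are hypothetical.

```python
import numpy as np

def prox_group_l21(v, groups, tau):
    """Group soft-thresholding: proximal map of tau * sum_g ||v_g||_2 (the q = 1 case)."""
    out = v.copy()
    for g in groups:
        norm = np.linalg.norm(v[g])
        out[g] = 0.0 if norm <= tau else (1.0 - tau / norm) * v[g]
    return out

def proximal_gradient(grad_f, prox, x0, step, lam, iters=500):
    """Proximal gradient method for  min_x f(x) + lam * R(x)."""
    x = x0.copy()
    for _ in range(iters):
        x = prox(x - step * grad_f(x), step * lam)   # forward (gradient) then backward (prox) step
    return x

# Hypothetical least-squares fit: f(x) = 0.5 * ||A x - b||^2 with three groups of size 4.
rng = np.random.default_rng(1)
A, b = rng.normal(size=(40, 12)), rng.normal(size=40)
groups = [slice(0, 4), slice(4, 8), slice(8, 12)]
step = 1.0 / np.linalg.norm(A, 2) ** 2               # 1 / Lipschitz constant of grad f
x_hat = proximal_gradient(lambda x: A.T @ (A @ x - b),
                          lambda v, t: prox_group_l21(v, groups, t),
                          np.zeros(12), step, lam=0.5)
print(x_hat)
```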
Primal perturbation analysis of error bounds for vector-valued conic inequalities
Xiyin Zheng
Yunnan University
In contrast to many existing results on error bounds, which are established via dual notions such as normal cones and subdifferentials, this paper provides primal criteria for the stability of the error bound of a vector-valued function $F$ at a point $\bar x$ with respect to a closed convex cone $K$. Using the Slater condition for the Bouligand (respectively, Clarke) tangent epiderivative $\widehat D_KF(\bar x)$ (respectively, $D_KF(\bar x)$) with respect to the ordering cone $K$, several sufficient conditions are provided for $F$ to have a stable error bound at $\bar x$ with respect to $K$ when it undergoes small perturbations. In the composite-convexity case, these conditions are also proved to be necessary. Moreover, we give some quantitative formulae for the error bound moduli.
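For the reader's convenience, one standard formalization of the property in question (modulo the exact conventions of the talk): $F$ has a local error bound at $\bar x$ with respect to $K$ if there exist $\tau, \delta > 0$ such that
\[
\mathrm{dist}\big(x,\, S\big) \;\le\; \tau\, \mathrm{dist}\big(F(x),\, -K\big)
\qquad \text{for all } x \in B(\bar x, \delta), \qquad \text{where } S := \{x : F(x) \in -K\},
\]
the error bound modulus being the infimum of such $\tau$; stability asks that this inequality persist, with a uniform $\tau$, under small perturbations of $F$.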