A decomposition-coordination approach for large-scale optimization
Abstract:
We outline in this paper a general approach to solving distributed optimization problems of a certain type. We assume there are n processing elements, connected so that they can share information. We expect the approach to be particularly useful for large-scale problems, where a centralized method may not find a solution in acceptable time or the problem data may be inherently decentralized; however, it applies to problems of any size, provided they meet the conditions we impose to ensure convergence. First we duplicate some (or all) of the problem variables, so as to decompose the constraints into purely local subsets. This produces an artificial decomposition and an increase in the number of variables. To ensure that the solutions of the new problem and the original one coincide, we introduce consistency constraints that force all duplicated copies of a variable to be equal at the solution. These equality constraints are the only non-local constraints in the resulting transformed problem. We then apply the Auxiliary Problem Principle (APP) to the problem of finding a saddle point of the Augmented Lagrangian of the transformed optimization problem. Informally, the APP involves linearizing any non-separable terms in the Augmented Lagrangian and adding convex/concave terms which can be chosen to be separable. Under conditions stated below, the new problem (after applying the APP) has the same saddle point as the original problem. We describe an iterative algorithm to find the saddle point of this new problem. By choosing the APP terms to be decomposable, the central step of this iterative algorithm decomposes into independent parallel minimizations, followed by Lagrange multiplier updates which involve only local information transfers.
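As a rough illustration of the variable-duplication scheme sketched in this abstract, the toy example below enforces the consistency constraint between two copies of a shared variable through an Augmented Lagrangian. It uses an ADMM-style Gauss-Seidel sweep rather than the paper's APP construction (whose decomposable terms make the minimizations fully parallel); the functions f1, f2 and the penalty rho are invented for the example.

```python
# Minimal sketch of the variable-duplication idea: two "processors"
# each own a copy of the shared variable, the consistency constraint
# x1 = x2 enters through an Augmented Lagrangian, and the copies are
# driven together by Lagrange multiplier updates that use only the
# local disagreement x1 - x2.
rho = 1.0
x1 = x2 = lam = 0.0
for _ in range(100):
    # local minimization for copy 1: f1(x) = (x - 1)^2
    x1 = (2.0 + rho * x2 - lam) / (2.0 + rho)
    # local minimization for copy 2: f2(x) = 2 * (x - 3)^2
    x2 = (12.0 + lam + rho * x1) / (4.0 + rho)
    # multiplier update needs only the disagreement between the copies
    lam += rho * (x1 - x2)

print(x1, x2)  # both copies approach argmin f1 + f2 = 7/3 ≈ 2.333
```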
Keywords:
Augmented Lagrangian method
Saddle point
Solution set
We consider the constrained nonlinear optimization problem where the feasible set is possibly empty, in which case no solution exists. However, in many real-life situations some decision still has to be made. We propose a method in which the constraints are allowed to be violated, a so-called elastic approach. Our new goal is to find a near-feasible point with a "good" objective value. The approach we advocate is also useful for problems that have a solution but where the solution is found by solving subproblems: a subproblem may be infeasible even if the original problem is feasible. We focus on SQP (Sequential Quadratic Programming) methods, a popular class of methods for solving constrained nonlinear optimization problems. A characteristic feature of SQP methods is that they solve a sequence of quadratic subproblems with linear constraints. It is often assumed that all the subproblems are feasible, but this does not always hold, even when the original nonlinear problem is feasible.
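A minimal sketch of the elastic idea, assuming a simple l1 penalty on constraint violations; the specific problem, penalty weight, and solver choice below are illustrative, not the thesis's SQP machinery.

```python
import numpy as np
from scipy.optimize import minimize

# Toy infeasible problem: minimize (x0-1)^2 + (x1-1)^2 subject to
# x0 + x1 <= -1 and x0 + x1 >= 1, which are jointly infeasible.
def objective(x):
    return (x[0] - 1.0) ** 2 + (x[1] - 1.0) ** 2

def violations(x):
    # constraints written as g(x) <= 0; positive entries are violations
    return np.array([x[0] + x[1] + 1.0,       # x0 + x1 <= -1
                     1.0 - (x[0] + x[1])])    # x0 + x1 >= 1

rho = 10.0  # elastic penalty weight (illustrative)

def elastic_objective(x):
    # l1 penalty on violations: the relaxed problem is always feasible
    return objective(x) + rho * np.maximum(violations(x), 0.0).sum()

res = minimize(elastic_objective, np.zeros(2), method="Nelder-Mead")
print(res.x)  # a near-feasible point with a good objective value
```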
In the second part of this thesis, we turn to the problem of computing a direction of negative curvature for a symmetric matrix H = ∇²F(x). When second derivatives are known, we wish to be able to determine a point that satisfies the second-order (necessary) conditions for a minimizer. It can be shown that an essential feature of such algorithms is the ability to compute a direction of negative curvature for a symmetric matrix. We describe some new work on how to compute such directions. In particular, we show how, with little effort, some iterative methods can be used to obtain a good direction of negative curvature from a poor direction.
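One simple iterative way to improve a poor direction is power iteration on a shifted matrix, which amplifies the eigencomponent of the most negative eigenvalue; the sketch below is a generic stand-in for the methods discussed in the thesis, with H and the starting direction invented for the example.

```python
import numpy as np

def improve_direction(H, d, iters=25):
    """Power iteration on sigma*I - H (sigma an upper bound on the
    eigenvalues of H) steers d toward the eigenvector of H's most
    negative eigenvalue, i.e. a good direction of negative curvature."""
    sigma = np.linalg.norm(H, np.inf)  # cheap eigenvalue upper bound
    for _ in range(iters):
        d = sigma * d - H @ d
        d /= np.linalg.norm(d)
    return d

H = np.array([[2.0, 0.0], [0.0, -1.0]])                 # indefinite, invented
d = np.array([0.9, 0.1]) / np.linalg.norm([0.9, 0.1])   # poor initial direction
d = improve_direction(H, d)
print(d, d @ H @ d)  # curvature d^T H d is now clearly negative
```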
Sequential quadratic programming
Feasible region
Sequence (mathematics)
Matrix (mathematics)
Classical First-Order (FO) algorithms of convex optimization, such as the Mirror Descent algorithm or Nesterov's optimal algorithm for smooth convex optimization, are well known to have optimal (theoretical) complexity estimates which do not depend on the problem dimension. However, to attain this optimality, the domain of the problem should admit a good proximal setup. The latter essentially means that (i) the problem domain should satisfy certain geometric conditions, and (ii) the practical use of these methods is conditioned by our ability to solve efficiently an auxiliary optimization task -- computing the proximal transformation -- at each iteration of the method. More often than not these two conditions are satisfied in optimization problems arising in computational learning, which explains why FO methods of proximal type have recently become the methods of choice for solving various learning problems. Yet they meet their limits in several important problems, such as multi-task learning with a large number of tasks, where the problem domain does not exhibit favorable geometry, and learning and matrix completion problems with nuclear norm constraints, where the numerical cost of solving the auxiliary problem becomes prohibitive at large scale. We propose a novel approach to solving nonsmooth optimization problems arising in learning applications where a Fenchel-type representation of the objective function is available. The approach is based on applying FO algorithms to the dual problem and using the accuracy certificates supplied by the method to recover the primal solution. While suboptimal in terms of accuracy guarantees, the proposed approach does not rely upon a good proximal setup for the primal problem but instead assumes that the problem domain admits a Linear Optimization oracle -- the ability to efficiently maximize a linear form on the domain of the primal problem.
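The conditional-gradient (Frank-Wolfe) method is the classical example of an algorithm that needs only a Linear Optimization oracle rather than a proximal setup; the sketch below illustrates such an oracle on the l1 ball. It is not the authors' certificate-based dual scheme, and the quadratic objective is invented for the example.

```python
import numpy as np

def frank_wolfe(grad, lmo, x0, steps=200):
    """Conditional-gradient sketch: each iteration calls only a linear
    minimization oracle (lmo) over the domain, no proximal mapping."""
    x = x0
    for t in range(steps):
        s = lmo(grad(x))          # argmin over the domain of <grad, s>
        gamma = 2.0 / (t + 2.0)   # standard open-loop step size
        x = (1 - gamma) * x + gamma * s
    return x

def lmo_l1(g):
    # LMO for the unit l1 ball: a signed vertex of largest |gradient|
    s = np.zeros_like(g)
    i = np.argmax(np.abs(g))
    s[i] = -np.sign(g[i])
    return s

A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 1.0])
grad = lambda x: A @ x - b        # gradient of 0.5 x^T A x - b^T x
print(frank_wolfe(grad, lmo_l1, np.zeros(2)))  # approaches [0.2, 0.4]
```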
Conic optimization
To iteratively solve large-scale optimization problems in various contexts like planning, operations, and design, we need to generate descent directions that are based on linear system solutions. Irrespective of the optimization algorithm or the solution method employed for the linear systems, ill conditioning introduced by problem characteristics or the algorithm or both needs to be addressed. In [GL01] we used an intuitive heuristic approach in scaling linear systems that improved performance of a large-scale interior point algorithm significantly; we saw improvements of a factor of 10^3 in condition number estimates. In this paper, given our experience with optimization problems from a variety of application backgrounds like economics, finance, engineering, and planning, we examine the theoretical basis for scaling while solving the linear systems. Our goal is to develop reasonable scaling schemes with a sound theoretical basis. We introduce concepts and define scaling schemes in section (1), and explain related work in this area. Scaling has been studied extensively, and though there is broad agreement on its importance, the same cannot be said about what constitutes good scaling. A theoretical framework to scale an m x n real matrix is established in section (2). We use the first-order conditions associated with the Euclidean metric to develop iterative schemes in section (2.3) that approximate the solution in O(mn) time for real matrices. We discuss symmetry-preserving scale factors for an n x n symmetric matrix in section (3); the importance of symmetry preservation is discussed in section (3.1). An algorithm to directly compute symmetry-preserving scale factors in O(n^2) time based on the Euclidean metric is presented in section (3.4). We also suggest scaling schemes based on the rectilinear norm in section (2.4). Though all p-norms are theoretically equivalent, the importance of outliers increases as p increases; for barrier methods, due to large diagonal corrections, we believe that the taxicab metric (p = 1) may be more appropriate. We develop a linear programming model for it and look at a reduced dual that can be formulated as a minimum cost flow problem on networks. We are investigating algorithms to solve it in the O(mn) time that we require for an efficient scaling procedure. We hope that in future the special structure of the reduced dual can be exploited to solve it quickly; the dual information can then be used to compute the required scale factors. We discuss the Manhattan metric for symmetric matrices in section (3.5), and as in the case of real matrices, we are unable to propose an efficient computational scheme for this metric. We also look at a linearized ideal penalty function that only uses deviations out of the desired range in section (2.5); if we could use such a metric to generate an efficient solution, then we would like to see the impact of changing the range on the numerical behavior.
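A generic stand-in for Euclidean-metric scaling is alternating row/column equilibration, sketched below; the matrix and iteration count are illustrative, and this is not necessarily the scheme of section (2.3).

```python
import numpy as np

def equilibrate(A, iters=10):
    """Alternating row/column scaling toward unit 2-norms: returns
    diagonal factors r, c so that diag(r) @ A @ diag(c) has rows and
    columns of roughly unit Euclidean norm. Each sweep costs O(mn).
    (Zero rows or columns would need guarding in general.)"""
    r = np.ones(A.shape[0])
    c = np.ones(A.shape[1])
    for _ in range(iters):
        r /= np.linalg.norm((A * c) * r[:, None], axis=1)  # row norms
        c /= np.linalg.norm((A * c) * r[:, None], axis=0)  # column norms
    return r, c

A = np.array([[1e6, 2.0], [3.0, 4e-6]])  # badly scaled, invented
r, c = equilibrate(A)
S = np.diag(r) @ A @ np.diag(c)
print(np.linalg.cond(A), np.linalg.cond(S))  # conditioning improves
```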
Interior point method
Matrix (mathematics)
Basis (linear algebra)
We consider the problem of efficiently solving large-scale linear least squares problems that have one or more linear constraints that must be satisfied exactly. Whilst some classical approaches are theoretically well founded, they can face difficulties when the matrix of constraints contains dense rows or if an algorithmic transformation used in the solution process results in a modified problem that is much denser than the original one. To address this, we propose modifications and new ideas, with an emphasis on requiring that the constraints be satisfied with a small residual. We examine combining the null-space method with our recently developed algorithm for computing a null space basis matrix for a "wide" matrix. We further show that a direct elimination approach enhanced by careful pivoting can be effective in transforming the problem to an unconstrained sparse-dense least squares problem that can be solved with existing direct or iterative methods. We also present a number of solution variants that employ an augmented system formulation, which can be attractive when solving a sequence of related problems. Numerical experiments using problems coming from practical applications are used throughout to demonstrate the effectiveness of the different approaches.
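A minimal sketch of the null-space method for an equality-constrained least squares problem, using dense SciPy routines in place of the sparse machinery the paper develops; the data below are invented.

```python
import numpy as np
from scipy.linalg import lstsq, null_space

def constrained_lstsq(A, b, C, d):
    """Null-space method: min ||Ax - b|| subject to Cx = d.
    A particular solution of Cx = d plus a basis Z of null(C)
    reduces the task to an unconstrained least squares in fewer
    variables, so the constraints hold with a tiny residual."""
    x_p = lstsq(C, d)[0]              # particular solution of Cx = d
    Z = null_space(C)                 # orthonormal basis of null(C)
    y = lstsq(A @ Z, b - A @ x_p)[0]  # reduced unconstrained problem
    return x_p + Z @ y

A = np.random.default_rng(0).standard_normal((6, 3))
b = np.ones(6)
C = np.array([[1.0, 1.0, 1.0]])
d = np.array([1.0])
x = constrained_lstsq(A, b, C, d)
print(x, C @ x - d)  # constraint residual should be near zero
```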
Least-squares function approximation
Sequence (mathematics)
Matrix (mathematics)
Null space
Optimization is a major field in applied mathematics. Many applications involve the search for the best solution to a problem according to some criterion. Depending on the optimization problem considered, finding the best solution is not always possible in a reasonable amount of time. Heuristic algorithms are often used when the problem is too difficult to solve exactly; these methods speed up the search for a good solution but do not guarantee that an optimal solution will be found. In this thesis, we explore such heuristic approaches for three different matrix problems. First, we study the minimum-volume bounding box problem, which consists in finding the smallest rectangular parallelepiped enclosing a given set of points in three-dimensional space. This problem appears, for example, in collision detection, which is a very important issue in computational geometry and computer graphics. In these applications, a solution has to be determined in a very short amount of time. We propose a new hybrid algorithm able to approximate optimal bounding boxes at a low computational cost. In particular, it is several orders of magnitude faster than the only currently known exact algorithm. Second, we investigate the subset selection problem. Given a large set of features, we want to choose a small subset containing the most relevant features while removing the redundant ones. This problem has applications in data mining, since it can be seen as a dimensionality reduction problem. We develop several windowed algorithms that tackle the subset selection problem for the maximum-volume criterion, which is NP-hard. Finally, we address the topic of the approximation of the joint spectral radius. This quantity characterizes the growth rate of products of matrices and is NP-hard to approximate. The joint spectral radius appears in many fields, including system theory, graph theory, combinatorics, and language theory. We present an experimental study of existing approaches and propose a new genetic algorithm that is able to find bounds on the joint spectral radius in a short amount of time.
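As a generic illustration of heuristic search for the minimum-volume bounding box (not the thesis's hybrid algorithm), the sketch below samples random orientations and keeps the best axis-aligned box found in the rotated frame; the point cloud is invented.

```python
import numpy as np
from scipy.spatial.transform import Rotation

def bbox_volume(points):
    # volume of the axis-aligned bounding box of a point cloud
    ext = points.max(axis=0) - points.min(axis=0)
    return ext.prod()

def random_rotation_bbox(points, samples=2000, seed=0):
    # sampling heuristic: no optimality guarantee, but cheap and simple
    best_vol, best_rot = np.inf, np.eye(3)
    for R in Rotation.random(samples, random_state=seed).as_matrix():
        vol = bbox_volume(points @ R.T)
        if vol < best_vol:
            best_vol, best_rot = vol, R
    return best_vol, best_rot

pts = np.random.default_rng(1).standard_normal((200, 3)) @ np.diag([3.0, 1.0, 0.2])
pts = pts @ Rotation.from_euler("xyz", [0.4, 0.3, 0.2]).as_matrix().T  # tilt
vol, R = random_rotation_bbox(pts)
print(bbox_volume(pts), vol)  # heuristic box beats the axis-aligned one
```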
Matrix (mathematics)
Parallelepiped
Set cover problem
Minimum bounding box
Quadratic programming is a fundamental problem in constrained optimization. One of the simplest quadratic programs has only simple bounds. In this thesis, we present several algorithms for the numerical solution of quadratic programs with simple bounds along with numerical results. All of our algorithms are designed for or applicable to the case where the Hessian matrix is large and sparse.
Direct active set methods traditionally have not been used for large sparse problems because their storage requirements and data structure manipulations were thought to be prohibitive and because typical dense active set methods require a large number of iterations if the initial active set is very different from that at the solution. We show how a standard direct active set method for solving definite and indefinite quadratic programs with simple bounds can be efficiently implemented for large sparse problems. We show that all of the necessary factorizations can be carried out in a static data structure that can be set up before the numeric computations begin and requires only as much space as one Cholesky factorization of the Hessian of the quadratic. To allow us to add and delete several constraints from the active set at each iteration, we use projection techniques and a new search direction for the indefinite case. We also propose a heuristic method for finding a starting point that reduces the number of iterations required in practice.
Iterative methods are attractive for large sparse quadratic programs because they require less data structure manipulation and hence are considerably easier to implement. The numerical results available for existing iterative methods indicate that they have linear rates of convergence. To avoid difficulties we experienced because of the feasibility requirement of interior point methods, we show how to reformulate a convex quadratic program with simple bounds as an unconstrained piecewise quadratic function with the same optimality conditions. We present an iterative method for solving this new problem that is superlinearly convergent and is suitable for sparse problems.
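A minimal sketch of the bound-constrained setting: projected gradient descent on a convex quadratic with simple bounds, where projection onto the box plays the role of adding and deleting many bound constraints at once. This is a generic method, not the active-set or piecewise-quadratic algorithms of the thesis, and the data are invented.

```python
import numpy as np

def projected_gradient_qp(H, c, lo, hi, x0, iters=500):
    """min 0.5 x^T H x + c^T x  subject to  lo <= x <= hi,
    for convex H: gradient steps followed by projection onto the box."""
    step = 1.0 / np.linalg.norm(H, 2)  # 1/L, L = largest eigenvalue
    x = x0
    for _ in range(iters):
        g = H @ x + c                      # gradient of the quadratic
        x = np.clip(x - step * g, lo, hi)  # project onto the bounds
    return x

H = np.array([[4.0, 1.0], [1.0, 3.0]])
c = np.array([-1.0, -2.0])
x = projected_gradient_qp(H, c, lo=np.zeros(2), hi=np.ones(2), x0=np.zeros(2))
print(x)  # approximate minimizer of the bound-constrained QP
```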
Hessian matrix
Active set method
Author(s): Kizilkale, Can | Advisor(s): Chandrasekaran, Shivkumar | Abstract: This thesis consists of three parts focusing on three different problems. All of these problems have optimization at heart, yet they are related to different branches of optimization.
Our first problem concerns the convergence rate of two momentum-based methods, Polyak's Heavy Ball and Nesterov's Accelerated Gradient, employing adaptive restart. We introduce two criteria and show that under these criteria the methods converge linearly. Strongly convex functions satisfy these criteria, hence the results are valid for such functions. To the best of our knowledge, this is the first convergence result for the adaptive restart scheme. We then introduce novel restarting criteria which are highly tunable and also guarantee linear convergence.
Our second problem is the computation of the sparsest solution to an underdetermined system of linear equations. We introduce an extrapolation procedure which computes the sparsest solution from a penalized relaxation of the problem via the Huber function. This extrapolation procedure uses a condition called sign constancy, and we show that if one works with extreme points this condition can be removed. We present necessary and sufficient conditions for the recovery of the sparsest solution by the penalized Huber loss function, and for ties among different solutions.
Our last problem concerns network flows. In this work in progress, we investigate the existence and computation of equilibria in trade networks with constraints. To the best of our knowledge, these networks have only been investigated under simple capacity constraints on each link. We first give a negative result, a counterexample in which the link capacities live in a polymatroid yet there is no integer equilibrium point in the feasible region. We then move on to investigate cases where the constraints are less restrictive.
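For the first part, a common function-value adaptive restart scheme for Nesterov's method is sketched below; the thesis studies its own restart criteria, which are not reproduced here, and the quadratic test function is invented.

```python
import numpy as np

def nag_restart(grad, f, x0, L, iters=300):
    """Nesterov's accelerated gradient with function-value adaptive
    restart: when the objective stops decreasing, the momentum is
    reset, which restores monotone progress."""
    x, y, t = x0.copy(), x0.copy(), 1.0
    for _ in range(iters):
        x_new = y - grad(y) / L          # gradient step from the
        if f(x_new) > f(x):              # extrapolated point y
            y, t = x.copy(), 1.0         # progress stalled: restart
            continue
        t_new = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t * t))
        y = x_new + ((t - 1.0) / t_new) * (x_new - x)  # momentum step
        x, t = x_new, t_new
    return x

A = np.array([[10.0, 0.0], [0.0, 1.0]])  # strongly convex quadratic
f = lambda x: 0.5 * x @ A @ x
grad = lambda x: A @ x
print(nag_restart(grad, f, np.array([5.0, 5.0]), L=10.0))  # near zero
```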
Underdetermined system