
Quasi-Newton method

Quasi-Newton methods are methods used to find either zeroes or local maxima and minima of functions, as an alternative to Newton's method. They can be used if the Jacobian or Hessian is unavailable or is too expensive to compute at every iteration. The 'full' Newton's method requires the Jacobian in order to search for zeros, or the Hessian for finding extrema.

Newton's method to find zeroes of a function $g$ of multiple variables is given by $x_{n+1} = x_n - [J_g(x_n)]^{-1} g(x_n)$, where $[J_g(x_n)]^{-1}$ is the left inverse of the Jacobian matrix $J_g(x_n)$ of $g$ evaluated at $x_n$. Strictly speaking, any method that replaces the exact Jacobian $J_g(x_n)$ with an approximation is a quasi-Newton method. For instance, the chord method (where $J_g(x_n)$ is replaced by $J_g(x_0)$ for all iterations) is a simple example. The methods given below for optimization refer to an important subclass of quasi-Newton methods, the secant methods.

Using methods developed to find extrema in order to find zeroes is not always a good idea, as the majority of the methods used to find extrema require the matrix involved to be symmetric. While this holds in the context of the search for extrema, it rarely holds when searching for zeroes. Broyden's 'good' and 'bad' methods are two methods commonly used to find extrema that can also be applied to find zeroes (the 'good' method is sketched below). Other methods that can be used are the column-updating method, the inverse column-updating method, the quasi-Newton least-squares method and the quasi-Newton inverse least-squares method.

More recently, quasi-Newton methods have been applied to find the solution of multiple coupled systems of equations (e.g. fluid–structure interaction problems or interaction problems in physics). They allow the solution to be found by solving each constituent system separately (which is simpler than solving the global system) in a cyclic, iterative fashion until the solution of the global system is found.

Noting that the search for a minimum or maximum of a scalar-valued function is nothing other than the search for the zeroes of the gradient of that function, quasi-Newton methods can be readily applied to find extrema of a function. In other words, if $g$ is the gradient of $f$, then searching for the zeroes of the vector-valued function $g$ corresponds to the search for the extrema of the scalar-valued function $f$; the Jacobian of $g$ now becomes the Hessian of $f$. The main difference is that the Hessian matrix is symmetric, unlike the Jacobian when searching for zeroes. Most quasi-Newton methods used in optimization exploit this property. In optimization, quasi-Newton methods (a special case of variable-metric methods) are algorithms for finding local maxima and minima of functions.
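Before turning to the optimization variants, the following minimal Python sketch (illustrative only; the function names and tolerances are assumptions, not any library's API) shows the root-finding case via Broyden's 'good' method. The Jacobian approximation B is seeded once, here with the exact Jacobian at the starting point (a finite-difference estimate would serve equally well), and is then corrected after every step by a rank-one update chosen so that the secant equation B s = y holds, where s is the last step and y the resulting change in g.

    import numpy as np

    def broyden_good(g, x0, B0, tol=1e-10, max_iter=50):
        # Find a zero of g: R^n -> R^n with Broyden's 'good' method.
        # B0 is an initial Jacobian approximation; the true Jacobian
        # is never evaluated inside the loop.
        x = np.asarray(x0, dtype=float)
        B = np.asarray(B0, dtype=float)
        gx = g(x)
        for _ in range(max_iter):
            s = np.linalg.solve(B, -gx)        # quasi-Newton step: B s = -g(x)
            x = x + s
            g_new = g(x)
            if np.linalg.norm(g_new) < tol:
                break
            y = g_new - gx
            # Rank-one correction enforcing the secant equation B s = y.
            B = B + np.outer(y - B @ s, s) / (s @ s)
            gx = g_new
        return x

    # Example: solve x^2 + y^2 = 2 and x - y = 0; a zero lies at (1, 1).
    g = lambda v: np.array([v[0]**2 + v[1]**2 - 2.0, v[0] - v[1]])
    x0 = np.array([2.0, 0.5])
    B0 = np.array([[2 * x0[0], 2 * x0[1]], [1.0, -1.0]])  # Jacobian at x0
    print(broyden_good(g, x0, B0))             # approximately [1. 1.]

The chord method mentioned above corresponds to skipping the rank-one update entirely, i.e. reusing B0 for every iteration; it saves the update at the cost of slower convergence.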
Quasi-Newton methods are based on Newton's method to find the stationary point of a function, where the gradient is 0. Newton's method assumes that the function can be locally approximated as a quadratic in the region around the optimum, and uses the first and second derivatives to find the stationary point. In higher dimensions, Newton's method uses the gradient and the Hessian matrix of second derivatives of the function to be minimized. In quasi-Newton methods the Hessian matrix does not need to be computed; it is instead updated by analyzing successive gradient vectors. Quasi-Newton methods are a generalization of the secant method, which finds the root of the first derivative, to multidimensional problems. In multiple dimensions the secant equation $B_{k+1}(x_{k+1} - x_k) = \nabla f(x_{k+1}) - \nabla f(x_k)$ is under-determined, and quasi-Newton methods differ in how they constrain the solution, typically by adding a simple low-rank update to the current estimate of the Hessian (see the sketch below).

The first quasi-Newton algorithm was proposed in 1959 by William C. Davidon, a physicist working at Argonne National Laboratory: the DFP updating formula, which was later popularized by Fletcher and Powell in 1963 but is rarely used today. The most common quasi-Newton algorithms are currently the SR1 formula (for 'symmetric rank-one'), the BHHH method, the widespread BFGS method (suggested independently by Broyden, Fletcher, Goldfarb, and Shanno in 1970), and its low-memory extension L-BFGS. The Broyden class is a linear combination of the DFP and BFGS methods.
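As a concrete instance of such a low-rank update, the Python sketch below (again illustrative; the fixed step length stands in for the Wolfe line search a production implementation would use) maintains an approximation H to the inverse Hessian and refreshes it with the standard BFGS formula after each gradient evaluation.

    import numpy as np

    def bfgs_minimize(grad, x0, tol=1e-8, max_iter=200):
        # Minimize a smooth function given only its gradient, via BFGS.
        # H approximates the inverse Hessian; it starts as the identity
        # and no second derivatives are ever computed.
        x = np.asarray(x0, dtype=float)
        H = np.eye(x.size)
        g = grad(x)
        for _ in range(max_iter):
            if np.linalg.norm(g) < tol:
                break
            s = 0.5 * (-H @ g)        # fixed step length; real solvers use a line search
            x_new = x + s
            g_new = grad(x_new)
            y = g_new - g
            sy = s @ y
            if sy > 1e-12:            # curvature condition keeps H positive definite
                rho = 1.0 / sy
                I = np.eye(x.size)
                # BFGS update: H <- (I - rho s y^T) H (I - rho y s^T) + rho s s^T
                H = ((I - rho * np.outer(s, y)) @ H @ (I - rho * np.outer(y, s))
                     + rho * np.outer(s, s))
            x, g = x_new, g_new
        return x

    # Example: minimize f(x, y) = (x - 1)^2 + 2 * (y - 2)^2.
    grad_f = lambda v: np.array([2.0 * (v[0] - 1.0), 4.0 * (v[1] - 2.0)])
    print(bfgs_minimize(grad_f, [0.0, 0.0]))   # approximately [1. 2.]

Because each BFGS correction has low rank, L-BFGS can reproduce the action of H from only the last few (s, y) pairs instead of storing a dense n-by-n matrix, which is what makes the low-memory extension practical for large problems.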

[ "Newton's method", "Hessian automatic differentiation", "Second partial derivative test", "Broyden's method", "Davidon–Fletcher–Powell formula" ]