
Non-linear least squares

Non-linear least squares is the form of least squares analysis used to fit a set of $m$ observations with a model that is non-linear in $n$ unknown parameters ($m \ge n$). It is used in some forms of nonlinear regression. The basis of the method is to approximate the model by a linear one and to refine the parameters by successive iterations. There are many similarities to linear least squares, but also some significant differences. Examples of NLLS models are (i) probit regression, (ii) threshold regression, (iii) smooth regression, (iv) logistic link regression, (v) Box–Cox transformed regressors ($m(x, \theta) = \theta_1 + \theta_2 x^{(\theta_3)}$), and many others in economic theory.

Consider a set of $m$ data points $(x_1, y_1), (x_2, y_2), \dots, (x_m, y_m)$ and a curve (model function) $y = f(x, \boldsymbol{\beta})$ that, in addition to the variable $x$, also depends on $n$ parameters $\boldsymbol{\beta} = (\beta_1, \beta_2, \dots, \beta_n)$, with $m \ge n$. It is desired to find the vector $\boldsymbol{\beta}$ of parameters such that the curve best fits the given data in the least squares sense, that is, such that the sum of squares

$$S = \sum_{i=1}^{m} r_i^2$$

is minimized, where the residuals (in-sample prediction errors) $r_i$ are given by

$$r_i = y_i - f(x_i, \boldsymbol{\beta}), \qquad i = 1, 2, \dots, m.$$

The minimum value of $S$ occurs when the gradient is zero. Since the model contains $n$ parameters, there are $n$ gradient equations:

$$\frac{\partial S}{\partial \beta_j} = 2 \sum_{i} r_i \frac{\partial r_i}{\partial \beta_j} = 0, \qquad j = 1, \dots, n.$$

In a nonlinear system, the derivatives $\frac{\partial r_i}{\partial \beta_j}$ are functions of both the independent variable and the parameters, so in general these gradient equations do not have a closed-form solution. Instead, initial values must be chosen for the parameters. The parameters are then refined iteratively, that is, the values are obtained by successive approximation:

$$\beta_j \approx \beta_j^{k+1} = \beta_j^{k} + \Delta\beta_j.$$

Here, $k$ is the iteration number and the vector of increments $\Delta\boldsymbol{\beta}$ is known as the shift vector. At each iteration the model is linearized by approximation with a first-order Taylor polynomial expansion about $\boldsymbol{\beta}^k$:

$$f(x_i, \boldsymbol{\beta}) \approx f(x_i, \boldsymbol{\beta}^k) + \sum_{j} J_{ij}\,\Delta\beta_j, \qquad J_{ij} = \frac{\partial f(x_i, \boldsymbol{\beta})}{\partial \beta_j}.$$

The Jacobian, $J$, is a function of constants, the independent variable, and the parameters, so it changes from one iteration to the next. Thus, in terms of the linearized model, $\frac{\partial r_i}{\partial \beta_j} = -J_{ij}$ and the residuals are given by

$$r_i = \Delta y_i - \sum_{j=1}^{n} J_{ij}\,\Delta\beta_j, \qquad \Delta y_i = y_i - f(x_i, \boldsymbol{\beta}^k).$$
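The iterative scheme just described is essentially the Gauss–Newton method. As a minimal sketch (not part of the original article), the Python/NumPy snippet below fits the Box–Cox-style example model $f(x, \theta) = \theta_1 + \theta_2 x^{(\theta_3)}$ mentioned above by repeatedly linearizing and solving for the shift vector $\Delta\boldsymbol{\beta}$; the function names, data, and tolerances are assumptions chosen for illustration.

```python
import numpy as np

def model(x, beta):
    """Example NLLS model from the text: f(x, b) = b1 + b2 * x**b3 (x > 0)."""
    b1, b2, b3 = beta
    return b1 + b2 * x**b3

def jacobian(x, beta):
    """Analytic Jacobian J_ij = d f(x_i, beta) / d beta_j."""
    b1, b2, b3 = beta
    J = np.empty((x.size, 3))
    J[:, 0] = 1.0                     # d f / d b1
    J[:, 1] = x**b3                   # d f / d b2
    J[:, 2] = b2 * x**b3 * np.log(x)  # d f / d b3 (needs x > 0)
    return J

def gauss_newton(x, y, beta0, tol=1e-8, max_iter=50):
    """Refine beta by successive linearization, as described above."""
    beta = np.asarray(beta0, dtype=float)
    for _ in range(max_iter):
        dy = y - model(x, beta)   # Delta y_i = y_i - f(x_i, beta^k)
        J = jacobian(x, beta)     # recomputed at every iteration
        # Shift vector: least-squares solution of J @ dbeta ~ dy
        dbeta, *_ = np.linalg.lstsq(J, dy, rcond=None)
        beta += dbeta
        if np.linalg.norm(dbeta) < tol * (1.0 + np.linalg.norm(beta)):
            break
    return beta

# Usage: recover known parameters from synthetic noisy data.
rng = np.random.default_rng(0)
x = np.linspace(0.5, 5.0, 40)
true_beta = np.array([1.0, 2.0, 1.5])
y = model(x, true_beta) + 0.01 * rng.standard_normal(x.size)
print(gauss_newton(x, y, beta0=[0.5, 1.0, 1.0]))
```

Solving $J\,\Delta\boldsymbol{\beta} \approx \Delta y$ with a least-squares solver, rather than explicitly forming the normal equations $(J^\top J)\,\Delta\boldsymbol{\beta} = J^\top \Delta y$, is a common choice for numerical stability; in practice a damped variant such as Levenberg–Marquardt is often preferred when the initial guess is far from the solution.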

[ "Least squares", "Iteratively reweighted least squares", "Lack-of-fit sum of squares", "nonlinear least squares estimation", "Total least squares", "Linear least squares" ]