In mathematical optimization, the Karush–Kuhn–Tucker (KKT) conditions, also known as the Kuhn–Tucker conditions, are first derivative tests (sometimes called first-order necessary conditions) for a solution in nonlinear programming to be optimal, provided that some regularity conditions are satisfied. Allowing inequality constraints, the KKT approach to nonlinear programming generalizes the method of Lagrange multipliers, which allows only equality constraints. The system of equations and inequalities corresponding to the KKT conditions is usually not solved directly, except in the few special cases where a closed-form solution can be derived analytically. In general, many optimization algorithms can be interpreted as methods for numerically solving the KKT system of equations and inequalities.

The KKT conditions were originally named after Harold W. Kuhn and Albert W. Tucker, who first published the conditions in 1951. Later scholars discovered that the necessary conditions for this problem had been stated by William Karush in his master's thesis in 1939.

Consider the following nonlinear optimization problem, written here as a maximization (a minimization problem can be treated by maximizing $-f$ instead):

$$\begin{aligned} \text{maximize} \quad & f(\mathbf{x}) \\ \text{subject to} \quad & g_i(\mathbf{x}) \ge 0, \quad i = 1, \ldots, m, \\ & h_j(\mathbf{x}) = 0, \quad j = 1, \ldots, \ell, \end{aligned}$$

where $\mathbf{x} \in \mathbf{X}$ is the optimization variable chosen from a convex subset of $\mathbb{R}^n$, $f$ is the objective or utility function, $g_i \ (i = 1, \ldots, m)$ are the inequality constraint functions and $h_j \ (j = 1, \ldots, \ell)$ are the equality constraint functions. The numbers of inequalities and equalities are denoted by $m$ and $\ell$ respectively.

Corresponding to the constrained optimization problem one can form the Lagrangian function

$$L(\mathbf{x}, \boldsymbol{\mu}, \boldsymbol{\lambda}) = f(\mathbf{x}) + \boldsymbol{\mu}^\top \mathbf{g}(\mathbf{x}) + \boldsymbol{\lambda}^\top \mathbf{h}(\mathbf{x}),$$

where $\mathbf{g}(\mathbf{x}) = \left(g_1(\mathbf{x}), \ldots, g_m(\mathbf{x})\right)^\top$ and $\mathbf{h}(\mathbf{x}) = \left(h_1(\mathbf{x}), \ldots, h_\ell(\mathbf{x})\right)^\top$. When there are no equality constraints ($\ell = 0$), this reduces to $L(\mathbf{x}, \boldsymbol{\mu}) = f(\mathbf{x}) + \boldsymbol{\mu}^\top \mathbf{g}(\mathbf{x})$, the form used in the saddle-point statement below.
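To make the Lagrangian concrete, the following sketch builds $L(\mathbf{x}, \boldsymbol{\mu}, \boldsymbol{\lambda})$ for a small toy problem; the particular functions `f`, `g`, `h` and the evaluation point below are hypothetical choices for illustration only, not part of the article's development.

```python
import numpy as np

# Toy problem (hypothetical, for illustration only):
#   maximize   f(x) = -(x1 - 1)^2 - (x2 - 1)^2
#   subject to g(x) = 1 - x1 - x2 >= 0,   h(x) = x1 - x2 = 0.

def f(x):
    return -(x[0] - 1.0) ** 2 - (x[1] - 1.0) ** 2   # concave objective

def g(x):
    return np.array([1.0 - x[0] - x[1]])            # inequality constraints, g(x) >= 0

def h(x):
    return np.array([x[0] - x[1]])                  # equality constraints, h(x) = 0

def lagrangian(x, mu, lam):
    """L(x, mu, lam) = f(x) + mu^T g(x) + lam^T h(x)."""
    return f(x) + mu @ g(x) + lam @ h(x)

x = np.array([0.5, 0.5])
mu = np.array([1.0])
lam = np.array([0.0])
print(lagrangian(x, mu, lam))   # -0.5: equals f(x) here since g(x) = 0 and h(x) = 0
```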
The Karush–Kuhn–Tucker theorem then states the following.

Theorem. If $(\mathbf{x}^\ast, \boldsymbol{\mu}^\ast)$ is a saddle point of $L(\mathbf{x}, \boldsymbol{\mu})$ in $\mathbf{x} \in \mathbf{X}$, $\boldsymbol{\mu} \ge \mathbf{0}$, then $\mathbf{x}^\ast$ is an optimal vector for the above optimization problem. Conversely, suppose that $f(\mathbf{x})$ and $g_i(\mathbf{x})$, $i = 1, \ldots, m$, are concave in $\mathbf{x}$ and that there exists $\mathbf{x}_0 \in \mathbf{X}$ such that $\mathbf{g}(\mathbf{x}_0) > 0$. Then with an optimal vector $\mathbf{x}^\ast$ for the above optimization problem there is associated a non-negative vector $\boldsymbol{\mu}^\ast$ such that $L(\mathbf{x}^\ast, \boldsymbol{\mu}^\ast)$ is a saddle point of $L(\mathbf{x}, \boldsymbol{\mu})$.

Since the idea of this approach is to find a supporting hyperplane on the feasible set $\Gamma = \left\{ \mathbf{x} \in \mathbf{X} : g_i(\mathbf{x}) \ge 0, \ i = 1, \ldots, m \right\}$, the proof of the Karush–Kuhn–Tucker theorem makes use of the hyperplane separation theorem.

Now suppose that the objective function $f : \mathbb{R}^n \to \mathbb{R}$ and the constraint functions $g_i : \mathbb{R}^n \to \mathbb{R}$ and $h_j : \mathbb{R}^n \to \mathbb{R}$ are continuously differentiable at a point $\mathbf{x}^\ast$. If $\mathbf{x}^\ast$ is a local optimum and the optimization problem satisfies some regularity conditions (see below), then there exist constants $\mu_i \ (i = 1, \ldots, m)$ and $\lambda_j \ (j = 1, \ldots, \ell)$, called KKT multipliers, such that the following four groups of conditions hold:

Stationarity: $\nabla f(\mathbf{x}^\ast) + \sum_{i=1}^{m} \mu_i \nabla g_i(\mathbf{x}^\ast) + \sum_{j=1}^{\ell} \lambda_j \nabla h_j(\mathbf{x}^\ast) = 0$.

Primal feasibility: $g_i(\mathbf{x}^\ast) \ge 0$ for $i = 1, \ldots, m$, and $h_j(\mathbf{x}^\ast) = 0$ for $j = 1, \ldots, \ell$.

Dual feasibility: $\mu_i \ge 0$ for $i = 1, \ldots, m$.

Complementary slackness: $\mu_i \, g_i(\mathbf{x}^\ast) = 0$ for $i = 1, \ldots, m$.

In the particular case $m = 0$, i.e., when there are no inequality constraints, the KKT conditions turn into the Lagrange conditions, and the KKT multipliers are called Lagrange multipliers.
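As an illustration of the four KKT conditions listed above, here is a minimal numerical check at a candidate point, reusing the same hypothetical toy problem as in the earlier sketch; the candidate point $\mathbf{x}^\ast = (1/2, 1/2)$ and multipliers $\mu^\ast = 1$, $\lambda^\ast = 0$ are assumed values chosen so that the inequality constraint is active.

```python
import numpy as np

# Hypothetical toy problem (same as in the earlier sketch):
#   maximize   f(x) = -(x1 - 1)^2 - (x2 - 1)^2
#   subject to g(x) = 1 - x1 - x2 >= 0,   h(x) = x1 - x2 = 0.

def grad_f(x):
    return np.array([-2.0 * (x[0] - 1.0), -2.0 * (x[1] - 1.0)])

def g(x):
    return np.array([1.0 - x[0] - x[1]])

def grad_g(x):
    return np.array([[-1.0, -1.0]])     # one row per inequality constraint

def h(x):
    return np.array([x[0] - x[1]])

def grad_h(x):
    return np.array([[1.0, -1.0]])      # one row per equality constraint

def kkt_check(x, mu, lam, tol=1e-8):
    # Stationarity of the Lagrangian: grad f + sum_i mu_i grad g_i + sum_j lam_j grad h_j = 0
    stationarity = grad_f(x) + grad_g(x).T @ mu + grad_h(x).T @ lam
    return {
        "stationarity":            bool(np.all(np.abs(stationarity) < tol)),
        "primal feasibility":      bool(np.all(g(x) >= -tol) and np.all(np.abs(h(x)) < tol)),
        "dual feasibility":        bool(np.all(mu >= -tol)),          # mu >= 0
        "complementary slackness": bool(abs(mu @ g(x)) < tol),        # mu_i g_i(x) = 0
    }

x_star   = np.array([0.5, 0.5])
mu_star  = np.array([1.0])
lam_star = np.array([0.0])
print(kkt_check(x_star, mu_star, lam_star))   # all four entries should be True
```

Because the toy objective is concave and the constraints are affine, satisfying these conditions in this particular example is also sufficient for global optimality.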