Slide 1:Topic 2: Optimality Criteria A function f(x) is a rule that assigns to every choice of x a unique value y = f(x). x is called the independent variable
y is called the dependent variable
Slide 2:2.1 PROPERTIES OF SINGLE-VARIABLE FUNCTIONS Let R be the set of the real numbers and S the domain of x.
When S = R, we have an unconstrained function:
f(x) = x3 + 2 x2 – x + 3, for all x R
When S R, we have a constrained function:
f(x) = x3 + 2 x2 – x + 3, for all x S = {x| -5 ≤ x ≤ 5}
Slide 3:Definition: Continuous Functions
A function f(x) is continuous at x = c if and only if it meets the following conditions: Properties of continuous functions:
Sums or products of continuous functions are continuous
The ratio of two continuous functions is continuous at all points where the denominator does not vanish
Slide 4:Definition: Monotonic Functions
A function f(x) is monotonic (either increasing or decreasing) if for any two points x1 and x2 with x1≤ x2, it follows that
f(x1) ≤ f(x2) monotonically increasing
f(x1) ≥ f(x2) monotonically decreasing
Slide 5:The function is monotonically decreasing for x* ≤ 0 and monotonically increasing for x* ≥ 0
The function attains its minimum at x = x*
The function is monotonic on either side of the minimum
The function is said to be a unimodal function Unimodal function
Slide 6:Definition
A function is unimodal on the interval a ≤ x ≤ b if and only if it is monotonic on either side of the single optimal point x*.
If x* is the single minimum point of f(x) in the range a ≤ x ≤ b, then f(x) is unimodal on the interval if and only if for any two points x1 and x2:
x* ≤ x1 ≤ x2 implies f(x*) ≤ f(x1) ≤ f(x2)
x* ≥ x1 ≥ x2 implies f(x*) ≤ f(x1) ≤ f(x2)
Slide 7:OPTIMALITY CRITERIA FOR SINGLE-VARIABLE FUNCTIONS We want to answer this question:
How can we determine whether a given point x* is the optimal solution? We will develop a set of optimality criteria for determining whether a given solution is optimal.
Slide 8:Definition
A function f(x) defined on a set S attains its global minimum at a point x** S if and only if
f(x**) ≤ f(x) for all x S
A function f(x) defined on S has a local minimum at a point x* S if and only if
f(x*) ≤ f(x) for all x satisfying |x - x*| <
Slide 10:Theorem 2.1 (Necessity Theorem) The necessary condition for x* to be a local minimum of f on the interval (a, b), provided that f is differentiable, is
Slide 11:Definition A stationary point is a point x* at which
An inflection point (saddle point) is a stationary point that does not correspond to a local optimum.
Slide 12:Theorem 2.2 (Sufficiency Theorem) Suppose at x* the first derivative is zero and the first non-zero higher order derivative is denoted by n.
If n is odd, x* is an inflection point
If n is even, x* is a local optimal
If that derivative is (+), x* is a local minimum
If that derivative is (-), x* is a local maximum
Slide 13:Example Consider the function
Which of the following points are minimizers of f(x)?
x1 = 0
x2 = 1
x3 = 2
x4 = 3
Slide 14:Solution We will apply the sufficiency theorem on the points. First, we eliminate any point using the zero-first-derivative test. The first derivative of f(x) is
Clearly all the points make the first derivative equal to zero, hence, all are labeled as critical points which potentially can be minimizers of the function f(x). Now, we test the second derivative and take out any point at which the second derivative is negative:
f‘’(x1) = 0
f‘’(x2) = 60
f‘’(x3) = -120
f‘’(x4) = 540
Slide 15:The point x3 can not be a minimizer, so it should be eliminated.
Since the second derivatives at x2 and x4 are positive, they are local minimizers.
The second derivative at x1 is zero, hence no definitive conclusion can be make and a higher derivative is needed:
f‘’’(x1) = -360
This is an odd-order derivative and is nonzero, hence, x1 is a saddle point.
Slide 16:Functions of Several Variables We will develop conditions for the characterization of optima for functions of several variables.
The unconstrained multi-variable problem is written as
min f(x) x RN
where x is a vector of the decision variables.
We assume f and its derivatives exist and are continuous everywhere.
We assume the minimum does not correspond to a point of discontinuity (either f or the gradient).
Slide 17:OPTIMALITY CRITERIA The Taylor expansion of a function of several variables is
Slide 18:The change in f(x) as a result of change in x is given by:
If is a local minimum, then we have to have
The point is a global minimum if the above relation is true for all x RN.
Slide 19:Theorem Necessary Conditions
For x* to be a local minimum, it is necessary that
Slide 20:When the gradient of f(x) is zero, we have
Now, the sign of f is dependent on the sign of the quadratic form
If Q(x) > 0, then is a minimum.
If Q(x) < 0, then is a maximum.
Slide 21:From linear algebra,
Based on this, we can say that the stationary point is
Slide 22:Theorem 3.2 Sufficient Conditions
If
and
then x* is a local minimum.
Slide 23:Example 3.1 See the example and the solution in the book.
Slide 24:Constrained Optimality Criteria In this part, we will develop necessary and sufficient conditions of optimality for constrained problems.
The conditions we have developed previously no longer hold because of the presence of constraints.
Slide 25:EQUALITY-CONSTRAINED PROBLEMS Firstly, we consider the optimization problem involving several equality constraints:
Slide 26:LAGRANGE MULTIPLIERS We can convert the constrained problem to an equivalent unconstrained problem with the help of certain unspecified parameters known as Lagrange multipliers.
Consider the problem
The method of Lagrange multipliers converts it to the unconstrained problem The Lagrange function The Lagrange multiplier
Slide 27:There is no sign restriction on .
Suppose for = o, the unconstrained minimum of L(x; ) with respect to x occurs at x = xo and xo satisfies h1(xo) = 0.
Then, xo minimizes f(x) subject to h1(x).
This is because for h1(x) = 0, min L(x; ) = min f(x).
The challenge is to determine the appropriate value of = o so that the unconstrained minimum point xo satisfies h1(x) = 0.
This can be done by treating as a variable, finding the unconstrained minimum of L as a function of , and adjusting such that h1(x) = 0 is satisfied.
Slide 28:Example 5.2 See the example and the solution in the book.
Slide 29:Example 5.3 See the example and the solution in the book.
Slide 30:The Lagrange multiplier method can be extended to several equality constraints.
The Lagrange function becomes
Here, 1, 2, …, k are the Lagrange multipliers
Slide 31:To solve the Lagrange problem, we set the partial derivatives of L with respect to x equal to zero and add to them the constrains:
Slide 32:These equations form a system of nonlinear equations of N+K equations in N+K variables.
The critical points of L are tested by computing the Hessian matrix of L with respect to x.
Slide 33:ECONOMIC INTERPRETATION OF THE LAGRANGE MULTIPLIERS The Lagrange multipliers have an important interpretation as shadow prices of the constraints.
Consider the optimization problem
The Lagrange function is
Slide 34:To find the critical points of L:
Let 1o be the optimal multiplier and xo be the optimal solution such that h1(xo)=b1 and f(xo) = L(xo; 1o) = fo.
Notice that the optimal solution (xo; 1o) is now a function of b1.
The change in fo due to change in b1 is given by
Slide 35:The partial derivative of h1(x) – b1 = 0 is
Multiply this equation by 1o and subtract from the previous equation:
Slide 36:This equation becomes
The optimal value of the Lagrange multiplier 1o is the change in the optimal value of the objective function per unit increase in the right-hand-side constant of the constraint.
Slide 37:For an optimization problem with K constraints:
Then;
Slide 38:Example Consider the problem
The optimal solution is
Slide 39:What will the optimal value of the objective function be if b1 is changed to 2.05?
The optimal objective value is
Slide 40:5.4 KUHN-TUCKER CONDITIONS As we saw, the Lagrange multipliers can be used to develop optimality criteria for the equality constrained optimization problems.
Kuhn and Tucker extend this theory to include equality and inequality constrained problems:
Slide 41:Definition If we can identify the inactive constraints at the optimum before solving the problem, we can delete those constraints from the model and reduce the problem size.
Kuhn and Tucker developed the necessary and sufficient optimality conditions for the NLP problem.
It is assumed that f, gj, and hk are differentiable.
Slide 42:The general NLP problem is given by
To solve this problem, we write the Lagrange function:
To find the critical points of L,
We should adjust the values of uj and k until
min L = min f
Slide 43:5.4.1 Kuhn-Tucker Conditions or Kuhn-Tucker Problem The KT conditions are stated as:
Slide 44:Example 5.4 See the example and the solution in the book
Slide 45:KUHN-TUCKER THEOREMS Consider the NLP problem
Let f, g, and h be differentiable functions and x* be a feasible solution to the NLP problem. Let I = {j|gj(x*) = 0}. Further, gj(x*) for j I and hk(x*) for k = 1,…, K are linearly independent. If x* is an optimal solution to the NLP, then there exists a (u*, *) such that (x*, u*, *) solves the KT problem. Theorem 5.1 Kuhn-Tucker Necessity Theorem
Slide 46:The conditions gj(x*) for jI and hk(x*) for k = 1, …, K are linearly independent at the optimum is known as the constraint qualification.
The constraint qualification holds in the following cases:
When all the constraints are linear.
When all the inequality constraints are concave functions and the equality constraints are linear and there exists at least one feasible x that is strictly inside the feasible region of the inequality constraints.In other words, there exists an x such that gj(x) > 0 for j = 1,…, J and hk(x) = 0 for k = 1,…, K.
Slide 47:When the constraint qualification is not met at the optimum, there may not exist a solution to the Kuhn-Tucker problem.
Therefore, do not apply the Kuhn-Tucker optimality conditions when the constraint qualification is not met.
Slide 48:Example 5.6 Consider the NLP problem
Check the optimality of x = 2. Solution First we have to write the problem in our standard form:
Slide 49:Before we can apply the necessity theorem, we have to check that the constraint qualification holds at x = 2.
Since g1 and g2 are linear function, the constraint qualification holds. (See Case 1 in page 196). Therefore, we can apply the theorem.
We need to check that there is u such that x and u are a solution to the KT problem.
The KT conditions are
Slide 50:For x = 2, Condition (4) requires u1 = u2 = 0.
Condition (1) is violated (-4 0).
There is no solution to the KT problem for x = 2.
Hence, x = 2 cannot be optimal.
Slide 51:Test of Linearly Independent Vectors The vectors v1, v2, …, vn are linearly independent if
1 v1+ 2 v2 + …+ n vn 0
That is, the multiple sum of the vectors is not equal to zero for not all equal to zero.
A quick and easy test for linearly independent vectors is
Slide 52:Example Consider the optimization problem
We want to check the point x* = [1, 0]T.
Can we apply the KT necessity theorem?
We have to check first the constraint qualification.
At x*, only g1 and g3 are binding, therefore I = {1,3}.
g1(x*) = [0, -1]T, g3(x*) = [0, 1]T
Since [0, -1]T+[0, 1]T = [0, 0]T, then g1 and g2 are not linearly independent and hence the constraint qualification does not hold and we cannot use the KT necessary theorem.
Slide 53:Example Consider the optimization problem
Apply the KT necessity theorem to check the optimality of x = [0, 0]T.
Solution
We have one constraint, the constraint qualification holds.
For x = [0, 0]T and u1 = 2, the KT conditions are satisfied.
Therefore, we cannot make a definite decision on the optimality of x.
Slide 54:Kuhn-Tucker Sufficiency Theorem Consider the NLP problem
Let the objective function f(x) be convex, the inequality constraints gj(x) be all concave function for j = 1, …, J, and the equality constraints hk(x) for k =1, …, K be linear. If there exists a solution (x*, u*, *) that satisfies the Kuhn-Tucker conditions, then x* is an optimal solution to the NLP problem.
Slide 55:Example Consider the optimization problem
We want to prove the point x* = [1, 5]T is optimal.
f(x) is convex because Hf is positive semidefinite.
g1(x) is linear, hence it is both convex and concave.
g2(x) is concave because Hg2 is negative definite.
h1 is linear.
The conditions of Theorem 5.2 are satisfied.
Slide 56:The KT conditions are Substitute in the value of x*.
Slide 57:The KT conditions become For 1 = 1, u = [1.4, 0.2]. Hence, x* solves the KT problem.
Therefore, x* is an optimal solution.
Slide 58:SADDLE-POINT CONDITIONS In the previous discussion, the assumption was that the objective function and constraints are differentiable.
There are constrained optimality criteria for non-differentiable functions.
Slide 59:Definition
A function f(x,y) is said to have a saddle point at (x*, y*) if f(x*, y) ≤ f(x*, y*) ≤ f(x, y*) for all x and y.
Slide 60:The definition implies that x* minimizes f(x, y*) for all x and y* maximizes f(x*, y) for all y.
Consider the NLP problem
If x* is the optimal solution and u*≥ 0 is the corresponding optimal Lagrange multiplier, then (x*, u*) is a saddle point for the Lagrange function
Slide 61:Theorem 5.3 Sufficient Optimality Theorem If (x*, u*) is a saddle point of L(x, u), then x* is an optimal solution to the NLP problem.
Slide 62:Theorem 5.5 A solution (x*, u*) with u* ≥ 0 and x* S is a saddle point of L(x, u) if and only if the following conditions are satisfied:
x* minimizes L(x, u*) for x S
gj(x*) ≥ 0 for j = 1, …, J
ujgj(x*) = 0 for j = 1, …, J
Slide 63:Theorem 5.6 Second-Order Necessity Theorem Consider the NLP problem
The functions f, gj, and hk are twice-differentiable. Let the set of active constraints at x* be I. Assume that gj(x*) for jI and hk(x*) for k = 1, 2, …, K are linearly independent. Then, the necessary conditions that x* is a local minimum to the NLP are that SECOND-ORDER OPTIMALITY CONDITIONS
Slide 64:there exists (u*, v*) such that (x*, u*, v*) is a KT point,
for every nonzero vector y(N1) satisfying the following three equations
it follows that
Slide 65:Example 5.7 (Modified) Consider the optimization problem
Apply the Second order KT necessity theorem to check the optimality of x* = [0, 0]T.
Solution
We have one active constraint, the constraint qualification holds.
For x* = [0, 0]T and u1 = 2, the KT conditions are satisfied.
Find the Hessian of L:
Slide 66:For x* = [0, 0]T and u1 = 2,
We have to verify
for y satisfying
Verify the condition:
Hence, x* cannot be optimal.
Slide 67:Theorem 5.7 Second-Order Sufficiency Theorem Sufficient conditions that a point x* is a strict local minimum of the NLP
where the functions f, gj, and hk are twice-differentiable are that
there exists (u*, v*) such that (x*, u*, v*) is a KT point,
for every nonzero vector y(N1) satisfying the following four equations
Slide 68:it follows that