SINGLE VARIABLE OPTIMIZATION AND MULTI VARIABLE OPTIMIZATION
Chapter 1 SINGLE VARIABLE OPTIMIZATION
1.1 INTRODUCTION TO OPTIMIZATION Optimization is the act of obtaining the best result under given circumstances. The ultimate goal of all such decisions is either to minimize the cost or to maximize the desired benefit. Since the cost required or the benefit desired can be expressed as a function of certain decision variables, optimization can be defined as the process of finding the conditions that give the maximum or minimum value of a function.
It can be seen from Fig. 1.1 that if a point x * corresponds to the minimum value of function f (x ), the same point also corresponds to the maximum value of the negative of the function, -f(x ). Thus, without loss of generality, optimization can be taken to mean minimization.
There is no single method available for solving all optimization problems efficiently. Hence a number of optimization methods have been developed for solving different types of optimization problems. The optimum seeking methods are also known as mathematical programming techniques and are generally studied as a part of operations research. Operations research is a branch of mathematics concerned with the application of scientific methods and techniques to decision making problems and with establishing the best or optimal solutions. Table 1.1 lists various mathematical programming techniques together with other well-defined areas of operations research . The classification given in Table 1.1 is not unique; it is given mainly for convenience.
TABLE 1.1 Methods of Operations Research
Statistical methods: Regression analysis; Cluster analysis, pattern recognition; Design of experiments; Discriminant analysis (factor analysis); Statistical decision theory
Stochastic process techniques: Markov processes; Queueing theory; Renewal theory; Simulation methods; Reliability theory; Stochastic programming
Mathematical programming techniques: Calculus methods; Calculus of variations; Nonlinear programming; Geometric programming; Quadratic programming; Linear programming (LP); Special methods of LP; Integer programming; Dynamic programming; Separable programming; Multiobjective programming; Goal programming; Network methods: CPM-PERT; Game theory; Simulated annealing; Genetic algorithms; Neural networks; Inventory control
Methods of Operations Research: Mathematical programming techniques are useful in finding the minimum of a function of several variables under a prescribed set of constraints. Stochastic process techniques can be used to analyze problems described by a set of random variables having known probability distributions. Statistical methods enable one to analyze the experimental data and build empirical models to obtain the most accurate representation of the physical situation.
1.2 STATEMENT OF AN OPTIMIZATION PROBLEM An optimization or a mathematical programming problem can be stated as follows:
Find X = (x1, x2, ..., xn)^T which minimizes f(X)
subject to the constraints
gj(X) ≤ 0, j = 1, 2, ..., m
lj(X) = 0, j = 1, 2, ..., p …. (1.1)
where X is an n-dimensional vector of decision variables, f(X) is termed the objective function, and gj(X) and lj(X) are known as inequality and equality constraints, respectively. The number of variables n and the number of constraints m and/or p need not be related in any way.
Some optimization problems do not involve any constraints and can be stated as: Find X = (x1, x2, ..., xn)^T which minimizes f(X). Such problems are called unconstrained optimization problems.
1.2.1 Decision Variables Any system or component is defined by a set of quantities, some of which are viewed as variables during the design process. In general, certain quantities are fixed at the outset; these are called preassigned parameters. All the other quantities are treated as variables in the design process and are called decision variables xi, i = 1, 2, ..., n. The decision variables are collectively represented as a decision vector X = (x1, x2, ..., xn)^T.
1.2.2 Constraints In many practical problems, the decision variables cannot be chosen arbitrarily; rather , they have to satisfy certain specified functional and other requirements. The restrictions that must be satisfied to produce an acceptable design are collectively called constraints . Constraints that represent limitations on the behavior or performance of the system are termed behavior or functional constraints . Constraints that represent physical limitations on design variables such as availability, fabricability , and transportability are known as geometric or side constraints .
1.2.3 Constraint Surface For illustration, consider an optimization problem with only inequality constraints gj(X) ≤ 0. The set of values of X that satisfy the equation gj(X) = 0 forms a hypersurface in the design space and is called a constraint surface. Note that this is an (n - 1)-dimensional subspace, where n is the number of design variables. The constraint surface divides the design space into two regions: one in which gj(X) < 0 and the other in which gj(X) > 0. Thus the points lying on the hypersurface satisfy the constraint gj(X) critically, the points lying in the region where gj(X) > 0 are infeasible or unacceptable, and the points lying in the region where gj(X) < 0 are feasible or acceptable. The collection of all the constraint surfaces gj(X) = 0, j = 1, 2, ..., m, which separates the acceptable region, is called the composite constraint surface.
Figure 1.2 shows a hypothetical two-dimensional design space where the infeasible region is indicated by hatched lines. A design point that lies on one or more constraint surfaces is called a bound point, and the associated constraint is called an active constraint. Design points that do not lie on any constraint surface are known as free points. Depending on whether a particular design point belongs to the acceptable or unacceptable region, it can be identified as one of the following four types: 1. Free and acceptable point; 2. Free and unacceptable point; 3. Bound and acceptable point; 4. Bound and unacceptable point. All four types of points are shown in Fig. 1.2.
Figure 1.2: Constraint surfaces in a hypothetical two-dimensional design space.
1.2.4 Objective Function In general, there will be more than one acceptable design, and the purpose of optimization is to choose the best one. Thus a criterion has to be chosen for comparing the different alternatives. The criterion, when expressed as a function of the design variables, is known as the criterion, merit, or objective function. In engineering designs, the objective is usually taken as the minimization of cost, while in business applications the maximization of profit or efficiency is the obvious choice.
In some situations, there may be more than one criterion to be satisfied simultaneously. An optimization problem involving multiple objective functions is known as a multiobjective programming problem . With multiple objectives there arises a possibility of conflict, and one simple way to handle the problem is to construct an overall objective function as a linear combination of the conflicting multiple objective functions.
1.2.5 Objective Function Surfaces The locus of all points satisfying f(X) = c = constant forms a hypersurface in the design space, and to each value of c there corresponds a different member of a family of surfaces. These surfaces, called objective function surfaces, are shown in a hypothetical two-dimensional design space in Fig. 1.3. Once the objective function surfaces are drawn along with the constraint surfaces, the optimum point can be determined without much difficulty. The main problem, however, is that as the number of design variables exceeds two or three, the constraint and objective function surfaces become too complex even to visualize, and the problem has to be solved purely as a mathematical problem.
Figure 1.3: Contours of the objective function.
1.4 Nonlinear Programming Problem If any of the functions among the objective and constraint functions in Eq. (1.1) is nonlinear, the problem is called a nonlinear programming (NLP) problem. This is the most general programming problem and all other problems can be considered as special cases of the NLP problem.
1.5 CLASSICAL OPTIMIZATION THEORY The classical methods of optimization are useful in finding the optimum solution of continuous and differentiable functions. These methods are analytical and make use of the techniques of differential calculus in locating the optimum points. The methods may not be suitable for efficient numerical computations. Moreover, since some practical problems involve objective functions that are not continuous and/or differentiable, the classical optimization techniques have limited scope in practical applications. However, a study of the calculus methods of optimization forms a basis for developing most of the numerical techniques of optimization.
1.6 UNCONSTRAINED PROBLEMS
1.6.1 SINGLE-VARIABLE OPTIMIZATION WITH NO CONSTRAINTS A function of one variable f(x) is said to have a relative or local minimum at x = x* if f(x*) ≤ f(x* + h) for all sufficiently small positive and negative values of h. Similarly, a point x* is called a relative or local maximum if f(x*) ≥ f(x* + h) for all values of h sufficiently close to zero. A function f(x) is said to have a global or absolute minimum at x* if f(x*) ≤ f(x) for all x, and not just for all x close to x*, in the domain over which f(x) is defined. Similarly, a point x* will be a global maximum of f(x) if f(x*) ≥ f(x) for all x in the domain.
Figure 1.5 Relative and global minima. Figure 1.5 shows the difference between the relative (local) and global optimum points.
A single-variable optimization problem is one in which the value of x = x* is to be found in the interval [ a,b ] such that x* minimizes f ( x ). The following two theorems provide the necessary and sufficient conditions for the relative (local) minimum of a function of a single variable.
Theorem 1.1: Necessary Condition If a function f(x) is defined in the interval a ≤ x ≤ b and has a relative minimum at x = x*, where a < x* < b, and if the derivative df(x)/dx = f'(x) exists as a finite number at x = x*, then f'(x*) = 0.
(Notation: the symbol ∇, nabla, denotes the gradient.)
Notes: 1. This theorem can be proved even if x* is a relative maximum. 2. The theorem does not say what happens if a minimum or maximum occurs at a point x* where the derivative fails to exist. For example, in Figure 1.6 the derivative f'(x*) does not exist, and the theorem is not applicable.
Figure 1.6 Derivative undefined at x*.
3. The theorem does not say that the function necessarily will have a minimum or maximum at every point where the derivative is zero. For example, the derivative f'(x) = 0 at x = 0 for the function shown in Figure 1.7; however, this point is neither a minimum nor a maximum. In general, a point x* at which f'(x*) = 0 is called a stationary point; Figure 1.7 shows a stationary point that is an inflection point.
Figure 1.7 Stationary (inflection) point.
Theorem 1.2: Sufficient Condition Let f'(x*) = f''(x*) = ... = f^(n-1)(x*) = 0, but f^(n)(x*) ≠ 0. Then f(x*) is: (i) a minimum value of f(x) if f^(n)(x*) > 0 and n is even; (ii) a maximum value of f(x) if f^(n)(x*) < 0 and n is even; (iii) neither a maximum nor a minimum if n is odd. In this case the point x* is called a point of inflection (stationary point).
Example 1 Determine the maximum and minimum values of the function f(x) = 12x^5 - 45x^4 + 40x^3 + 5.
Solution Taking the first derivative of the function yields
f'(x) = 60x^4 - 180x^3 + 120x^2 = 60x^2(x^2 - 3x + 2) = 60x^2(x - 1)(x - 2),
which is zero at x = 0, x = 1, and x = 2. Next, take the second derivative:
f''(x) = 240x^3 - 540x^2 + 240x.
At x = 0: f''(0) = 0, so examine the third derivative, f'''(x) = 720x^2 - 1080x + 240; since f'''(0) = 240 ≠ 0 and n = 3 is odd, x = 0 is an inflection point.
At x = 1: f''(1) = -60 < 0, so x = 1 is a relative (local) maximum, with f(1) = 12.
At x = 2: f''(2) = 240 > 0, so x = 2 is a relative (local) minimum, with f(2) = -11.
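To make the higher-order derivative test concrete, here is a minimal sketch in Python (assuming SymPy is installed; the variable names are purely illustrative) that reproduces the classification of Example 1:

import sympy as sp

x = sp.symbols('x')
f = 12*x**5 - 45*x**4 + 40*x**3 + 5

# Stationary points: roots of f'(x) = 0
stationary = sp.solve(sp.diff(f, x), x)          # [0, 1, 2]

for xs in stationary:
    # Find the order n of the first non-vanishing derivative at the stationary point
    n = 2
    while sp.diff(f, x, n).subs(x, xs) == 0:
        n += 1
    d = sp.diff(f, x, n).subs(x, xs)
    if n % 2 == 1:
        kind = 'inflection point'
    elif d > 0:
        kind = 'relative minimum'
    else:
        kind = 'relative maximum'
    print(xs, kind, 'f =', f.subs(x, xs))
# Expected: x=0 inflection point, x=1 relative maximum (f=12), x=2 relative minimum (f=-11)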
Second Order Equations For a second-order (quadratic) equation f(x) = ax^2 + bx + c = 0, the roots are x = (-b ± √(b^2 - 4ac)) / (2a).
Example 2 Determine the maximum and minimum values of the function f(x) = 10x^6 - 48x^5 + 15x^4 + 200x^3 - 120x^2 - 480x + 100.
Solution: f'(x) = 60x^5 - 240x^4 + 60x^3 + 600x^2 - 240x - 480 = 0, i.e. x^5 - 4x^4 + x^3 + 10x^2 - 4x - 8 = 0. Solving (for example, with a polynomial roots calculator) gives the real roots x = -1 and x = 2.
Then f'(x) = 0 at x = -1 and x = 2. Dividing f'(x) by 60 and differentiating the polynomial x^5 - 4x^4 + x^3 + 10x^2 - 4x - 8 (the constant factor 60 does not affect the signs used in the test):
f''(x)/60 = 5x^4 - 16x^3 + 3x^2 + 20x - 4, which is 0 at x = -1.
f'''(x)/60 = 20x^3 - 48x^2 + 6x + 20, which at x = -1 equals -20 - 48 - 6 + 20 = -54 ≠ 0.
The first non-vanishing derivative at x = -1 is of odd order (n = 3), so x = -1 is an inflection point.
At x = 2: f''(2)/60 = 0 and f'''(2)/60 = 160 - 192 + 12 + 20 = 0, so continue: f^(4)(x)/60 = 60x^2 - 96x + 6 and f^(4)(2)/60 = 240 - 192 + 6 = 54 > 0.
The first non-vanishing derivative at x = 2 is of even order (n = 4) and positive, so x = 2 is a relative minimum point, with f_min = f(2) = -396.
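The same kind of check can be applied to Example 2; a short sketch (again assuming SymPy) factors f'(x) and reports the order of the first non-vanishing derivative at each stationary point:

import sympy as sp

x = sp.symbols('x')
f = 10*x**6 - 48*x**5 + 15*x**4 + 200*x**3 - 120*x**2 - 480*x + 100

print(sp.factor(sp.diff(f, x)))      # 60*(x - 2)**3*(x + 1)**2, so x = -1 and x = 2

for xs in [-1, 2]:
    n = 2
    while sp.diff(f, x, n).subs(x, xs) == 0:
        n += 1
    print(xs, 'order n =', n, 'value =', sp.diff(f, x, n).subs(x, xs), 'f =', f.subs(x, xs))
# x = -1: n = 3 (odd)       -> inflection point
# x =  2: n = 4, value > 0  -> relative minimum, f(2) = -396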
1.6.2 MULTIVARIABLE OPTIMIZATION WITH NO CONSTRAINTS
Theorem 1.3: Necessary Condition If f(X) has an extreme point (maximum or minimum) at X = X* and if the first partial derivatives of f(X) exist at X*, then
∂f/∂x1 (X*) = ∂f/∂x2 (X*) = ... = ∂f/∂xn (X*) = 0.
Theorem 1.4: Sufficient Condition A sufficient condition for a stationary point X* to be an extreme point is that the matrix of second partial derivatives (Hessian matrix) of f(X) evaluated at X* is (i) positive definite when X* is a relative minimum point, and (ii) negative definite when X* is a relative maximum point.
Hessian matrix The Hessian matrix (or simply the Hessian) is the square matrix of second-order partial derivatives of a function. It was developed in the 19th century by the German mathematician Ludwig Otto Hesse and later named after him. Given a real-valued function f(x1, x2, ..., xn), if all second partial derivatives of f exist, then the Hessian matrix of f is the n × n matrix H(f) whose (i, j) entry is ∂²f/∂xi ∂xj, where x = (x1, x2, ..., xn) and Di is the differentiation operator with respect to the ith argument, so that H(f)ij = Di Dj f.
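As a rough illustration (assuming SymPy; the two-variable function used here is made up for the example and is not from the slides), the gradient and Hessian can be built directly from the definition above:

import sympy as sp

x1, x2 = sp.symbols('x1 x2')
f = x1**2 + 3*x1*x2 + 2*x2**2          # hypothetical example function

X = (x1, x2)
grad = [sp.diff(f, v) for v in X]                            # first partial derivatives
H = sp.Matrix([[sp.diff(f, a, b) for b in X] for a in X])    # H[i, j] = d^2 f / (dx_i dx_j)

print(grad)    # [2*x1 + 3*x2, 3*x1 + 4*x2]
print(H)       # Matrix([[2, 3], [3, 4]])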
A test that can be used to check the positive definiteness of a matrix A of order n involves evaluating the determinants of its leading principal minors: A1 = a11, A2 = det of the upper-left 2 × 2 submatrix, A3 = det of the upper-left 3 × 3 submatrix, ..., An = det A. The matrix A will be positive definite if and only if all the values A1, A2, A3, ..., An are positive. If some of the Aj are positive and the remaining Aj are zero, the matrix A will be positive semidefinite. The matrix A will be negative definite if and only if the sign of Aj is (-1)^j for j = 1, 2, ..., n (i.e., A1 is negative, A2 is positive, A3 is negative, and so on).
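A small sketch of this minor-based test follows (assuming NumPy; the helper name classify is hypothetical). Borderline cases are only flagged for further inspection rather than decided here:

import numpy as np

def classify(A, tol=1e-12):
    """Classify a symmetric matrix A via its leading principal minors A_1, ..., A_n."""
    n = A.shape[0]
    minors = [np.linalg.det(A[:j, :j]) for j in range(1, n + 1)]
    if all(m > tol for m in minors):
        return 'positive definite'
    if all((-1) ** j * m > tol for j, m in enumerate(minors, start=1)):
        return 'negative definite'
    return 'not definite (semidefinite or indefinite; further tests needed)'

print(classify(np.array([[2.0, 0.0], [0.0, 3.0]])))    # positive definite
print(classify(np.array([[-2.0, 0.0], [0.0, -3.0]])))  # negative definite
print(classify(np.array([[2.0, 0.0], [0.0, -2.0]])))   # not definite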
Depending on the second-derivative matrix Hf(a, b), the graph of f(x, y) might look like an elliptic paraboloid pointing upward, centered at the point (a, b) (shown by the blue dot in Figure 1.8). In this case, we say that Hf(a, b) is positive definite, and f has a local minimum at (a, b). Figure 1.8
Alternatively, the graph of f(x, y) might look like an elliptic paraboloid pointing downward, centered at the point (a, b) (shown by the red dot in Figure 1.9). In this case, we say that Hf(a, b) is negative definite, and f has a local maximum at (a, b). Figure 1.9
Saddle Point In the case of a function of two variables, f(x, y), the Hessian matrix may be neither positive definite nor negative definite at a point (x*, y*) at which ∂f/∂x = ∂f/∂y = 0. In such a case, the point (x*, y*) is called a saddle point. The characteristic of a saddle point is that it corresponds to a relative minimum or maximum of f(x, y) with respect to one variable, say x (the other variable being fixed at y = y*), and a relative maximum or minimum of f(x, y) with respect to the second variable, y (the other variable being fixed at x = x*).
Figure 1.10
f(x, y*) = f(x, 0) has a relative minimum and f(x*, y) = f(0, y) has a relative maximum at the saddle point (x*, y*). Figure 1.11: Saddle point of the function f(x, y) = x^2 - y^2.
Example 1 Consider the function f(x, y) = x^2 - y^2. The first partial derivatives, ∂f/∂x = 2x and ∂f/∂y = -2y, are zero at x* = 0 and y* = 0. The Hessian matrix of f is
H = | 2   0 |
    | 0  -2 |
so the Hessian at (x*, y*) = (0, 0) is this same constant matrix. The determinant H1 = 2 (positive) and the determinant H2 = (2)(-2) - (0)(0) = -4 (negative), so H is indefinite. Since the matrix is neither positive definite nor negative definite, the point (x* = 0, y* = 0) is a saddle point.
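A brief sketch (assuming SymPy) that reproduces this example by locating the stationary point and evaluating the leading minors of the Hessian:

import sympy as sp

x, y = sp.symbols('x y')
f = x**2 - y**2

stationary = sp.solve([sp.diff(f, x), sp.diff(f, y)], [x, y])    # {x: 0, y: 0}
H = sp.Matrix([[sp.diff(f, a, b) for b in (x, y)] for a in (x, y)])

H1 = H[0, 0]          # 2   (positive)
H2 = H.det()          # -4  (negative) -> H indefinite -> (0, 0) is a saddle point
print(stationary, H1, H2)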
Example 2 Find the critical points of the function: SOLUTION : The necessary conditions for the existence of an extreme point are: From (1) x 1 = 0 or (- ), and from (2) x 2 = 0 or ( - . Then these equations are satisfied at the points: (0,0), (0 ,- ), (- , 0), and ( - , - ) …. (1) … (2)
To find the nature of these extreme points, we have to use the sufficiency conditions. The second-order partial derivatives of f are given by: The Hessian matrix of f is given by: H
H
1.7 CONSTRAINED PROBLEMS
1.7.1 MULTIVARIABLE OPTIMIZATION WITH EQUALITY CONSTRAINTS We consider the optimization of continuous functions subject to equality constraints:
Minimize f(X) subject to gj(X) = 0, j = 1, 2, ..., m,
where X = (x1, x2, ..., xn)^T. Here m is less than or equal to n; otherwise (if m > n), the problem becomes overdefined and, in general, there will be no solution. Several methods are available for the solution of this problem: constrained variation, the Jacobian method, direct substitution, and Lagrange multipliers.
1.7.1.1 Method of Direct Substitution For a problem with n variables and m equality constraints, it is theoretically possible to solve the m equality constraints simultaneously and express any set of m variables in terms of the remaining n - m variables. When these expressions are substituted into the original objective function, the result is a new objective function involving only n - m variables. The new objective function is not subject to any constraint, and hence its optimum can be found by using unconstrained optimization techniques.
Example 1 Either x1 or x2 can be eliminated without difficulty. Solving the constraint for x1 gives relation (1). Substituting this expression for x1 into the objective function yields a new, equivalent objective function in terms of the single variable x2. The constraint in the original problem has now been eliminated, and f(x2) is an unconstrained function with one independent variable.
We can now minimize the new objective function by setting the first derivative of f equal to zero and solving for the optimal value of x2. Since f''(x2*) = 28 (positive), x2* is a local minimum. Once x2* is obtained, x1* can be obtained directly via relation (1).
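Since the objective function of this example is not reproduced on the slide, the following sketch (assuming SymPy) applies the same direct-substitution steps to a made-up problem, minimize x1^2 + 2*x2^2 subject to x1 + x2 - 3 = 0, purely as an illustration:

import sympy as sp

x1, x2 = sp.symbols('x1 x2')
f = x1**2 + 2*x2**2                       # hypothetical objective
constraint = sp.Eq(x1 + x2 - 3, 0)        # hypothetical equality constraint

x1_expr = sp.solve(constraint, x1)[0]     # x1 = 3 - x2  (the "relation (1)" step)
f_reduced = f.subs(x1, x1_expr)           # unconstrained function of x2 only

x2_star = sp.solve(sp.diff(f_reduced, x2), x2)[0]
assert sp.diff(f_reduced, x2, 2).subs(x2, x2_star) > 0    # second derivative positive -> minimum
x1_star = x1_expr.subs(x2, x2_star)
print(x1_star, x2_star, f.subs({x1: x1_star, x2: x2_star}))   # 2, 1, 6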
Solving Using Excel
F4: =SUMPRODUCT(B4:C4;B3:C3)
F6: =SUMPRODUCT(D6:E6;D3:E3)
Problem:
Example 2 The profit analysis model: maximize the profit z = vp - cf - cv·v …….….. (1)
The demand is represented by: v = 1,500 - 24.6p ………... (2)
where v = volume (quantity), p = price, cf = fixed cost = $10,000, and cv = variable cost = $8 per unit.
Substituting the values of cf and cv into (1): z = vp - 10,000 - 8v …..…… (3)
Substituting (2) into (3): z = 1,500p - 24.6p^2 - 10,000 - 8(1,500 - 24.6p), i.e. z = 1,696.8p - 24.6p^2 - 22,000 …..……. (4)
Setting dz/dp = 1,696.8 - 49.2p = 0 for the critical point gives p* = 34.49.
d²z/dp² = -49.2 (negative), so p* is a local maximum.
Substituting into (2): v* = 1,500 - 24.6(34.49) = 651.55.
Substituting into (3): z_max = (651.55)(34.49) - 10,000 - 8(651.55) = 7,259.56.
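The arithmetic above can be checked with a short sketch (assuming SymPy). Note that the exact optimum is p* ≈ 34.488; the slide's 651.55 and 7,259.56 come from rounding p* to 34.49 before substituting:

import sympy as sp

p = sp.symbols('p')
v = 1500 - 24.6*p                 # demand (2)
z = v*p - 10000 - 8*v             # profit (3): z = v*p - c_f - c_v*v

p_star = sp.solve(sp.diff(z, p), p)[0]     # from 1696.8 - 49.2*p = 0
assert sp.diff(z, p, 2) < 0                # -49.2 -> local maximum
v_star = v.subs(p, p_star)
print(float(p_star), float(v_star), float(z.subs(p, p_star)))
# approximately 34.488, 651.6, 7259.45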
1.7.1.2 Lagrange Method The basic features of the Lagrange multiplier method are presented first for a simple problem of two variables with one constraint. The extension of the method to a general problem of n variables with m constraints is given later.
Problem with Two Variables and One Constraint. Consider the problem:
Minimize f(x1, x2)
subject to g(x1, x2) = 0.
Define the Lagrange function L(x1, x2, λ) = f(x1, x2) + λ g(x1, x2), where λ is called the Lagrange multiplier. L is treated as a function of the three variables x1, x2, and λ.
Necessary conditions for an extremum:
∂L/∂x1 = ∂f/∂x1 + λ ∂g/∂x1 = 0
∂L/∂x2 = ∂f/∂x2 + λ ∂g/∂x2 = 0
∂L/∂λ = g(x1, x2) = 0
Theorem: Sufficient Condition A sufficient condition for f(X) to have a relative minimum at X* is that the quadratic form Q, defined by
Q = Σi Σj (∂²L/∂xi ∂xj) dxi dxj,
evaluated at X = X*, be positive definite for all values of dX for which the constraints are satisfied. If Q is negative definite for all choices of the admissible variations dX, X* will be a constrained maximum of f(X). It has been shown by Hancock that a necessary condition for the quadratic form Q to be positive (negative) definite for all admissible variations dX is that each root of the polynomial in z, defined by the determinantal equation given below for the general case, be positive (negative).
Example Find the solution of the following problem using the Lagrange multiplier method:
f(x, y) = x^-1 y^-2
subject to: g(x, y) = x^2 + y^2 - 4 = 0
The Lagrange function is L(x, y, λ) = f(x, y) + λ g(x, y) = x^-1 y^-2 + λ(x^2 + y^2 - 4).
The necessary conditions for the extremum of f(x, y) give:
∂L/∂x = -x^-2 y^-2 + 2λx = 0 ……. (1)
∂L/∂y = -2x^-1 y^-3 + 2λy = 0 ……. (2)
∂L/∂λ = x^2 + y^2 - 4 = 0 ……. (3)
From (1): λ = (1/2) x^-3 y^-2 ……… (4)
From (2): λ = x^-1 y^-4 ……… (5)
Equating (4) and (5): (1/2) x^-3 y^-2 = x^-1 y^-4
Multiplying (1/2) x^-3 y^-2 = x^-1 y^-4 through by x^3 y^4 gives y^2/2 = x^2, i.e.
y^2 = 2x^2 ………… (6)
Substituting (6) into (3): x^2 + 2x^2 = 4, so x^2 = 4/3 and x* = 2/√3.
From (6): y^2 = 8/3, so y* = 2√2/√3.
From (5): λ* = (x*)^-1 (y*)^-4 = (√3/2)(9/64) = 9√3/128.
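A short sketch (assuming SymPy) that solves the same stationarity system and confirms the values of x*, y*, and λ*:

import sympy as sp

x, y, lam = sp.symbols('x y lambda', positive=True)
f = 1/(x*y**2)                     # f(x, y) = x^-1 y^-2
g = x**2 + y**2 - 4
L = f + lam*g

sol = sp.solve([sp.diff(L, x), sp.diff(L, y), sp.diff(L, lam)], [x, y, lam], dict=True)
print(sol)
# x* = 2/sqrt(3) ~ 1.155, y* = 2*sqrt(2)/sqrt(3) ~ 1.633, lambda* = 9*sqrt(3)/128 ~ 0.122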
Necessary Conditions for a General Problem The equations can be extended to the case of a general problem with n variables and m equality constraints:
Minimize f(X) = f(x1, x2, ..., xn)
subject to: gj(X) = 0, j = 1, 2, ..., m.
The Lagrange function L is defined by introducing one Lagrange multiplier λj for each constraint gj(X) as
L(x1, ..., xn, λ1, ..., λm) = f(X) + λ1 g1(X) + λ2 g2(X) + ... + λm gm(X).
The necessary conditions for the extremum of L are given by:
∂L/∂xi = ∂f/∂xi + Σj λj ∂gj/∂xi = 0, i = 1, 2, ..., n
∂L/∂λj = gj(X) = 0, j = 1, 2, ..., m
Sufficient Condition A sufficient condition for f(X) to have a relative minimum at X* is that the quadratic form Q, defined by
Q = Σi Σj (∂²L/∂xi ∂xj) dxi dxj,
evaluated at X = X*, be positive definite for all values of the variations dX for which the constraints are satisfied. If Q is negative definite for all choices of the admissible variations dxi, X* will be a constrained maximum of f(X). It has been shown by Hancock that a necessary condition for the quadratic form Q to be positive (negative) definite for all admissible variations dX is that each root zi of the polynomial defined by the following determinantal equation be positive (negative):
| L11 - z   L12      ...  L1n       g11   g21  ...  gm1 |
| L21       L22 - z  ...  L2n       g12   g22  ...  gm2 |
| ...                                                   |
| Ln1       Ln2      ...  Lnn - z   g1n   g2n  ...  gmn |
| g11       g12      ...  g1n       0     0    ...  0   |
| g21       g22      ...  g2n       0     0    ...  0   |
| ...                                                   |
| gm1       gm2      ...  gmn       0     0    ...  0   | = 0
where Lij = ∂²L/∂xi ∂xj evaluated at (X*, λ*) and gij = ∂gi/∂xj evaluated at X*. This equation, on expansion, leads to an (n - m)th-order polynomial in z. If some of the roots of this polynomial are positive while the others are negative, the point X* is not an extreme point.
1.7.2 MULTIVARIABLE OPTIMIZATION WITH INEQUALITY CONSTRAINTS Consider the following problem: Minimize f(X) subject to gj(X) ≤ 0, j = 1, 2, ..., m. Kuhn-Tucker Conditions The Kuhn-Tucker conditions are the conditions that must be satisfied at a constrained minimum point X*. These conditions are, in general, not sufficient to ensure a relative minimum. However, there is a class of problems, called convex programming problems, for which the Kuhn-Tucker conditions are necessary and sufficient for a global minimum.
Kuhn-Tucker Conditions For the problem above (minimize f(X) subject to gj(X) ≤ 0, j = 1, 2, ..., m), the Kuhn-Tucker conditions can be stated as follows:
∂f/∂xi + Σj λj ∂gj/∂xi = 0, i = 1, 2, ..., n
λj gj(X) = 0, j = 1, 2, ..., m
gj(X) ≤ 0, j = 1, 2, ..., m
λj ≥ 0, j = 1, 2, ..., m
Note that if the problem is one of maximization or if the constraints are of the type gj ≥ 0, the λj have to be nonpositive. On the other hand, if the problem is one of maximization with constraints in the form gj ≥ 0, the λj have to be nonnegative.
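As an illustration of the Kuhn-Tucker conditions, the following sketch (assuming SymPy; the quadratic objective and linear constraint are made up for the example) solves the stationarity and complementary-slackness equations for a small convex problem in which the constraint turns out to be active at the optimum:

import sympy as sp

x1, x2, lam = sp.symbols('x1 x2 lambda')
f = (x1 - 2)**2 + (x2 - 1)**2        # hypothetical convex objective
g = x1 + x2 - 2                      # hypothetical constraint, g <= 0

# Stationarity plus complementary slackness (here the constraint is assumed active: g = 0,
# since the unconstrained minimum (2, 1) violates g <= 0)
eqs = [sp.diff(f, x1) + lam*sp.diff(g, x1),
       sp.diff(f, x2) + lam*sp.diff(g, x2),
       g]
sol = sp.solve(eqs, [x1, x2, lam], dict=True)[0]
print(sol)                            # {x1: 3/2, x2: 1/2, lambda: 1}
assert sol[lam] >= 0 and sol[x1] + sol[x2] - 2 <= 0   # Kuhn-Tucker satisfied -> global minimum (convex problem)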