Ekeeda - Electrical and Electronics Engineering - Functions of Several Variables


Presentation Description

Electrical and Electronics Engineering is a branch of engineering that deals with the applications of electricity, electronics, and electromagnetism. The course leads to career paths such as multimedia programmer, technical sales engineer, and project manager. Ekeeda offers online Electrical and Electronics Engineering courses for all subjects as per the syllabus.


Presentation Transcript

slide 1:

Functions of several variables

Definition. In the previous chapter we studied paths, which are functions R → Rⁿ. We saw that a path in Rⁿ can be represented by a vector of n real-valued functions. In this chapter we consider functions Rⁿ → R, i.e. functions whose input is an ordered set of n numbers and whose output is a single real number. In the next chapter we will generalize both topics and consider functions that take a vector with n components and return a vector with m components.

Example 9.1 Consider the function f: R² → R defined by f(x, y) = x² + y sin x. We may as well write f: (s, t) ↦ s² + t sin s, but since the number of arguments can be larger, it is more systematic to use the vector notation
    f(x) = f(x₁, x₂) = x₁² + x₂ sin x₁.

[Image: Joseph-Louis Lagrange (1736–1813)]

slide 2:

Example 9.2 As in the case n = 1, the domain of a function may be a subset of Rⁿ. For example, let
    D = {(x, y, z) ∈ R³ : x² + y² + z² < 1} ⊂ R³,
or in vector notation D = {x ∈ R³ : |x| < 1} ⊂ R³. Define g: D → R by
    g(x) = ln(1 − |x|²),   or   g(x, y, z) = ln(1 − x² − y² − z²).
Then, for example, (1/2, 1/2, 1/2) ∈ D and g(1/2, 1/2, 1/2) = ln(1/4).

Comment 9.1 It is very common to denote the arguments of a bi-variate function by (x, y) and of a tri-variate function by (x, y, z). We will often do so, but remember that there is nothing special about these choices.

Example 9.3 (An important example) Recall that if V and W are vector spaces, we denote by L(V, W) the space of linear functions V → W. Let a ∈ Rⁿ. Then the function f: Rⁿ → R defined by f(x) = a · x is linear, i.e. belongs to L(Rⁿ, R), in the sense that for every x, y ∈ Rⁿ and α, β ∈ R,
    f(αx + βy) = α f(x) + β f(y).
For example, taking n = 3 and a = (1, 2, 3),
    f(x, y, z) = (1, 2, 3) · (x, y, z) = x + 2y + 3z
is a linear function. In fact, all linear functions in L(Rⁿ, R) are of this form: they are characterized by a vector a that scalar-multiplies their argument.

Example 9.4 The function f(x, y) = tan⁻¹(y/x) is defined wherever x ≠ 0, i.e. its maximal domain is {(x, y) ∈ R² : x ≠ 0}.

slide 3:

Example 9.5 The maximal domain of the function f(x, y) = ln(x − y) is {(x, y) ∈ R² : x > y}.

Example 9.6 The maximal domain of the function f(x, y) = ln|x − y| is {(x, y) ∈ R² : x ≠ y}.

Example 9.7 The maximal domain of the function f(x, y) = cos⁻¹(y) + x is {(x, y) ∈ R² : −1 ≤ y ≤ 1}.

Example 9.8 The maximal domain of the function f(x, y) = ln(sin(x² + y²)) is
    ⋃_{n=0}^{∞} {(x, y) ∈ R² : 2nπ < x² + y² < (2n + 1)π}.

The graph of a function Rⁿ → R

Definition. Recall that for every two sets A and B, the graph Graph(f) of a function f: A → B is a subset of the Cartesian product A × B with the condition that for every a ∈ A there exists a unique b ∈ B such that (a, b) ∈ Graph(f). The value returned by f, f(a), is the unique b ∈ B that pairs with a in the graph set. In other words,
    Graph(f) = {(a, b) ∈ A × B : b = f(a)}.
Thus, if D ⊂ Rⁿ and f: D → R, the graph of f is a subset of D × R ⊂ Rⁿ × R ≅ Rⁿ⁺¹:
    Graph(f) = {(x₁, x₂, ..., x_n, z) : (x₁, ..., x_n) ∈ D, z = f(x₁, ..., x_n)}.
For the particular case of n = 2,
    Graph(f) = {(x, y, z) : (x, y) ∈ D, z = f(x, y)},
which is a surface in R³.

slide 4:

Example 9.9 The graph of the function f(x, y) = x² + y², whose domain of definition is the whole plane, is a paraboloid:
    Graph(f) = {(x, y, z) ∈ R³ : z = x² + y²}.

Example 9.10 The graph of any function of the form f(x, y) = h(x² + y²), where h: R → R, is a surface of revolution.

Slices of graphs. Consider a function f: R² → R whose graph is
    Graph(f) = {(x, y, z) : z = f(x, y)} ⊂ R³.
We obtain slices of the graph by intersecting it with planes. For example, the intersection of this graph with the plane "x = x₀", {(x, y, z) ∈ R³ : x = x₀}, is
    {(x₀, y, z) : z = f(x₀, y)},
which is the graph of a function of one variable (z as a function of y, for fixed x = x₀). Similarly, the intersection of the graph of f with the plane y = y₀ is
    {(x, y₀, z) : z = f(x, y₀)},
which is also the graph of a function of one variable (z as a function of x, for fixed y = y₀). Thus, the fact that f is a function implies that both y ↦ f(x₀, y) and x ↦ f(x, y₀) are functions of one variable. Finally, the intersection of the graph of f with the plane "z = z₀", {(x, y, z) ∈ R³ : z = z₀}, is the set
    {(x, y, z₀) : z₀ = f(x, y)}.
This set is called a contour line of f. It is a subset of space parallel to the xy-plane; it could be empty, it could be a closed curve, or a more complicated domain (even the whole of R²). The important observation is that, in general, a contour line is not the graph of a function.

slide 5:

Example 9.11 Consider the function f: R² → R, f(x, y) = x² + y². Its graph is
    Graph(f) = {(x, y, z) : z = x² + y²};
it is a paraboloid.

[Figure: graph of f(x, y) = x² + y²]

The intersection of its graph with the plane x = a is
    {(a, y, z) : z = a² + y²},
which is a parabola on a plane parallel to the yz-plane. The intersection of its graph with the plane z = R is
    {(x, y, R) : R = x² + y²},
which is empty for R < 0, a point for R = 0, and a circle in a plane parallel to the xy-plane for R > 0.

Example 9.12 Consider a linear function f: R² → R, f(x, y) = ax + by. Its graph is a plane in R³:
    Graph(f) = {(x, y, z) : z = ax + by}.
All its slices are lines, e.g.
    {(x, y, z₀) : z₀ = ax + by}.

slide 6:

Continuity

Let f: D → R, where D ⊂ Rⁿ. Loosely speaking, f is continuous at a point a = (a₁, ..., a_n) if small deviations of x = (x₁, ..., x_n) about a imply small changes in f. More formally, f is continuous at a if for every ε > 0 there exists a neighborhood of a such that for every x in that neighborhood, |f(x) − f(a)| < ε. All the functions that we will meet in this chapter are continuous in their domains of definition, but be aware that there are many non-continuous functions.

Example 9.13 The function f(x) = x₁² + x₂² is continuous at a = (1, 2), because for every ε > 0 we can find a neighborhood of (1, 2), say
    D = {x ∈ R² : (x₁ − 1)² + (x₂ − 2)² < δ},
such that for all x ∈ D, |f(x) − 5| < ε.

Directional derivatives

Consider a function f: R² → R in the vicinity of a point a = (a₁, a₂). As for functions on R, we often ask at what rate f changes when we slightly modify its argument. The difference is that here it has two arguments that can be modified independently. One possibility is to keep, say, the second argument x₂ fixed at a₂ and evaluate f as we vary the first argument x₁ from the value a₁. By fixing the value x₂ = a₂, we are in fact considering a function of one variable, x ↦ f(x, a₂). We could then ask about the rate of change of f as we vary x₁ near a₁ with x₂ = a₂ fixed:
    lim_{x₁→a₁} [f(x₁, a₂) − f(a₁, a₂)] / (x₁ − a₁).
If this limit exists, we call it the partial derivative of f along its first argument (or in the x-direction) at the point a. We may denote it by the following alternative notations:
    D₁f(a),   ∂f/∂x₁(a),   or   ∂f/∂x(a).

slide 7:

Similarly, if the limit
    lim_{x₂→a₂} [f(a₁, x₂) − f(a₁, a₂)] / (x₂ − a₂) = lim_{h→0} [f(a₁, a₂ + h) − f(a₁, a₂)] / h
exists, we call it the partial derivative of f along the second coordinate (or in the y-direction), and denote it by
    D₂f(a),   ∂f/∂x₂(a),   or   ∂f/∂y(a).
Equivalently,
    ∂f/∂x₁(a) = lim_{h→0} [f(a₁ + h, a₂) − f(a₁, a₂)] / h.
More generally, if f: Rⁿ → R and a ∈ Rⁿ then, assuming that the limit exists,
    D_j f(a) = ∂f/∂x_j(a) = lim_{h→0} [f(a + h ê_j) − f(a)] / h,
where ê_j is the j-th unit vector in Rⁿ.

Comment 9.2 The symbol ∂ for partial derivatives is due to the Marquis de Condorcet (1770), a French philosopher, mathematician and political scientist.

Partial derivatives quantify the rate at which a function changes when moving away from a point in very specific directions: along the unit vectors ê_j. More generally, we can evaluate the rate of change of a function along any unit vector v̂ in Rⁿ:
    D_v̂ f(a) = ∂f/∂v̂(a) = lim_{h→0} [f(a + h v̂) − f(a)] / h.
Such a derivative is called a directional derivative. It is the rate of change of f at a along the direction v̂. Note that D_v̂ f(a) is the derivative of the function t ↦ f(a + t v̂) at the point t = 0. Partial derivatives are particular cases of directional derivatives, with the choice v̂ = ê_j.

Example 9.14 Consider the function f: R² → R, f(x, y) = x²y.

slide 8:

[Figure: graph of f(x, y) = x²y]

The partial derivative of f in the x-direction at a point (x₀, y₀) is the derivative of the function x ↦ x²y₀ at x = x₀, namely
    D₁f(x₀, y₀) = ∂f/∂x(x₀, y₀) = 2x₀y₀.
The partial derivative of f in the y-direction at a point (x₀, y₀) is the derivative of the function y ↦ x₀²y at y = y₀, namely
    D₂f(x₀, y₀) = ∂f/∂y(x₀, y₀) = x₀².
Take now any unit vector v̂ = (cos θ, sin θ). The directional derivative of f in this direction at a point (x₀, y₀) is the derivative of the function
    t ↦ f(x₀ + t cos θ, y₀ + t sin θ) = (x₀ + t cos θ)²(y₀ + t sin θ)
at t = 0, i.e.
    D_v̂ f(x₀, y₀) = 2x₀y₀ cos θ + x₀² sin θ.
This means that
    D_v̂ f(x₀, y₀) = cos θ ∂f/∂x(x₀, y₀) + sin θ ∂f/∂y(x₀, y₀).
We will soon see that this is a general result: the derivative of f along any direction can be inferred from its partial derivatives (this is not a trivial statement).
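The identity above is easy to sanity-check numerically. The following is a minimal sketch in Python (NumPy assumed available); the test point, angle and step size h are arbitrary choices made for illustration only.

```python
import numpy as np

def f(x, y):
    return x**2 * y

def directional_derivative(f, x0, y0, theta, h=1e-6):
    # Central-difference approximation of d/dt f(x0 + t cos(theta), y0 + t sin(theta)) at t = 0.
    c, s = np.cos(theta), np.sin(theta)
    return (f(x0 + h*c, y0 + h*s) - f(x0 - h*c, y0 - h*s)) / (2*h)

x0, y0, theta = 1.5, -0.7, 0.3
numeric = directional_derivative(f, x0, y0, theta)
exact = np.cos(theta) * (2*x0*y0) + np.sin(theta) * x0**2   # cos(theta) f_x + sin(theta) f_y
print(numeric, exact)   # the two values agree to high precision
```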

slide 9:

Example 9.15 Consider the function f(x, y) = √(x² + y²), whose graph is a cone. Its partial derivatives at (x, y) are
    ∂f/∂x(x, y) = x / √(x² + y²)   and   ∂f/∂y(x, y) = y / √(x² + y²),
which holds everywhere except at the origin, where the function is continuous but not differentiable. Taking an arbitrary direction v̂ = (cos θ, sin θ),
    D_v̂ f(x, y) = F′(0),   where   F(t) = f(x + t cos θ, y + t sin θ) = √((x + t cos θ)² + (y + t sin θ)²).
Thus
    D_v̂ f(x, y) = (x cos θ + y sin θ) / √(x² + y²) = cos θ ∂f/∂x(x, y) + sin θ ∂f/∂y(x, y).

[Figure: graph of f(x, y) = √(x² + y²)]

9.5 Differentiability

We have so far defined directional derivatives of functions Rⁿ → R, but we haven't yet defined what a differentiable function is, nor have we defined the derivative of a multivariate function. Naively, one could define f: Rⁿ → R to be differentiable

slide 10:

if all its partial derivatives exist. This requirement turns out not to be sufficiently stringent.

The differentiability of a function Rⁿ → R generalizes the particular case of n = 1. For n = 1 there are several equivalent definitions of differentiability: f: R → R is differentiable at a if
    lim_{h→0} [f(a + h) − f(a)] / h
exists. Let's try to generalize this definition. Since f takes a vector for input, it would be differentiable at a if
    lim_{h→0} [f(a + h) − f(a)] / h
exists. The numerator is well-defined (the limit h → 0 means that every component of h tends to zero), but division by the vector h is not defined. The fraction whose limit we are trying to evaluate is
    [f(a₁ + h₁, ..., a_n + h_n) − f(a₁, ..., a_n)] / (h₁, ..., h_n),
and this is not a well-formed expression.

The differentiability of a function f: R → R can also be defined in an alternative way: f is differentiable at a if, when x is close to a, Δf = f(x) − f(a) is approximately a linear function of Δx = x − a. That is, there exists a number L such that f(x) − f(a) ≈ L(x − a), in the sense that
    f(x) − f(a) = L(x − a) + o(x − a),
i.e.
    lim_{x→a} [f(x) − f(a) − L(x − a)] / (x − a) = 0.
If this is the case, we call the number L the derivative of f at a and write f′(a) = L. This latter characterization of the derivative turns out to be the one we can generalize to multivariate functions.

Let f: Rⁿ → R. Recall that linear functions Rⁿ → R can be represented by scalar multiplication by a constant vector. Thus f is differentiable at a if, when x is close to a, i.e. when |x − a| is small, Δf = f(x) − f(a) is approximately a linear function of Δx = x − a.

Definition 9.1 f: Rⁿ → R is differentiable at a if there exists a vector L such that
    f(x) − f(a) = L · (x − a) + o(|x − a|).
In other words,
    lim_{x→a} [f(x) − f(a) − L · (x − a)] / |x − a| = 0.

slide 11:

We call the vector L the derivative of f at a, or the gradient of f at a, and denote
    Df(a) = L   or   ∇f(a) = L.

Comment 9.3 Note that the derivative of a multivariate function at a point is a vector. If we now turn the point a into a variable, the gradient ∇f is a function that takes a vector in Rⁿ (a point in the domain of f) and returns a vector in Rⁿ (the value of the gradient at that point): for f: Rⁿ → R, ∇f: Rⁿ → Rⁿ.

Comment 9.4 The limit x → a requires some clarification. It is required to exist regardless of how x approaches a: for any sequence x_k satisfying |x_k − a| → 0, we want the above ratio to tend to zero.

Comment 9.5 The fact that f(x) − f(a) is approximated by a linear function of x − a means that
    f(a + Δx) − f(a) = (∇f(a))₁ Δx₁ + ⋯ + (∇f(a))_n Δx_n + o(|Δx|).

The immediate question is what is the relation between the derivative (or gradient) of a multivariate function and its partial derivatives. The answer is the following:

Proposition 9.1 Suppose that f: Rⁿ → R is differentiable at a, with gradient ∇f(a). Then all its partial derivatives exist and
    ∂f/∂x_j(a) = (∇f(a))_j,   in other words,   ∇f(a) = (∂f/∂x₁(a), ..., ∂f/∂x_n(a)).

Proof. By definition,
    lim_{x→a} [f(x) − f(a) − ∇f(a) · (x − a)] / |x − a| = 0,
regardless of how x tends to a. Set now x = a + h ê_j and let h → 0. Since |x − a| = |h| → 0,

slide 12:

it follows that
    lim_{h→0} [f(a + h ê_j) − f(a) − h (∇f(a))_j] / h = 0.
By definition, this means that the j-th partial derivative of f at a exists and is equal to (∇f(a))_j.

Example 9.16 Consider the function f: R² → R,
    f(x, y) = √(1 + x) sin y.
Take a = (0, 0). Then
    ∂f/∂x(0, 0) = [sin y / (2√(1 + x))]|₍₀,₀₎ = 0   and   ∂f/∂y(0, 0) = [√(1 + x) cos y]|₍₀,₀₎ = 1.
By Proposition 9.1, if f is differentiable at (0, 0), then its gradient at (0, 0) is a vector whose entries are the partial derivatives of f at that point, ∇f(0, 0) = (0, 1). To determine whether f is differentiable at (0, 0) we have to check whether
    lim_{(x,y)→(0,0)} [f(x, y) − f(0, 0) − ∇f(0, 0) · (x, y)] / |(x, y)| = 0,
i.e. whether
    lim_{(x,y)→(0,0)} [√(1 + x) sin y − y] / √(x² + y²) = 0.
You can use, say, Taylor expansions to verify that this is indeed the case.

Comment 9.6 Most functions that you will encounter are differentiable. In particular, sums, products and compositions of differentiable functions are differentiable.

Example 9.17 Consider again the function r: Rⁿ → R, r(x) = |x|. Then
    ∇r(x) = (x₁/|x|, ..., x_n/|x|) = x̂.
Thus the gradient of the function that measures the length of a vector is the unit vector in the direction of that vector. We will discuss the meaning of the gradient more in depth below.

It remains to establish the relation between the gradient and directional derivatives.

slide 13:

Proposition 9.2 Suppose that f: Rⁿ → R is differentiable at a. Then for every unit vector v̂ the directional derivative of f at a exists and
    ∂f/∂v̂(a) = ∇f(a) · v̂.

Proof. By definition, since f is differentiable at a,
    lim_{x→a} [f(x) − f(a) − ∇f(a) · (x − a)] / |x − a| = 0.
Set now x = a + h v̂ and let h → 0. Then |x − a| = |h| → 0, hence
    lim_{h→0} [f(a + h v̂) − f(a) − h ∇f(a) · v̂] / h = 0.
This precisely means that the directional derivative of f at a exists and is equal to ∇f(a) · v̂.

There remains the following question: is it possible that a function has all its directional derivatives at a point and yet fails to be differentiable at that point? The following example shows that the answer is positive, proving that differentiability is more stringent than the mere existence of directional derivatives.

Example 9.18 Consider the function f(x, y) = (x²y)^(1/3). For every direction v̂ = (cos θ, sin θ),
    ∂f/∂v̂(0, 0) = lim_{h→0} [f(h cos θ, h sin θ) − f(0, 0)] / h = cos^(2/3) θ sin^(1/3) θ,
so that all the directional derivatives exist. In particular, setting θ = 0 and θ = π/2,
    ∂f/∂x(0, 0) = 0   and   ∂f/∂y(0, 0) = 0.
Is f differentiable at (0, 0)? If it were, then by Proposition 9.2,
    ∂f/∂v̂(0, 0) = cos θ ∂f/∂x(0, 0) + sin θ ∂f/∂y(0, 0) = 0,
which is not the case. Hence we deduce that f is not differentiable at (0, 0).
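The failure in Example 9.18 can be seen numerically: the limit defining the directional derivative converges to a nonzero value even though both partial derivatives at the origin vanish. A small illustrative sketch in Python (NumPy assumed); the angle θ = π/4 is an arbitrary choice.

```python
import numpy as np

def f(x, y):
    # f(x, y) = (x^2 y)^(1/3); np.cbrt handles negative arguments correctly
    return np.cbrt(x**2 * y)

theta = np.pi / 4
c, s = np.cos(theta), np.sin(theta)

# Directional derivative at the origin from the limit definition (f(0, 0) = 0):
for h in [1e-2, 1e-4, 1e-6]:
    print(f(h*c, h*s) / h)   # equals cos^(2/3)(theta) * sin^(1/3)(theta) ~ 0.707 for every h

# Proposition 9.2 would predict cos(theta)*f_x(0,0) + sin(theta)*f_y(0,0) = 0,
# so f cannot be differentiable at the origin.
print(c**(2/3) * s**(1/3))
```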

slide 14:

[Figure: graph of f(x, y) = (x²y)^(1/3)]

Comment 9.7 If the partial derivatives are continuous in a neighborhood of a point, then f is differentiable at that point.

Composition of multivariate functions with paths

Consider a function f: Rⁿ → R, representing, for example, temperature as a function of position. Let x: R → Rⁿ be a path in Rⁿ, representing, for example, the position of a fly as a function of time. If we compose these two functions, f ∘ x, we get a function R → R: the temperature measured by the fly as a function of time. Indeed, t ↦ x(t) ↦ f(x(t)) is a mapping R → Rⁿ → R. In component notation,
    (f ∘ x)(t) = f(x(t)) = f(x₁(t), x₂(t), ..., x_n(t))
(recall that a path can be identified with n real-valued functions).

Example 9.19 The trajectory of a fly in R² is x(t) = (t, t²) and the temperature as a function of position on the plane is f(x, y) = y sin x + cos x.

slide 15:

Then the temperature measured by the fly as a function of time is
    (f ∘ x)(t) = f(x(t)) = t² sin t + cos t.

The question is the following: suppose that we know the derivative of x at t (i.e. we know the velocity of the path) and we know the derivative of f at x(t) (i.e. we know how the temperature changes in response to small changes in position at the current location). Can we deduce the derivative of f ∘ x at t, i.e. can we deduce the rate of change of the temperature measured by the fly?

Let's try to guess the answer. For univariate functions the derivative of a composition is the product of the derivatives (the chain rule), so we would guess
    d(f ∘ x)/dt (t) = f′(x(t)) ẋ(t).
This expression is meaningless: the derivative of f is a vector and so is ẋ(t), whereas the left-hand side is a scalar. Hence an educated guess would be:

Proposition 9.3 — Chain rule. Let x: R → Rⁿ and f: Rⁿ → R be differentiable at t₀ and at x(t₀) respectively. Then f ∘ x is differentiable at t₀ and
    d(f ∘ x)/dt (t₀) = ∇f(x(t₀)) · ẋ(t₀).

Example 9.20 Let's examine the above example, in which
    ∇f(x, y) = (y cos x − sin x, sin x)   and   ẋ(t) = (1, 2t),
so that
    ∇f(x(t)) · ẋ(t) = (t² cos t − sin t, sin t) · (1, 2t) = t² cos t − sin t + 2t sin t.
On the other hand,
    d(f ∘ x)/dt (t) = d/dt (t² sin t + cos t) = t² cos t + 2t sin t − sin t,
i.e. Proposition 9.3 seems correct.

Proof. Since x is differentiable at t₀, it follows that
    x(t) = x(t₀) + ẋ(t₀)(t − t₀) + o(t − t₀),
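Example 9.20 can also be checked symbolically. A minimal sketch using SymPy (an assumption; any computer algebra system would do), comparing the two sides of the chain rule for the fly example:

```python
import sympy as sp

t, x, y = sp.symbols('t x y')
f = y * sp.sin(x) + sp.cos(x)          # temperature field
path = (t, t**2)                        # the fly's trajectory x(t) = (t, t^2)

# Left-hand side: differentiate the composition directly.
lhs = sp.diff(f.subs({x: path[0], y: path[1]}), t)

# Right-hand side: gradient of f evaluated along the path, dotted with the velocity.
grad = [sp.diff(f, x), sp.diff(f, y)]
vel = [sp.diff(c, t) for c in path]
rhs = sum(g.subs({x: path[0], y: path[1]}) * v for g, v in zip(grad, vel))

print(sp.simplify(lhs - rhs))           # 0, confirming Proposition 9.3 for this example
```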

slide 16:

which we may also write as
    Δx = ẋ(t₀) Δt + o(Δt).
Since f is differentiable at x(t₀), it follows that
    f(x(t)) = f(x(t₀)) + ∇f(x(t₀)) · Δx + o(|Δx|).
Putting things together,
    f(x(t)) = f(x(t₀)) + ∇f(x(t₀)) · ẋ(t₀) Δt + o(Δt),
which by definition implies the desired result.

Differentiability and implicit functions

Recall that a function x ↦ f(x) may be defined implicitly via a relation of the form
    G(x, f(x)) = 0,
where G: R² → R. Another way to state it is that f is defined via its graph,
    Graph(f) = {(x, y) : G(x, y) = 0}.
Suppose that (x₀, y₀) ∈ Graph(f), i.e. y₀ = f(x₀). Consider now the directional derivative of G along v̂ = (cos θ, sin θ) at (x₀, y₀):
    ∂G/∂v̂(x₀, y₀) = v̂ · ∇G(x₀, y₀) = cos θ ∂G/∂x(x₀, y₀) + sin θ ∂G/∂y(x₀, y₀).
The line tangent to the graph of f at (x₀, y₀) is along the direction in which the directional derivative of G vanishes, i.e.
    ∂G/∂v̂(x₀, y₀) = cos θ [∂G/∂x(x₀, y₀) + tan θ ∂G/∂y(x₀, y₀)] = 0.
Thus,
    f′(x₀) = tan θ = − ∂G/∂x(x₀, y₀) / ∂G/∂y(x₀, y₀).

Example 9.21 Let G(x, y) = x² + y² − 16 and (x₀, y₀) = (2, √12). Then
    ∂G/∂x(2, √12) = 4   and   ∂G/∂y(2, √12) = 2√12,
so that
    f′(2) = −4 / (2√12) = −1/√3.
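Example 9.21 is easy to reproduce with a computer algebra system. A small sketch using SymPy (assumed available); sp.idiff is SymPy's built-in implicit-differentiation helper, used here only as a cross-check of the formula f′ = −G_x/G_y:

```python
import sympy as sp

x, y = sp.symbols('x y')
G = x**2 + y**2 - 16

x0, y0 = 2, sp.sqrt(12)
fprime = -(sp.diff(G, x) / sp.diff(G, y)).subs({x: x0, y: y0})
print(sp.simplify(fprime))                       # -sqrt(3)/3, i.e. -1/sqrt(3)

# Cross-check with SymPy's implicit differentiation helper:
print(sp.idiff(G, y, x).subs({x: x0, y: y0}))    # same value
```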

slide 17:

Interpretation of the gradient

We saw that for every function f: Rⁿ → R and unit vector v̂,
    ∂f/∂v̂(a) = v̂ · ∇f(a).
∇f(a) is a vector in Rⁿ. What is its direction? What is its magnitude? If θ is the angle between the gradient of f at a and v̂, then
    ∂f/∂v̂(a) = |∇f(a)| cos θ.
The directional derivative is maximal when θ = 0, which means that the gradient points in the direction in which the function changes the fastest. The magnitude of the gradient is this maximal directional derivative. Likewise, ∂f/∂v̂(a) = 0 when ∇f(a) ⊥ v̂, i.e. ∇f(a) is perpendicular to the contour line of f at a.

Example 9.22 Consider once again the distance function f(x) = |x|. We saw that ∇f(x) = x̂, i.e. the gradient points along the radius vector (the direction in which the distance changes the fastest), and its magnitude is one, as
    ∂f/∂x̂(x) = lim_{h→0} [f(x + h x̂) − f(x)] / h = lim_{h→0} [|x + h x̂| − |x|] / h = 1.

Definition 9.2 Let f: Rⁿ → R. A point a ∈ Rⁿ at which ∇f(a) = 0 is called a stationary point (or a critical point).

What is the meaning of a point being stationary? That in every direction, the pointwise rate of change of the function is zero. As for univariate functions, stationary points of multivariate functions can be classified:

1. Local maxima: For every v̂ ∈ Rⁿ the function g(t) = f(x₀ + t v̂) has a local maximum at t = 0. Example: f(x, y) = −x² − y².

slide 18:

2. Local minima: For every v̂ ∈ Rⁿ the function g(t) = f(x₀ + t v̂) has a local minimum at t = 0. Example: f(x, y) = x² + y².

3. Saddle points: A stationary point that is neither a local minimum nor a local maximum. Typically, g(t) = f(x₀ + t v̂) may have a local minimum, a local maximum, or an inflection point at t = 0, depending on the direction v̂. Example: f(x, y) = x² − y².

[Figure: graph of f(x, y) = x² − y²]

Example 9.23 Consider the function
    f(x, y) = x³ + y³ − 3x − 3y.

[Figure: graph of f(x, y) = x³ + y³ − 3x − 3y]

slide 19:

Its gradient is
    ∇f(x, y) = (3x² − 3, 3y² − 3),
hence its stationary points are the four points (±1, ±1).

Take first the point a = (1, 1). For v̂ = (cos θ, sin θ),
    f(a + t v̂) = (1 + t cos θ)³ + (1 + t sin θ)³ − 3(1 + t cos θ) − 3(1 + t sin θ)
               = −4 + 3t² cos² θ + 3t² sin² θ + o(t²) = −4 + 3t² + o(t²),
hence in every direction v̂, f(a + t v̂) has a local minimum at t = 0. That is, a is a local minimum of f.

Take next the point b = (1, −1). For v̂ = (cos θ, sin θ),
    f(b + t v̂) = (1 + t cos θ)³ + (−1 + t sin θ)³ − 3(1 + t cos θ) − 3(−1 + t sin θ)
               = 3t² cos² θ − 3t² sin² θ + o(t²).
For θ = 0, f(b + t v̂) has a local minimum at t = 0, whereas for θ = π/2, f(b + t v̂) has a local maximum at t = 0. Thus b is a saddle point of f.

9.9 Higher derivatives and the multivariate Taylor theorem

Let f: R² → R. If f is differentiable in a certain domain, then it has partial derivatives ∂f/∂x and ∂f/∂y, which are both also functions R² → R (recall that the gradient is a function R² → R²). If the partial derivatives are differentiable, then they have their own partial derivatives, which we denote by
    ∂/∂x(∂f/∂x) = ∂²f/∂x²,   ∂/∂y(∂f/∂x) = ∂²f/∂x∂y,   ∂/∂x(∂f/∂y) = ∂²f/∂y∂x,   ∂/∂y(∂f/∂y) = ∂²f/∂y².
More generally, for functions f: Rⁿ → R we have an n-by-n matrix of second partial derivatives, called the Hessian:
    ∂²f/∂x_i∂x_j,   i, j = 1, ..., n.
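The stationary points of Example 9.23 can also be found and classified mechanically. A small SymPy sketch (assumed available), using the Hessian-based second-derivative test that is developed later in this chapter:

```python
import sympy as sp

x, y = sp.symbols('x y')
f = x**3 + y**3 - 3*x - 3*y

points = sp.solve([sp.diff(f, x), sp.diff(f, y)], [x, y], dict=True)  # the four points (+-1, +-1)

H = sp.hessian(f, (x, y))
for p in points:
    Hp = H.subs(p)
    det, fxx = Hp.det(), Hp[0, 0]
    kind = ('saddle' if det < 0 else
            'local min' if fxx > 0 else 'local max')
    print(p, kind)
# (1, 1): local min; (-1, -1): local max; (1, -1) and (-1, 1): saddle points
```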

slide 20:

Each such second partial derivative is a function Rⁿ → R, which may be differentiated along any direction. A function Rⁿ → R that can be differentiated infinitely many times along any combination of directions is called smooth.

Theorem 9.4 — Clairaut. If f: Rⁿ → R has continuous second partial derivatives, then
    ∂²f/∂x_i∂x_j = ∂²f/∂x_j∂x_i,
that is, the matrix of second derivatives is symmetric.

Comment 9.8 This is not a trivial statement. One has to show that
    lim_{h→0} [∂f/∂x_j(x + h ê_i) − ∂f/∂x_j(x)] / h = lim_{h→0} [∂f/∂x_i(x + h ê_j) − ∂f/∂x_i(x)] / h.

Example 9.24 Consider the function f(x, y) = yx³ + xy⁴ − 3x − 3y. Then
    ∂f/∂x = 3x²y + y⁴ − 3   and   ∂f/∂y = x³ + 4y³x − 3,
and
    ∂²f/∂x² = 6xy,   ∂²f/∂y∂x = 3x² + 4y³,   ∂²f/∂y² = 12y²x,   ∂²f/∂x∂y = 3x² + 4y³.
We may calculate higher derivatives; for example,
    ∂³f/∂x∂y∂x = 6x.

Suppose we know f and its derivatives at a point (x₀, y₀). What can we say about its values in the vicinity of that point, i.e. at a point (x₀ + Δx, y₀ + Δy)? It turns out that the concept of a Taylor polynomial can be generalized to multivariate functions.

slide 21:

Definition 9.3 Let f: R² → R be n-times differentiable at x₀ = (x₀, y₀). Then its Taylor polynomial of degree n about the point x₀ is
    P_n^f(x₀, x) = f(x₀) + ∂f/∂x(x₀)(x − x₀) + ∂f/∂y(x₀)(y − y₀)
        + (1/2)[∂²f/∂x²(x₀)(x − x₀)² + 2 ∂²f/∂x∂y(x₀)(x − x₀)(y − y₀) + ∂²f/∂y²(x₀)(y − y₀)²]
        + ⋯ + (1/n!) Σ_{k=0}^{n} C(n, k) ∂ⁿf/∂xᵏ∂yⁿ⁻ᵏ(x₀) (x − x₀)ᵏ (y − y₀)ⁿ⁻ᵏ,
where C(n, k) denotes the binomial coefficient.

Theorem 9.5 Let f: R² → R be n-times differentiable at x₀ = (x₀, y₀). Then
    lim_{x→x₀} [f(x) − P_n^f(x₀, x)] / |x − x₀|ⁿ = 0,
or, using the equivalent notation,
    f(x) = P_n^f(x₀, x) + o(|x − x₀|ⁿ).

Example 9.25 Calculate the Taylor polynomial of degree 3 of the function f(x, y) = x³ ln y + xy⁴. The derivatives are
    ∂f/∂x(x, y) = 3x² ln y + y⁴,   ∂f/∂y(x, y) = x³/y + 4xy³,
    ∂²f/∂x²(x, y) = 6x ln y,   ∂²f/∂x∂y(x, y) = 3x²/y + 4y³,   ∂²f/∂y²(x, y) = −x³/y² + 12xy²,
    ∂³f/∂x³(x, y) = 6 ln y,   ∂³f/∂x²∂y(x, y) = 6x/y,   ∂³f/∂x∂y²(x, y) = −3x²/y² + 12y²,   ∂³f/∂y³(x, y) = 2x³/y³ + 24xy.
At the point (1, 1),
    P_3^f((1, 1), (1 + Δx, 1 + Δy)) = 1 + Δx + 5Δy + (1/2)(14 Δx Δy + 11 Δy²) + (1/6)(18 Δx² Δy + 27 Δx Δy² + 26 Δy³).
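The coefficients in Example 9.25 can be reproduced symbolically by expanding f along a line through (1, 1). A small SymPy sketch (assumed available); the auxiliary variable t is introduced only for the expansion:

```python
import sympy as sp

x, y, t, dx, dy = sp.symbols('x y t dx dy')
f = x**3 * sp.ln(y) + x * y**4

# Restrict f to the line (1 + t*dx, 1 + t*dy) and expand in t up to third order;
# collecting powers of t reproduces the multivariate Taylor polynomial in (dx, dy).
g = f.subs({x: 1 + t*dx, y: 1 + t*dy})
P3 = sp.series(g, t, 0, 4).removeO().subs(t, 1)
print(sp.expand(P3))
# 1 + dx + 5*dy + 7*dx*dy + 11*dy**2/2 + 3*dx**2*dy + 9*dx*dy**2/2 + 13*dy**3/3
```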

slide 22:


slide 23:

Classification of stationary points

Taylor's theorem for functions of two variables can be used to classify stationary points. If a is a stationary point of f: R² → R and f is twice differentiable at that point, then by Taylor's theorem
    f(a + Δx) = f(a) + (1/2)[∂²f/∂x²(a) Δx² + 2 ∂²f/∂x∂y(a) Δx Δy + ∂²f/∂y²(a) Δy²] + o(|Δx|²).
Near a stationary point, f − f(a) is dominated by the quadratic terms (unless they vanish, in which case we have to look at higher-order terms). To shorten notation, let's write
    A = ∂²f/∂x²(a),   B = ∂²f/∂x∂y(a),   C = ∂²f/∂y²(a),
i.e.
    f(a + Δx) = f(a) + (1/2)[A Δx² + 2B Δx Δy + C Δy²] + o(|Δx|²).
Consider the quadratic term M(Δx) = A Δx² + 2B Δx Δy + C Δy²:
1. If it is positive for all Δx ≠ 0, then a is a local minimum.
2. If it is negative for all Δx ≠ 0, then a is a local maximum.
3. If it changes sign for different Δx, then a is a saddle point.
We now characterize under what conditions on A, B, C each case occurs:
1. M(Δx) > 0 for all Δx ≠ 0 only if A, C > 0, i.e. if ∂²f/∂x²(a) > 0 and ∂²f/∂y²(a) > 0. This is still not sufficient. Writing
    M(Δx) = (√A Δx + (B/√A) Δy)² + (C − B²/A) Δy²,
we obtain that also AC > B², i.e.
    ∂²f/∂x²(a) · ∂²f/∂y²(a) > (∂²f/∂x∂y(a))².

slide 24:

2. M(Δx) < 0 for all Δx ≠ 0 only if A, C < 0, i.e. if ∂²f/∂x²(a) < 0 and ∂²f/∂y²(a) < 0. This is still not sufficient. Writing
    −M(Δx) = −A Δx² − 2B Δx Δy − C Δy² = (√(−A) Δx − (B/√(−A)) Δy)² + (−C − B²/(−A)) Δy²,
we obtain once again that AC > B², i.e.
    ∂²f/∂x²(a) · ∂²f/∂y²(a) > (∂²f/∂x∂y(a))².
3. A saddle point occurs if
    ∂²f/∂x²(a) · ∂²f/∂y²(a) < (∂²f/∂x∂y(a))².
We can see this by setting Δx = 1, in which case M(1, Δy) = A + 2B Δy + C Δy², which changes sign if the discriminant 4B² − 4AC is positive.

To conclude:

    Type           | Conditions
    Local minimum  | ∂²f/∂x²(a) > 0,  ∂²f/∂y²(a) > 0,  ∂²f/∂x²(a) · ∂²f/∂y²(a) > (∂²f/∂x∂y(a))²
    Local maximum  | ∂²f/∂x²(a) < 0,  ∂²f/∂y²(a) < 0,  ∂²f/∂x²(a) · ∂²f/∂y²(a) > (∂²f/∂x∂y(a))²
    Saddle point   | ∂²f/∂x²(a) · ∂²f/∂y²(a) < (∂²f/∂x∂y(a))²

Example 9.26 Analyze the stationary points (±1/√3, 0) of the function f(x, y) = x³ − x − y².
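Example 9.26 can be worked through with the criterion in the table above. A small SymPy sketch (assumed available):

```python
import sympy as sp

x, y = sp.symbols('x y')
f = x**3 - x - y**2

pts = sp.solve([sp.diff(f, x), sp.diff(f, y)], [x, y], dict=True)  # (+-1/sqrt(3), 0)
for p in pts:
    A = sp.diff(f, x, 2).subs(p)   # f_xx
    B = sp.diff(f, x, y).subs(p)   # f_xy
    C = sp.diff(f, y, 2).subs(p)   # f_yy
    if A*C > B**2:
        kind = 'local min' if A > 0 else 'local max'
    else:
        kind = 'saddle'
    print(p, kind)
# At x = 1/sqrt(3): A = 2*sqrt(3) > 0 but C = -2 < 0, so AC < B^2 and the point is a saddle.
# At x = -1/sqrt(3): A < 0, C < 0 and AC > B^2 = 0, so the point is a local maximum.
```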

slide 25:

Definition 9.4 A domain D ⊂ Rⁿ is called bounded if there exists a number R such that |x| < R for all x ∈ D.

Comment 9.9 In other words, a domain is bounded if it can be enclosed in a ball.

Definition 9.5 A domain D ⊂ Rⁿ is called open if any point x ∈ D has a neighborhood {y ∈ Rⁿ : |y − x| < δ} contained in D. It is called closed if its complement is open, i.e. if any point x ∉ D has a neighborhood {y ∈ Rⁿ : |y − x| < δ} disjoint from D. This means that if x ∈ Rⁿ is a point with the property that every ball around x intersects D, then x ∈ D.

In this section we discuss extremal points of continuous functions defined on a bounded and closed domain D ⊂ Rⁿ. Such domains are called compact. Why are closedness and boundedness important? Recall that even a continuous univariate function may fail to have extrema if its domain is not closed or not bounded. As for univariate functions, we have the following result:

Theorem 9.6 Let f: D → R be continuous, with D ⊂ Rⁿ compact. Then f assumes in D both a minimum and a maximum, i.e. there exist a, b ∈ D such that
    f(b) ≤ f(x) ≤ f(a)   for all x ∈ D.
Furthermore, if f assumes a local extremum at an interior point of D and f is differentiable at that point, then its gradient at that point vanishes.

Example 9.27 Consider the function f(x, y) = x⁴ + y⁴ − 2x² + 4xy − 2y² in the domain D = {(x, y) : −2 ≤ x, y ≤ 2}.

slide 26:

[Figure: graph of f(x, y) = x⁴ + y⁴ − 2x² + 4xy − 2y²]

We start by looking for stationary points inside the domain. The gradient is
    ∇f(x, y) = (4x³ − 4x + 4y, 4y³ + 4x − 4y),
which vanishes when
    y = x − x³   and   x = y − y³.
Adding the two equations 4x³ − 4x + 4y = 0 and 4y³ + 4x − 4y = 0 gives x³ + y³ = 0, i.e. y = −x; substituting back gives x³ = 2x, i.e. x = 0 or x = ±√2. The stationary points are therefore
    a = (0, 0),   b = (√2, −√2)   and   c = (−√2, √2).
To determine the types of the stationary points we calculate the second derivatives,
    ∂²f/∂x²(x, y) = 12x² − 4,   ∂²f/∂x∂y(x, y) = 4,   ∂²f/∂y²(x, y) = 12y² − 4,
i.e.
    H(a) = ( −4  4 ; 4  −4 ),   H(b) = H(c) = ( 20  4 ; 4  20 ),
from which we deduce that b and c are local minima with f(b) = f(c) = −8. At a the determinant of the Hessian vanishes, so we look at higher-order behavior: along the line y = x the function behaves like 2x⁴ (a minimum in that direction), while along y = −x it behaves like 2x⁴ − 8x² (a maximum in that direction), hence a is a saddle point.

We now turn to the boundary, which comprises four segments. Since f is symmetric,
    f(x, y) = f(y, x) = f(−x, −y),

slide 27:

it suffices to check one segment, say the segment {−2} × [−2, 2], where f takes the form
    g(y) = 16 + y⁴ − 8 − 8y − 2y² = y⁴ − 2y² − 8y + 8.
This is a univariate function whose derivative is
    g′(y) = 4y³ − 4y − 8.
The derivative vanishes at y ≈ 1.5214. The second derivative is g″(y) = 12y² − 4; since it is positive at that point, the point is a local minimum of the restriction of f to the boundary. The value of the function at that point is approximately −3.4, which is larger than −8.

Finally, we need to check the values of f at the four corners:
    f(2, 2) = f(−2, −2) = 32   and   f(2, −2) = f(−2, 2) = 0.
To conclude, f assumes its minimal value −8 at (−√2, √2) and (√2, −√2), and its maximal value 32 at (2, 2) and (−2, −2).

Extrema in the presence of constraints

Problem statement. The general problem: let f: Rⁿ → R and let also g₁, g₂, ..., g_m: Rⁿ → R. We are looking for a point x ∈ Rⁿ that minimizes (or maximizes) f under the constraint that
    g₁(x) = g₂(x) = ⋯ = g_m(x) = 0.
That is, the set of interest is
    {x ∈ Rⁿ : g₁(x) = g₂(x) = ⋯ = g_m(x) = 0}.
Without the constraints, we know that an extremal point must be a critical point of f. In the presence of constraints it may be that the extremal point is not stationary, since the directions in which f changes violate the constraints.

Example 9.28 Find the extremal points of f(x, y) = 2x² + 4y² under the constraint g(x, y) = x² + y² − 1 = 0. Note that f by itself does not have a maximum in R², since it grows unbounded as |x| → ∞. However, under the constraint that |x| = 1 it has a maximum.
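As a crude numerical cross-check of Example 9.27, one can evaluate f on a fine grid over the square. A minimal sketch in Python (NumPy assumed); the grid resolution is an arbitrary choice:

```python
import numpy as np

def f(x, y):
    return x**4 + y**4 - 2*x**2 + 4*x*y - 2*y**2

# Brute-force check on a fine grid over the square [-2, 2] x [-2, 2].
s = np.linspace(-2, 2, 801)
X, Y = np.meshgrid(s, s)
Z = f(X, Y)

i_min = np.unravel_index(Z.argmin(), Z.shape)
i_max = np.unravel_index(Z.argmax(), Z.shape)
print(Z.min(), X[i_min], Y[i_min])   # ~ -8, attained near (+-sqrt(2), -+sqrt(2))
print(Z.max(), X[i_max], Y[i_max])   # 32, attained at a corner (2, 2) or (-2, -2)
```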

slide 28:

A single constraint

Let's start with a single constraint: we are looking for extremal points of f: Rⁿ → R under the constraint g(x) = 0. One approach could be the following: since the constraint is of the form g(x₁, ..., x_n) = 0, invert it to get a relation
    x_n = h(x₁, ..., x_{n−1}),
and substitute it in f to get a new function
    F(x₁, ..., x_{n−1}) = f(x₁, ..., x_{n−1}, h(x₁, ..., x_{n−1})),
which we then extremize without having to deal with constraints. The problem with this approach is that it is often impractical to invert the constraint. Moreover, the constraint may fail to define an implicit function, as in Example 9.28, where the constraint set is a circle.

The constrained optimization problem can also be approached as follows: the set D = {x ∈ Rⁿ : g(x) = 0} over which we extremize f is, in general, an (n − 1)-dimensional hyper-surface (a curve in Example 9.28). The gradient of g on D points in the direction in which g changes the fastest. The plane perpendicular to this direction spans the directions along which g remains constant. f has an extremal point in D if the gradient of f is perpendicular to the hyperplane along which g remains constant. In other words, f has an extremal point at x ∈ D if the gradient of f is parallel to the gradient of g, i.e. there exists a scalar λ such that
    ∇f(x) = λ ∇g(x).

An equivalent explanation is the following: let x(t) be a general path in Rⁿ satisfying x(0) = x₀. The paths along which g does not change satisfy g(x(t)) = 0. For all such paths,
    (g ∘ x)′(0) = ∇g(x(0)) · x′(0) = 0.
x₀ is a solution to our problem if for all such paths f(x(t)) has an extremum at t = 0,

slide 29:

i.e.
    (f ∘ x)′(0) = ∇f(x₀) · x′(0) = 0.
This condition holds if ∇f(x₀) and ∇g(x₀) are parallel.

Comment 9.10 Note that ∇f(x₀) and ∇g(x₀) are parallel if and only if there exists a number λ such that x₀ is a stationary point of the function
    F(x) = f(x) − λ g(x).
In this context the number λ is called a Lagrange multiplier.

Example 9.29 Let's return to Example 9.28. Then x₀ = (x, y) is an extremal point if there exists a constant λ such that x₀ is a critical point of
    F(x, y) = 2x² + 4y² − λ(x² + y² − 1).
Note that
    ∇F(x, y) = (4x − 2λx, 8y − 2λy),
i.e. either x = 0, in which case y = ±1 and λ = 4, or y = 0, in which case x = ±1 and λ = 2. To find which point is a minimum/maximum we have to substitute in f.

Multiple constraints

For simplicity, let's assume that there are just two constraints:
    D = {x ∈ Rⁿ : g₁(x) = g₂(x) = 0}.
Let x(t) be a general path in Rⁿ satisfying x(0) = x₀. The paths along which g₁ and g₂ do not change satisfy g₁(x(t)) = g₂(x(t)) = 0. For all such paths,
    (g₁ ∘ x)′(0) = ∇g₁(x(0)) · x′(0) = 0   and   (g₂ ∘ x)′(0) = ∇g₂(x(0)) · x′(0) = 0.
x₀ is a solution to our problem if for all such paths f(x(t)) has an extremum at t = 0, i.e.
    (f ∘ x)′(0) = ∇f(x₀) · x′(0) = 0.
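The stationarity conditions of Example 9.29 can be solved symbolically. A small SymPy sketch (assumed available), solving ∇F = 0 together with the constraint and substituting the candidates back into f:

```python
import sympy as sp

x, y, lam = sp.symbols('x y lam', real=True)
f = 2*x**2 + 4*y**2
g = x**2 + y**2 - 1

F = f - lam*g
eqs = [sp.diff(F, x), sp.diff(F, y), g]          # stationarity of F plus the constraint
for sol in sp.solve(eqs, [x, y, lam], dict=True):
    print(sol, 'f =', f.subs(sol))
# (0, +-1) with lam = 4 give f = 4 (maxima on the circle); (+-1, 0) with lam = 2 give f = 2 (minima)
```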

slide 30:

This condition holds if ∇f(x₀) is a linear combination of ∇g₁(x₀) and ∇g₂(x₀), i.e. there exist two constants λ and µ such that
    ∇f(x₀) = λ ∇g₁(x₀) + µ ∇g₂(x₀),
or, equivalently, x₀ is a stationary point of the function
    F(x) = f(x) − λ g₁(x) − µ g₂(x).

Example 9.30 Find a minimum point of f(x, y, z) = x² + 2y² + 3z² under the constraints
    g₁(x, y, z) = x + y + z − 1 = 0   and   g₂(x, y, z) = x² + y² + z² − 1 = 0.
For the function
    F(x, y, z) = x² + 2y² + 3z² − λ(x + y + z − 1) − µ(x² + y² + z² − 1)
we have
    ∇F(x, y, z) = (2x − λ − 2µx, 4y − λ − 2µy, 6z − λ − 2µz).
The condition that x be a stationary point of F imposes three conditions on five unknowns; the constraints provide the two missing conditions. We may first get rid of z = 1 − x − y, i.e.
    2(1 − µ)x = λ,   2(2 − µ)y = λ,   2(3 − µ)(1 − x − y) = λ,
together with
    x² + y² + (1 − x − y)² = 1.
We can then get rid of λ:
    (1 − µ)x = (2 − µ)y = (3 − µ)(1 − x − y),
together with the constraint x² + y² + (1 − x − y)² = 1. Finally, we may eliminate µ: from (1 − µ)x = (2 − µ)y we get µ(x − y) = x − 2y, and we remain with two equations for the two unknowns x and y.
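Rather than finishing the elimination by hand, the answer to Example 9.30 can be cross-checked numerically: the feasible set is the circle in which the plane x + y + z = 1 cuts the unit sphere, and it can be parametrized explicitly. A minimal sketch in Python (NumPy assumed); the parametrization and the sample count are choices made for illustration, not part of the original example:

```python
import numpy as np

# The feasible set {x+y+z = 1} intersected with {x^2+y^2+z^2 = 1} is a circle.
c = np.array([1, 1, 1]) / 3                     # center of the circle
u = np.array([1, -1, 0]) / np.sqrt(2)           # orthonormal basis of the plane x+y+z = 0
v = np.array([1, 1, -2]) / np.sqrt(6)
r = np.sqrt(2/3)                                 # radius of the circle

theta = np.linspace(0, 2*np.pi, 200001)
pts = c + r*(np.outer(np.cos(theta), u) + np.outer(np.sin(theta), v))
vals = pts[:, 0]**2 + 2*pts[:, 1]**2 + 3*pts[:, 2]**2

i = vals.argmin()
print(pts[i], vals[i])    # minimiser close to (1, 0, 0), minimal value close to 1
```

The scan suggests that the constrained minimum is attained at (1, 0, 0) with f = 1, which indeed satisfies both constraints and the stationarity conditions above (with λ = 0 and µ = 1).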
