The usual definition of a local extreme carries over quite naturally to the case of more variables.
Definition.
Let f be a function defined on some neighborhood of a point a⃗ ∈ ℝⁿ. We say that f has a local maximum at a⃗, or that f(a⃗) is a local maximum, if there exists a neighborhood U = U(a⃗) such that f(a⃗) ≥ f(x⃗) for all x⃗ ∈ U. We say that f has a local minimum at a⃗, or that f(a⃗) is a local minimum, if there exists a neighborhood U = U(a⃗) such that f(a⃗) ≤ f(x⃗) for all x⃗ ∈ U.
The picture below for the case of two variables shows two local maxima on the left and a local minimum on the right.
This is how we also imagine these notions for more dimensions. A local maximum has the property that if we slice the graph through that point in any direction (thus passing to the situation of one variable), then we still have a local maximum in the usual meaning on that slice. An analogous property is true for every local minimum.
In more dimensions there is a new kind of behaviour; we see it in the picture between the two hills. If we cut the graph there with a vertical plane in the direction leading between the hills, then on the slice we see a local maximum there in the valley. However, if we cut the graph using a perpendicular vertical plane (passing through the two summits), then in the valley we see a local minimum on the slice. Such points are called saddles or saddle points, and we encounter them when investigating extrema, so they are usually counted among the points to explore when a question asks about extrema.
How do we find those local extrema? The procedure is similar to investigating local extrema for functions of one variable. Roughly speaking, first we find candidates using the first derivative, then we classify them using the second derivative.
If we cut the graph by an arbitrary vertical plane through some local extreme a⃗, we also get an extreme on the slice, so the derivative at a⃗ in that direction must be equal to zero. If all directional derivatives are to be zero, then the gradient at a⃗ (as a vector) must be zero as well.
Another reasoning: At a local extreme, the tangent plane must be horizontal, so its normal vector must be vertical. We observed in the previous chapter that for its normal vector we can take the vector

n⃗ = (∂f/∂x₁(a⃗), …, ∂f/∂xₙ(a⃗), −1).

This vector is vertical exactly if ∂f/∂xᵢ(a⃗) = 0 for all i, that is, if ∇f(a⃗) = 0⃗.
Theorem.
Let f be a function defined on some neighborhood of a point a⃗ ∈ ℝⁿ. If f has a local extreme at a⃗ and the gradient exists there, then ∇f(a⃗) = 0⃗.

Points where the gradient is zero are called stationary points of f; they are our candidates for local extrema.
As usual the statement does not work in the opposite direction, not every stationary point is a local extreme. Just recall saddle points that are stationary points but not extrema. So when we find stationary points, we need to classify them. For that we use the Sylvester criterion. It is easier to remember its conditions if you can imagine what is actually going on there.
A local maximum can be recognized by the fact that it is a maximum on all slices, in particular when cutting parallel to the axes. In a one-variable situation we recognize a local maximum easily using the second derivative, so in case of more dimensions we expect a local maximum to satisfy ∂²f/∂xᵢ²(a⃗) < 0 for all i, and similarly we expect a local minimum to satisfy ∂²f/∂xᵢ²(a⃗) > 0 for all i.
We now focus on the case of two variables. All extrema (maxima and minima) have one thing in common: the signs of the two non-mixed second derivatives ∂²f/∂x²(a⃗) and ∂²f/∂y²(a⃗) agree (both negative for a maximum, both positive for a minimum), so their product is positive. At a saddle the two signs disagree, so the product is negative.

We see that the product of non-mixed second derivatives can serve as a primary tool for telling apart saddles and extrema. And once we find that the point in question is an extreme, then to distinguish between a maximum and a minimum it is enough to check on some slice, that is, we just check the sign of an arbitrary non-mixed second derivative, for instance ∂²f/∂x²(a⃗).
These observations were not completely wrong, but there is an unpleasant gap. We observed that an extreme has a positive product of the two second derivatives, but in fact we need the opposite direction: if we find that the product is positive, does it mean that we have a local extreme? Unfortunately not.
The problem lies in the fact that we also have to take into account mixed derivatives, that is, we have to consider all entries of the Hess matrix

H(a⃗) = ( ∂²f/∂x²(a⃗)    ∂²f/∂x∂y(a⃗) )
        ( ∂²f/∂y∂x(a⃗)   ∂²f/∂y²(a⃗) )
Above we used the product of its diagonal to make the first decision; perhaps it reminded the reader of a determinant. It turns out that it indeed works this way: the right quantity to look at is

det(H(a⃗)) = ∂²f/∂x²(a⃗) · ∂²f/∂y²(a⃗) − (∂²f/∂x∂y(a⃗))²,

which also takes the mixed derivative into account. This leads to the following procedure.

1. By solving the equation ∇f(x⃗) = 0⃗ we find stationary points a⃗.

2. For each stationary point a⃗ we find the corresponding Hess matrix H(a⃗).

3. If det(H(a⃗)) < 0, then f has a saddle point at a⃗. If det(H(a⃗)) > 0, then f has a local extreme at a⃗ and we continue.

4. If ∂²f/∂x²(a⃗) > 0, then f(a⃗) is a local minimum; if ∂²f/∂x²(a⃗) < 0, then f(a⃗) is a local maximum.
When zeros appear at key moments, this algorithm fails: we know nothing and more advanced methods have to be used. That is a topic beyond this introduction.
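To make the steps concrete, here is a small sketch in code. The function f(x, y) = x³ − 3x + y² is a hypothetical example (not taken from this text), chosen because its gradient (3x² − 3, 2y) makes the stationary points (1, 0) and (−1, 0) easy to find by hand; its Hess matrix has entries 6x, 0, 2.

```python
# A sketch of the two-variable test, assuming the hypothetical
# function f(x, y) = x^3 - 3x + y^2 with stationary points
# (1, 0) and (-1, 0) and Hess matrix [[6x, 0], [0, 2]].

def classify(fxx, fxy, fyy):
    """Classify a stationary point from the entries of its Hess matrix."""
    det = fxx * fyy - fxy ** 2      # determinant of the Hess matrix
    if det < 0:
        return "saddle"
    if det > 0:
        return "local minimum" if fxx > 0 else "local maximum"
    return "test fails"             # det = 0: the criterion gives no answer

for x, y in [(1, 0), (-1, 0)]:
    fxx, fxy, fyy = 6 * x, 0, 2     # second derivatives at the point
    print((x, y), classify(fxx, fxy, fyy))
# (1, 0)  -> local minimum
# (-1, 0) -> saddle
```

Note that steps 3 and 4 use nothing but the three second derivatives at the point, which is all the classify routine needs.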
If we want to generalize this procedure to more variables, we have to look at it from a different angle. First we notice that in step 4 we are actually also checking on the sign of some matrix, namely the 1×1 submatrix of H given by its upper left corner. This is an interesting inspiration. We imagine a (large) matrix H and we ask what can be expected from its upper left subdeterminants of all sizes; these are traditionally denoted Δ1, Δ2, …, Δn, where Δi is the determinant of the i×i submatrix in the upper left corner of H.
Recall that in case of a local maximum we expect all the diagonal entries ∂²f/∂xᵢ²(a⃗) to be negative, in case of a local minimum positive. To get a feeling for the criterion, imagine for a moment that H is diagonal; then Δi is simply the product of the first i diagonal entries.

The first subdeterminant is the upper left entry of H, that is, Δ1 = ∂²f/∂x₁²(a⃗). It should be negative for a maximum, positive for a minimum.

The second subdeterminant is given by the product of the first two diagonal entries. It should be positive for both maximum and minimum.

The third subdeterminant is given by the product of the first three diagonal entries. It should be negative for a maximum, positive for a minimum.
You can surely work out how this should go on. For maxima the signs alternate, for minima all subdeterminants come up positive.
If there is some other progression of signs, then we do not have a maximum or minimum, and if some of the determinants are zero, then the whole procedure fails and we do not know what is going on at a⃗.
Our observations about a diagonal H are true in general.
Theorem (Sylvester criterion).
Let f be defined and have continuous second order partial derivatives on some neighborhood of a point a⃗ that is stationary for f, that is, ∇f(a⃗) = 0⃗. Let H be the Hess matrix of f at a⃗, and let Δi be its upper left subdeterminants.
If Δi > 0 for all i, then f(a⃗) is a local minimum.
If Δ1 < 0, Δ2 > 0, Δ3 < 0, and so on up to (−1)ⁿΔn > 0, then f(a⃗) is a local maximum.
1. By solving the equation ∇f(x⃗) = 0⃗ we find stationary points a⃗.

2. For each stationary point a⃗ we find the corresponding Hess matrix H(a⃗).

3. We evaluate the subdeterminants Δ1, Δ2, …, Δn.

4. If Δi > 0 for all i, then f(a⃗) is a local minimum. If the signs alternate, starting with Δ1 < 0, then f(a⃗) is a local maximum. If the signs follow some other pattern (none of them being zero), then f does not have a local extreme at a⃗. If some Δi = 0, then the criterion fails and we know nothing.
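The algorithm above can be sketched in code; the determinant routine and the sample Hess matrix below are illustrative assumptions, not anything taken from this text.

```python
# A sketch of the Sylvester criterion for an n-variable stationary point.
# H is assumed to be the (symmetric) Hess matrix at that point.

def det(m):
    """Determinant by Laplace expansion along the first row."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total

def sylvester(H):
    """Classify via the upper left subdeterminants Delta_1, ..., Delta_n."""
    deltas = [det([row[:i] for row in H[:i]]) for i in range(1, len(H) + 1)]
    if any(d == 0 for d in deltas):
        return "criterion fails"
    if all(d > 0 for d in deltas):
        return "local minimum"
    if all(d * (-1) ** (i + 1) > 0 for i, d in enumerate(deltas)):
        return "local maximum"          # signs go -, +, -, ...
    return "no extreme"

# Hypothetical Hess matrix of some f at a stationary point:
H = [[2, -1, 0],
     [-1, 2, 0],
     [0, 0, 2]]
print(sylvester(H))   # subdeterminants 2, 3, 6 are all positive: local minimum
```

For large matrices one would of course compute the determinants more efficiently, but the recursive expansion keeps the sketch self-contained.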
Example.
We find and classify the local extrema of the function
First we find the stationary points, that is, we solve the equation ∇f(x⃗) = 0⃗. It is a system of three equations in three unknowns; this sounds hopeful, but the equations are not linear, so the whole nice theory of linear systems is of no use. How do we solve general systems?
We start by noticing that the third equation is independent of the others,
so definitely
We focus on the second equation, which we rewrite as
The case
The case
Now we have to investigate all three stationary points, so we need the Hess matrix. We prepare the second partial derivatives; thanks to the symmetry it is enough to calculate six of them:
The Hess matrix is
Here we go:
Point
Signs go +, +, +, therefore f has a local minimum at this point.
Point
Signs go +, −, −, which is neither of the two admissible patterns, therefore f does not have a local extreme at this point (it is a saddle point).
Point
Signs go +, +, −, again neither of the two admissible patterns, therefore f does not have a local extreme at this point.
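The conclusions drawn from sign sequences like these follow mechanically from the Sylvester criterion; as a sketch, a small helper (hypothetical, written here purely for illustration) can encode the two admissible patterns.

```python
# Classify a stationary point from the signs of Delta_1, ..., Delta_n,
# following the Sylvester criterion.

def classify_signs(signs):
    """signs: a string of '+' / '-' for Delta_1, ..., Delta_n."""
    if all(s == '+' for s in signs):
        return "local minimum"
    if all(s == ('-' if i % 2 == 0 else '+') for i, s in enumerate(signs)):
        return "local maximum"      # alternating pattern -, +, -, ...
    return "no extreme"

for signs in ["+++", "+--", "++-"]:
    print(signs, "->", classify_signs(signs))
# +++ -> local minimum
# +-- -> no extreme
# ++- -> no extreme
```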
Example.
We investigate local extrema of the function
First we find stationary points.
Since the exponential is always positive, we can divide the equations by it and solve the equations that remain.
If
If
We prepare the second partial derivatives:
The Hess matrix is
The term
Since we have a function of two variables, we use the first algorithm, where we first check on the sign of the determinant det(H(a⃗)).
Point
hence
Point
hence
Point
hence
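As a sketch of how such a computation can be checked numerically, the code below applies the two-variable test to a hypothetical function f(x, y) = x·y·e^(−x−y) (not the function from this example). It uses the same trick: dividing the gradient equations by the positive exponential leaves y(1 − x) = 0 and x(1 − y) = 0, so the stationary points are (0, 0) and (1, 1).

```python
import math

# A numerical sketch of the two-variable test, assuming the
# hypothetical function f(x, y) = x*y*exp(-x-y) with stationary
# points (0, 0) and (1, 1).

def f(x, y):
    return x * y * math.exp(-x - y)

def classify(x, y, h=1e-4):
    """Second derivative test via central finite differences."""
    fxx = (f(x + h, y) - 2 * f(x, y) + f(x - h, y)) / h**2
    fyy = (f(x, y + h) - 2 * f(x, y) + f(x, y - h)) / h**2
    fxy = (f(x + h, y + h) - f(x + h, y - h)
           - f(x - h, y + h) + f(x - h, y - h)) / (4 * h**2)
    det = fxx * fyy - fxy**2
    if det < 0:
        return "saddle"
    if det > 0:
        return "local minimum" if fxx > 0 else "local maximum"
    return "test fails"

print(classify(0, 0))   # saddle
print(classify(1, 1))   # local maximum
```

Finite differences are of course only an approximation; here the determinant is far from zero at both points, so rounding errors cannot flip the verdict.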