Forgive my use of many words. I would like to understand the meaning of direction of steepest ascent.

I am taking a course on multivariate calculus on Coursera. The instructor mentioned that the Jacobian points in the direction of steepest ascent. However, I lack intuition for what steepest ascent means.

Consider a sphere given by the equation $x^2 + y^2 + z^2 = 13^2$. Its upper hemisphere is the graph of the function $z = f(x, y) = \sqrt{13^2 - x^2 - y^2}$. The point $(3, 4, 12)$ lies on the sphere. Suppose we want to find the direction of steepest ascent at this point. We begin by drawing tangents in all directions at this point. Assume there is an imaginary plane $z = 100$ above the point $(3, 4, 12)$; we can extend each of these tangents from $(3, 4, 12)$ until it meets the plane $z = 100$. Clearly, these tangents have different lengths between $(3, 4, 12)$ and the point where each of them meets the plane. My understanding is that the shortest tangent gives the direction of steepest ascent. Is my intuitive understanding of the direction of steepest ascent correct and complete? A reference where I can read more on the topic would also be of great help.

ASP
    No, $z$ is not special. It doesn't represent "height", and doesn't have any connection to "steepest" that $x$ and $y$ don't have. It would if your function was $f(x,y)$, and you defined $z=f(x,y)$. Then "steepest" would mean the direction in the plane such that an infinitesimal movement in that direction increases $z$ the most. But if your function is $f(x,y,z)$, then "steepest" means a direction in 3D space such that an infinitesimal movement in that direction increases the function's value the most. – Joe Jul 15 '21 at 11:18
  • 1
    By giving an equation rather than an explicit function for us to find the Jacobian of, you have already gotten two answers that assume the function is $f(x,y,z)=x^2+y^2+z^2-13^2$, whereas your visualization of the steepest ascent at $(3,4,12)$ is consistent with the function $z = f(x,y) = \sqrt{13^2 - x^2 - y^2}.$ The Jacobian is defined on a function, not an equation, so you must always be clear about what function you have in mind. – David K Jul 15 '21 at 11:25
  • 1
    To elaborate on @DavidK's comment: it is a function that has a direction of steepest ascent. You haven't specified a function in your question; you have specified a surface. If you think in terms of functions, then the direction of steepest ascent is just the direction in which the function increases the fastest. (In the first of DavidK's example functions, the direction of steepest ascent is the outward normal to the sphere; in the second function, it is the inward normal to the sphere.) – TonyK Jul 15 '21 at 11:27
  • The function is $z=f(x, y) = \sqrt{13^2 - x^2 - y^2} $ – ASP Jul 15 '21 at 11:39
  • @DavidJones, please make important updates like this by editing the post, rather than in the comments. – Joe Jul 15 '21 at 11:42
  • @DavidK, I don't think his visualization was even consistent with $z=f(x,y)$, because I think he was visualizing directions in space, whereas the direction of steepest ascent for $z=f(x,y)$ would be a direction in the $xy$-plane. I think that's a very common point of confusion when first learning about gradients. – Joe Jul 15 '21 at 11:50
  • The instructor jumped from univariate calculus to mentioning that the Jacobian gives the direction of steepest ascent. He never really discussed what he meant by steepest ascent in the first place. I understand that the Jacobian of a function is simply a vector whose components are the partial derivatives of the function with respect to each of the independent variables. But I don't get why the Jacobian points in the direction of steepest ascent because I don't understand what steepest ascent means. Perhaps a reference where I can read more on the topic would be of most help to me. – ASP Jul 15 '21 at 12:01
  • 2
    I think your understanding of steepest ascent is close to correct. But you need to project the tangents into the $xy$-plane, since the direction of steepest ascent is in the $xy$-plane. Think of standing on the side of a mountain. The direction of latitude and longitude in which you would take one small step to increase your elevation the most (compared to steps of the same size in other directions) is the direction of steepest ascent. BUT, that direction doesn't point UP the mountain, because that direction is in the $xy$-plane; it doesn't have a $z$ component. – Joe Jul 15 '21 at 12:07
  • 1
    Ohh. Finally I understand. The direction is in the $xy$-plane. It's more like asking what combination of $x$ and $y$ produces the largest increase in the function $f(x, y)$ at a given point. – ASP Jul 16 '21 at 05:36

2 Answers

"Steep" is with respect to the value of a function, not just literal height in 3D space, since climbing upwards may sometimes reduce the output of a function. If you plotted the output of the function in a separate graph, keeping input apart from output, that might clarify things: the gradient seeks to increase the output the most, which is different from increasing the input.

It is more the case that, in calculus, we look at "infinitesimal" steps (the formal theory of infinitesimals is a bit different, but this is a good shorthand). For your point on the sphere, we ask: "how sensitive is the function (or the sphere!) to small changes in a given direction, from this point?". And the answer is that the gradient, or the Jacobian in higher dimensions, shows the direction in which the function changes most rapidly. Do not think of long tangent lines out to $z=100$, because calculus examines the limits of very small changes. Here the gradient just says that a "tiny" step in this direction increases my function the most. For other functions, once you walk further along this direction and the steps are no longer infinitesimal, you may be in completely different territory, and perhaps even decreasing! So extending tangent lines will not always provide intuition for more complex surfaces.

Also note that a direction perpendicular to the gradient (the Jacobian) is tangent to a contour: to first order, the function does not change, and we stay on the sphere as we walk in that perpendicular direction.

The sphere is also the set of all points where some function $f(x,y,z)=x^2+y^2+z^2-13^2$ equals zero... so a step in our gradient will increase the function the most, but not necessarily remain on the sphere, since we will be increasing from $0$ and start stepping off.

$$\nabla f=\begin{pmatrix}2x\\2y\\2z\end{pmatrix}$$

And so at any $x,y,z$ on your sphere, a step in this direction will show you where $f$ is most sensitive (in the increasing sense) to change, for very, very small steps! The intuition for why this is the case might be the observation that $f$ is just the squared distance from the origin minus a constant, so it increases the most when we step radially outward, that is, when we change $x,y,z$ in proportion to their current values.
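A minimal numerical sketch of this claim, assuming the function $f(x,y,z)=x^2+y^2+z^2-13^2$ from this answer. The helper name `directional_derivative` and the step size `h` are illustrative choices, not part of the original answer. It estimates the rate of change of $f$ at $(3,4,12)$ along the gradient and along two other unit directions:

```python
import math

def f(x, y, z):
    # The function whose zero level set is the sphere of radius 13.
    return x**2 + y**2 + z**2 - 13**2

def grad_f(x, y, z):
    # Analytic gradient (2x, 2y, 2z).
    return (2 * x, 2 * y, 2 * z)

def directional_derivative(point, direction, h=1e-6):
    """Central-difference estimate of the rate of change of f along a
    direction, after normalizing that direction to unit length."""
    norm = math.sqrt(sum(d * d for d in direction))
    unit = [d / norm for d in direction]
    forward = f(*(p + h * u for p, u in zip(point, unit)))
    backward = f(*(p - h * u for p, u in zip(point, unit)))
    return (forward - backward) / (2 * h)

point = (3.0, 4.0, 12.0)
gradient = grad_f(*point)

rate_along_gradient = directional_derivative(point, gradient)
rate_straight_up = directional_derivative(point, (0, 0, 1))
rate_diagonal = directional_derivative(point, (1, 1, 1))

print("gradient:", gradient)
print("rate along gradient:", rate_along_gradient)
print("rate straight up:   ", rate_straight_up)
print("rate along (1,1,1): ", rate_diagonal)
```

The rate along the gradient should come out close to the gradient's length, $\sqrt{6^2+8^2+24^2}=26$, and the other two directions should give smaller rates.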

A step in the direction:

$$\begin{pmatrix}-2y\\2x\\0\end{pmatrix}$$

will show you how to walk whilst staying on the sphere (to first order): the value of $f$ shouldn't change. Note that there is more than one perpendicular direction here; I give only one example.
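As a quick check, here is a small sketch (assuming the same $f$ and the point $(3,4,12)$; the variable names are illustrative) showing that the direction $(-2y, 2x, 0)$ is perpendicular to the gradient and that a small step along it barely changes $f$:

```python
import math

def f(x, y, z):
    return x**2 + y**2 + z**2 - 13**2

x, y, z = 3.0, 4.0, 12.0
gradient = (2 * x, 2 * y, 2 * z)
tangent = (-2 * y, 2 * x, 0.0)

# Perpendicular vectors have zero dot product.
dot = sum(g * t for g, t in zip(gradient, tangent))

# Take a small step of length `step` along the normalized tangent direction.
step = 1e-4
norm = math.sqrt(sum(t * t for t in tangent))
moved = [p + step * t / norm for p, t in zip((x, y, z), tangent)]

# The first-order change vanishes, so only a term of order step**2 remains.
change = f(*moved) - f(x, y, z)

print("dot product:", dot)
print("change in f:", change)
```

The leftover change is second order in the step size, which is why a straight-line step leaves the sphere only very slightly.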

FShrike
  • While there's certainly an argument in favor of understanding the graph of a function without embedding it, I think your answer is a bit misleading because it presents the height $z=100$ as contrasting with the tiny changes the gradient is set up to consider. But any height above the point would work to tell you the direction of steepest (local) ascent, so $z=100$ is totally fine in that sense. – Mark S. Jul 15 '21 at 11:24
Let $f(x, y, z) = x^2 + y^2 + z^2 - 13^2$

The gradient of $f$, whose zero level set is the sphere, is given by

$ \nabla f = [ 2 x, 2 y , 2 z ]^T $

So, linearizing $f$ about the point $(x,y,z) = (3, 4, 12)$,

$f(3 + dx, 4 + dy, 12 + dz) \approx f(3, 4, 12) + (2x)\, dx + (2y)\, dy + (2z)\, dz = 0 + 6\, dx + 8\, dy + 24\, dz = 0 $

Setting the last expression equal to $0$ reflects the assumption that we remain on the sphere, i.e. $f(3+dx, 4+dy, 12+dz) = 0$. Then

$ 6 dx + 8 dy + 24 dz = 0 $

So that $dz = \dfrac{1}{12}(-3 dx - 4 dy) $

If the differential vector in the $(x, y)$ plane is $u = (dx, dy)$, then for a fixed step length $dz$ is largest (i.e. maximum ascent) when $(dx, dy)$ points along the vector $(-3, -4)$, so this is the direction of steepest ascent.
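The claim can be checked with a brute-force scan. This is only a sketch: it assumes the linearized relation $dz = \frac{1}{12}(-3\,dx - 4\,dy)$ derived above, and the grid of 3600 angles is an arbitrary choice. It tries unit steps $(dx, dy) = (\cos t, \sin t)$ and keeps the one with the largest $dz$:

```python
import math

def dz(dx, dy):
    # Linearized height change on the sphere at (3, 4, 12),
    # from 6 dx + 8 dy + 24 dz = 0.
    return (-3 * dx - 4 * dy) / 12

# Scan unit directions in the xy-plane on a grid of angles.
num_angles = 3600
best_angle = max(
    (2 * math.pi * k / num_angles for k in range(num_angles)),
    key=lambda t: dz(math.cos(t), math.sin(t)),
)
best_direction = (math.cos(best_angle), math.sin(best_angle))

# Unit vector along (-3, -4), for comparison.
expected = (-3 / 5, -4 / 5)

print("best direction found:", best_direction)
print("unit vector along (-3, -4):", expected)
print("dz per unit step:", dz(*best_direction))
```

Up to the resolution of the angle grid, the best direction should agree with $(-3, -4)/5$, with $dz$ close to $5/12$ per unit step.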

I forgot to relate the above analysis to the shortest distance from $(3, 4, 12)$ to the plane $z = 100$. From the above analysis, we know that the vector $u = (dx, dy, dz)$ has to satisfy,

$ 3 dx + 4 dy + 12 dz = 0 $

Setting $dz = 100 - 12 = 88$ reduces the equation to

$ 3 dx + 4 dy + 12(88) = 0$

or

$ 3 dx + 4 dy + 1056 = 0$

The square of the length of the straight line connecting $(3, 4, 12)$ to the plane $z=100$ is therefore,

$ L = dx^2 + dy^2 + 88^2 $

We want to verify that minimizing $L$ subject to the constraint on $(dx, dy)$ results in the same direction. Using the Lagrange multiplier method with $\Lambda = L + \lambda (3 dx + 4 dy + 1056)$, we have

$\dfrac{\partial \Lambda}{\partial dx} = 2 dx + 3 \lambda = 0 $

$\dfrac{\partial \Lambda}{\partial dy} = 2 dy + 4 \lambda = 0 $

It follows immediately that $(dx, dy) = -\dfrac{\lambda}{2} (3, 4)$.

To find $\lambda$, we use

$\dfrac{\partial \Lambda}{\partial \lambda} = 0 = 3 dx + 4 dy + 1056$

So that $\left( \dfrac{-9}{2} + (-8) \right) \lambda + 1056 = 0 $, which results in a positive $\lambda$. Hence we deduce that the shortest distance to the plane $z = 100$ is attained when $(dx, dy)$ is along the vector $(-3, -4)$, confirming our previous "local" result.
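The arithmetic can be verified with a short sketch. It assumes the stationarity conditions $2\,dx + 3\lambda = 0$, $2\,dy + 4\lambda = 0$ and the constraint $3\,dx + 4\,dy + 1056 = 0$ from this answer; the helper `squared_length` and the comparison against shifted points on the constraint line are illustrative additions.

```python
# Stationarity gives dx = -3*lam/2 and dy = -2*lam. Substituting into the
# constraint 3 dx + 4 dy + 1056 = 0 yields -(9/2 + 8) * lam + 1056 = 0.
lam = 1056 / (9 / 2 + 8)
dx = -3 * lam / 2
dy = -2 * lam

def squared_length(dx, dy):
    # Squared length of the segment from (3, 4, 12) to the plane z = 100.
    return dx**2 + dy**2 + 88**2

constraint_residual = 3 * dx + 4 * dy + 1056
best = squared_length(dx, dy)

# Other points on the constraint line are (dx, dy) + t * (4, -3);
# each should give a longer segment than the stationary point.
others = [squared_length(dx + 4 * t, dy - 3 * t) for t in (-10, -1, 1, 10)]

print("lambda:", lam)
print("(dx, dy):", (dx, dy))
print("constraint residual:", constraint_residual)
print("shortest squared length:", best)
```

Since $\lambda$ is positive, $(dx, dy)$ is a positive multiple of $(-3, -4)$, and moving along the constraint line in either direction only lengthens the segment.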

The above implies that your intuition about the concept of steepest ascent is correct.

  • This doesn't seem to address the question about $z=100$. – Mark S. Jul 15 '21 at 11:34
  • This is the classical way of dealing with the steepest ascent problem. –  Jul 15 '21 at 11:42
  • You don't put a plane at $z = 100$ and try to reach it with a straight line. You work locally close to the point of interest. –  Jul 15 '21 at 11:43
  • Updated the question. A reference where I can read more on multivariate calculus would be of most help. – ASP Jul 15 '21 at 12:06
  • @DavidJones, you should add the "reference request" tag to your question. – Joe Jul 15 '21 at 12:58