
I was reading the MSE question "Why is the gradient the direction of steepest ascent", and I came across Jonathan's answer, which is copied here:

Consider a Taylor expansion of this function, $$f({\bf r}+{\bf\delta r})=f({\bf r})+(\nabla f)\cdot{\bf\delta r}+\ldots$$ The linear correction term $(\nabla f)\cdot{\bf\delta r}$ is maximized when ${\bf\delta r}$ is in the direction of $\nabla f$.

I understand that from here we would maximize $(\nabla f)\cdot{\bf\delta r}$ in the standard way, using the Cauchy-Schwarz inequality. However, I don't understand the first part of the argument. Why would maximizing the linear term maximize the directional derivative? The only justification I can think of is a heuristic one, along the lines of "if we want to maximize the directional derivative, we want to maximize $f({\bf r}+{\bf\delta r})$, and since the linear term has more impact than any of the others, we should focus on maximizing it." However, this doesn't really make sense to me: yes, the linear term has the biggest impact when compared to each of the other terms individually, but it's not clear at all that it has the biggest impact when compared to the sum of all of the other terms.
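The worry about the sum of the other terms can at least be probed numerically. The following is only a rough sketch, not a proof: the test function `f`, its gradient `grad_f`, the helper `best_direction`, and the point `(x0, y0)` are all hypothetical choices made for illustration. It scans unit vectors $v$, finds the one that maximizes the difference quotient $(f({\bf r}+hv)-f({\bf r}))/h$, and compares it with the gradient direction as the step $h$ shrinks.

```python
import math

def f(x, y):
    # Hypothetical smooth test function, chosen only for illustration.
    return math.sin(x) + x * y + y ** 2

def grad_f(x, y):
    # Analytic gradient of the test function above.
    return (math.cos(x) + y, x + 2 * y)

def best_direction(x, y, h, n=3600):
    # Scan n equally spaced unit vectors and return the angle whose
    # difference quotient (f(r + h v) - f(r)) / h is largest.
    best_angle, best_q = 0.0, -math.inf
    for k in range(n):
        t = 2 * math.pi * k / n
        q = (f(x + h * math.cos(t), y + h * math.sin(t)) - f(x, y)) / h
        if q > best_q:
            best_angle, best_q = t, q
    return best_angle, best_q

x0, y0 = 0.3, -0.7  # arbitrary base point (an assumption)
gx, gy = grad_f(x0, y0)
grad_angle = math.atan2(gy, gx) % (2 * math.pi)
grad_norm = math.hypot(gx, gy)

for h in (1e-1, 1e-2, 1e-3):
    angle, q = best_direction(x0, y0, h)
    # Columns: step size, angular gap to the gradient, gap to |grad f|.
    print(h, abs(angle - grad_angle), abs(q - grad_norm))
```

For a fixed finite step the best direction need not be exactly the gradient direction, because the higher-order terms do contribute. What the experiment suggests is that their combined contribution, relative to the step size, shrinks as $h \to 0$, which is the regime the directional derivative is about.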

Ovi
  • The directional derivative is the linear term, so if you want to maximize the directional derivative, you have to maximize the linear term (almost by definition). – peek-a-boo May 10 '20 at 16:15
  • @peek-a-boo Thanks for your answer. I know, independently from the Taylor series, that the directional derivative is $v^T \nabla f(x)$. However, if we already knew this, we wouldn't need to mention the Taylor expansion at all. So since Jonathan mentioned it, it seems like there is a way to derive that $v^T \nabla f(x)$ is the directional derivative purely from the Taylor expansion. – Ovi May 10 '20 at 16:22
  • Exactly, there's no need to mention Taylor's theorem at all. Especially if you consider a 1st order Taylor expansion, then Taylor's theorem is completely superfluous, because it is literally the definition of differentiability. Whenever people say "consider a first order Taylor expansion/linear Taylor expansion", that is literally an application of the definition of differentiability. But anyway, the claim made is "The linear correction term is maximised when...", so the answer there explicitly refers to the linear term, and I'm not really sure what the issue is. – peek-a-boo May 10 '20 at 16:23
  • @peek-a-boo Sorry if I'm not making much sense, I'm pretty new at this. I guess the question is: "If you know that $f({\bf r}+{\bf\delta r})=f({\bf r})+(\nabla f)\cdot{\bf\delta r}+\ldots$ and nothing else (in particular, you don't know that $D_vf(x) = v^T \nabla f(x)$), can you deduce that the direction of steepest ascent is $\nabla f(x)$?" Jonathan's post implies that the answer is "yes", but I don't know how to deduce it. – Ovi May 10 '20 at 16:36
  • To answer that, you would have to define precisely what you mean by "direction of steepest ascent". My definition of that is to find a unit vector $v$ such that $(D_vf)(x)$ is a maximum value. And then to link everything together, I would have to prove $(D_vf)(x) = \nabla f(x) \cdot v$; once you establish this simple equality which relates the directional derivative to the (total) derivative, everything becomes almost trivial. I'd use Cauchy-Schwarz inequality to complete the rest. – peek-a-boo May 10 '20 at 16:47
  • @peek-a-boo Thanks. So I see that the Taylor series really is superfluous here. – Ovi May 10 '20 at 16:49
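For reference, the steps outlined in the comments can be written out as follows. This is a sketch under the assumption that $f$ is differentiable at $x$ in the usual sense, i.e. that the "$\ldots$" in the expansion stands for a remainder $o(\|h\|)$; the symbols $h$ and $t$ are introduced here only for the derivation.

$$f(x+h)=f(x)+\nabla f(x)\cdot h+o(\|h\|)\quad\text{as } h\to 0.$$

Taking $h=tv$ with $\|v\|=1$ and dividing by $t$:

$$(D_vf)(x)=\lim_{t\to 0}\frac{f(x+tv)-f(x)}{t}=\nabla f(x)\cdot v+\lim_{t\to 0}\frac{o(|t|)}{t}=\nabla f(x)\cdot v.$$

The remainder is controlled as a whole, not term by term, so the sum of all the higher-order terms vanishes in the limit. The Cauchy-Schwarz inequality then gives

$$(D_vf)(x)=\nabla f(x)\cdot v\le\|\nabla f(x)\|\,\|v\|=\|\nabla f(x)\|,$$

with equality exactly when $v=\nabla f(x)/\|\nabla f(x)\|$, provided $\nabla f(x)\neq 0$.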
