Consider a function $f(x, y): \mathbb{R}^2 \rightarrow \mathbb{R}$ where $f(x, y) = \frac{5}{2}x^2 + xy + \frac{5}{2}y^2$.

By definition the gradient of a function $f$ is Lipschitz continuous with parameter $\beta > 0$ if:

\begin{align} \tag 1 \|\nabla f(x) - \nabla f(y)\| \leq \beta\|x- y\| \end{align}

Here $x$ and $y$ denote any two points in the domain of $f$. A function $f$ with this property is also called $\beta$-smooth. Here $\|\cdot\|$ denotes the Euclidean norm.

I tried looking at this post for hints on how to find $\beta$, but it didn't give me any insight I could apply to the function in question.

Here is what I have tried so far:

\begin{align} \nabla f &= \begin{bmatrix} 5x+ y \\ x + 5y \end{bmatrix} \end{align}

Now let's consider two points $a = (a_1, a_2)$ and $b = (b_1, b_2)$ in the domain of $f$. The left-hand side of equation ($1$), without the norm, becomes

\begin{align} \nabla f(a) - \nabla f(b) &= \begin{bmatrix} 5a_1 + a_2 - 5b_1 - b_2\\ a_1 + 5a_2 - b_1 - 5b_2 \end{bmatrix} \end{align}

Taking Euclidean norms in equation ($1$) and leaving $\beta$ alone on the right-hand side, we have:

\begin{align} \frac{\|\nabla f(a) - \nabla f(b)\|}{\|a- b\|} = \frac{\sqrt{(5a_1 + a_2 - 5b_1 - b_2)^2 + (a_1 + 5a_2 - b_1 - 5b_2)^2}}{\sqrt{(a_1 - b_1)^2 + (a_2- b_2)^2}} \leq \beta \end{align}

So the smallest $\beta$ value will occur when we have equality. I tried expanding the above expression, but it only gets messier. I think my approach is wrong. What form would $\beta$ take? Is it a constant? From the above expression it seems to depend on the coordinates of the two selected points, but according to the definition it should be independent of the chosen points.

What would be the right way to calculate $\beta$? Any help building up the intuition with this simple example would be greatly appreciated!

Thanks!

1 Answer

Hint: the derivative $\nabla f$ is a linear map; indeed, the matrix for the map is $A = \begin{pmatrix} 5 & 1 \\ 1 & 5 \end{pmatrix}$. Thus $$\| \nabla f(x) - \nabla f(y)\| = \| A(x-y)\| \le \|A\| \| x-y\|.$$ Now you just need to bound (or explicitly find) the matrix norm of $A$.
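
One way to finish the hint, as a sketch that assumes the Euclidean norm on $\mathbb{R}^2$ (so that $\|A\|$ is the spectral norm): since $A$ is symmetric, its spectral norm equals the largest absolute value of its eigenvalues, and these follow from the characteristic polynomial.

\begin{align} \det(A - \lambda I) &= (5 - \lambda)^2 - 1 = (\lambda - 4)(\lambda - 6) = 0 \quad\Longrightarrow\quad \lambda \in \{4, 6\}, \\ \|A\|_2 &= \max\{|4|, |6|\} = 6. \end{align}

The bound is attained when $x - y$ is a multiple of the eigenvector $(1, 1)$, because $A \begin{pmatrix} 1 \\ 1 \end{pmatrix} = 6 \begin{pmatrix} 1 \\ 1 \end{pmatrix}$. Under these assumptions the smallest admissible constant is $\beta = 6$, independent of the chosen points.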

User8128
  • Thanks, @User8128! After a quick search on Wikipedia (https://en.wikipedia.org/wiki/Matrix_norm#Special_cases), I found that the Euclidean matrix norm of $A$ is its largest singular value. $\|A\|_{2}$ in this case happens to be $6$. I was wondering what the right norm in the expression would be, since it affects the bound. Is it always Euclidean? – user7407311 Dec 04 '19 at 03:42
  • Also, does the inequality above provide the tightest bound? That is, does solving for $\|A\|_{2}$ guarantee the best (smallest) value of $\beta$? – user7407311 Dec 04 '19 at 03:57
  • The matrix norm is essentially defined to be the smallest (i.e. best) $\beta > 0$ such that $\| A x\| \le \beta \|x\|$ for all $x$, so yes, this gives the best $\beta$. If you are considering the Euclidean norm on $\mathbb R^2$, then here $\| A \|$ denotes the matrix $2$-norm. If you are considering some other norm, that will change, but it seems you are considering the Euclidean norm here. – User8128 Dec 04 '19 at 04:54
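
The value $6$ from the comments can be sanity-checked numerically. The following is a minimal sketch in plain Python and is not part of the original thread. The helper names `grad_f`, `spectral_norm_2x2`, `ratio` and `random_point`, as well as the sampling range and seed, are illustrative assumptions. It computes the matrix $2$-norm of $A$ from the eigenvalues of $A^T A$, samples random pairs of points to confirm that the ratio from the question stays at or below that value, and evaluates the ratio along the direction $(1, 1)$.

```python
import math
import random

# Matrix of the linear map grad f(p) = A p, as given in the answer's hint.
A = [[5.0, 1.0], [1.0, 5.0]]


def grad_f(p):
    """Gradient of f(x, y) = 5/2 x^2 + x y + 5/2 y^2 at the point p = (x, y)."""
    x, y = p
    return (A[0][0] * x + A[0][1] * y, A[1][0] * x + A[1][1] * y)


def spectral_norm_2x2(M):
    """Largest singular value of a 2x2 matrix: sqrt of the top eigenvalue of M^T M."""
    (p, q), (r, s) = M
    # Entries of the symmetric matrix M^T M = [[g11, g12], [g12, g22]].
    g11 = p * p + r * r
    g12 = p * q + r * s
    g22 = q * q + s * s
    half_trace = (g11 + g22) / 2.0
    det = g11 * g22 - g12 * g12
    # Closed-form largest eigenvalue of a symmetric 2x2 matrix.
    disc = max(half_trace * half_trace - det, 0.0)
    return math.sqrt(half_trace + math.sqrt(disc))


def ratio(a, b):
    """||grad f(a) - grad f(b)|| / ||a - b|| in the Euclidean norm (requires a != b)."""
    ga, gb = grad_f(a), grad_f(b)
    num = math.hypot(ga[0] - gb[0], ga[1] - gb[1])
    den = math.hypot(a[0] - b[0], a[1] - b[1])
    return num / den


def random_point():
    """Hypothetical sampler: a point drawn uniformly from [-10, 10]^2."""
    return (random.uniform(-10.0, 10.0), random.uniform(-10.0, 10.0))


beta = spectral_norm_2x2(A)

random.seed(0)  # arbitrary seed, only for reproducibility
max_ratio = max(ratio(random_point(), random_point()) for _ in range(1000))

# Ratio along the eigenvector direction (1, 1), where the bound should be tight.
tight_ratio = ratio((1.0, 1.0), (0.0, 0.0))

print("beta:", beta)
print("largest sampled ratio:", max_ratio)
print("ratio along (1, 1):", tight_ratio)
```

Random sampling only illustrates the bound and does not prove it. The proof is the inequality $\|A(x - y)\| \le \|A\|\,\|x - y\|$ from the answer.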