
Let $e_1,...,e_n$ denote the canonical basis vectors in $\mathbb{R}^n$. Consider the set $$T = \left \{ \frac{e_k}{\sqrt{1 + \log k}}, k =1,...,n\right \}$$ Show that $$\int_{0}^\infty \sqrt{\log \mathcal{N}(T,d,\epsilon)} d\epsilon \rightarrow\infty$$ as $n \rightarrow \infty$.

Here $\mathcal{N}(T,d,\epsilon)$ is the size of the smallest $\epsilon$-net of $T$ with respect to the metric $d$.

An $\epsilon$-net in this context is a set $N \subset T$ such that for all $t \in T$ there is an $s \in N$ such that $d(s,t) \leq \epsilon$.

I assume that $d(x,y) = ||x-y||_2$, although this isn't explicitly stated. Also note that $\epsilon$-nets must be subsets of $T$.

I also write $\log$ when really I mean $\ln$.


I've spent quite a bit of time working on this exercise and, from what I can tell, unless I've missed the trick completely, this example does not actually work. Here is my argument.

For convenience, let $v_k = \frac{e_k}{\sqrt{1+\log k}}$.

Let $f_n:\{1,...,n-1\} \rightarrow \mathbb{R}_+$ be defined as $$f_n(k) = \sqrt{\frac{1}{1+\log k} + \frac{1}{1+\log n}}$$ Note that $f_n$ is decreasing in $k$. It is easy to see that for $1 \leq j < k \leq n$ we have $$||v_j -v_k||_2 = \sqrt{\frac{1}{1+\log j}+\frac{1}{1+\log k}} \geq \sqrt{\frac{1}{1+\log j}+\frac{1}{1+\log n}} = ||v_j - v_n||_2 =f_n(j)$$

Now let $1 \leq k \leq n-2$ and let $\epsilon \in [f_n(k+1),f_n(k))$. Let $N$ be an $\epsilon$-net of $T$. Then $v_j \in N$ for all $j \leq k$, since the distance from $v_j$ to its nearest neighbour is $$\min_{i \neq j} ||v_j - v_i||_2 = ||v_j - v_n||_2 = f_n(j) \geq f_n(k) > \epsilon.$$ Therefore $|N| \geq k+1$, since $\{v_1,...,v_k\} \subset N$ and we need at least one more point to cover $v_n$.

Also note that $N = \{v_1,...,v_k,v_n\}$ is an $\epsilon$-net, since $v_n$ covers itself and for all $k+1 \leq j \leq n-1$ we have $$||v_j - v_n||_2 = f_n(j) \leq f_n(k+1) \leq \epsilon.$$ This and the lower bound show that $N$ is best possible and therefore that $\mathcal{N}(T,d,\epsilon) = k+1$.

Also note that for $\epsilon < f_n(n-1)$ the only possible $\epsilon$-net is $T$ itself, and therefore $\mathcal{N}(T,d,\epsilon) = |T| = n$ for $\epsilon < f_n(n-1)$. Similarly, for $\epsilon \geq f_n(1)$ the set $N = \{v_n\}$ is an $\epsilon$-net, so $\mathcal{N}(T,d,\epsilon) = 1$.
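The covering numbers claimed above can be sanity-checked by brute force for a small $n$. The following is only a sketch: the helper names `dist`, `f` and `covering_number` are made up for this check, and the search over all subsets is feasible only for small $n$.

```python
import itertools
import math

def dist(j, k):
    """Euclidean distance between v_j and v_k, where v_k = e_k / sqrt(1 + ln k)."""
    if j == k:
        return 0.0
    return math.sqrt(1.0 / (1.0 + math.log(j)) + 1.0 / (1.0 + math.log(k)))

def f(n, k):
    """f_n(k) = ||v_k - v_n||_2 for 1 <= k <= n-1."""
    return dist(k, n)

def covering_number(n, eps):
    """Smallest size of a subset of T that is an eps-net of T (exhaustive search)."""
    points = range(1, n + 1)
    for size in range(1, n + 1):
        for net in itertools.combinations(points, size):
            if all(min(dist(t, s) for s in net) <= eps for t in points):
                return size
    return n

n = 8
for k in range(1, n - 1):  # k = 1, ..., n-2
    eps = 0.5 * (f(n, k + 1) + f(n, k))  # a point strictly inside [f_n(k+1), f_n(k))
    print(k, covering_number(n, eps))    # the argument above predicts k + 1
```
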

We next put these bounds on $\mathcal{N}(T,d,\epsilon)$ together. First we can use what happens when $\epsilon \geq f_n(1)$, where $\mathcal{N}(T,d,\epsilon) = 1$. \begin{align} \int_0^\infty \sqrt{\log \mathcal{N}(T,d,\epsilon)}d\epsilon &= \int_0^{f_n(1)} \sqrt{\log \mathcal{N}(T,d,\epsilon)}d\epsilon + \int_{f_n(1)}^\infty \sqrt{\log \mathcal{N}(T,d,\epsilon)}d\epsilon \\ &= \int_0^{f_n(1)} \sqrt{\log \mathcal{N}(T,d,\epsilon)}d\epsilon + \int_{f_n(1)}^\infty \sqrt{\log 1}d\epsilon \\ &= \int_0^{f_n(1)} \sqrt{\log \mathcal{N}(T,d,\epsilon)}d\epsilon \end{align}

Next we split the integral up \begin{align} \int_0^{f_n(1)} \sqrt{\log \mathcal{N}(T,d,\epsilon)}d\epsilon &= \int_0^{f_n(n-1)} \sqrt{\log \mathcal{N}(T,d,\epsilon)}d\epsilon + \sum_{i=1}^{n-2} \int_{f_n(i+1)}^{f_n(i)} \sqrt{\log \mathcal{N}(T,d,\epsilon)}d\epsilon \\ &= \int_0^{f_n(n-1)} \sqrt{\log(n)}d\epsilon + \sum_{i=1}^{n-2} \int_{f_n(i+1)}^{f_n(i)} \sqrt{\log(i+1)}d\epsilon \\ &= f_n(n-1)\sqrt{\log n} + \sum_{i=1}^{n-2} [f_n(i) - f_n(i+1)]\sqrt{\log(i+1)} \end{align}

So far this is all an exact equality. Substituting the definition of $f_n$ into this expression gives $$\sqrt{\frac{\log n}{1+\log (n-1)} + \frac{\log n}{1+\log n}} + \sum_{i=1}^{n-2}\left(\sqrt{\frac{\log(i+1)}{1+\log i } + \frac{\log(i+1)}{1+\log n}}-\sqrt{\frac{\log(i+1)}{1+\log(i+1)} + \frac{\log(i+1)}{1+\log n}}\right)$$ Numerically, however, this seems to converge to less than 3. So either the series grows incredibly slowly or it fails to grow to infinity.
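For reference, here is a minimal sketch of how the expression above might be evaluated numerically. The function name `entropy_integral` is made up for illustration, and the code assumes the covering numbers derived above are correct and that $n \geq 3$.

```python
import math

def f(n, k):
    """f_n(k) = sqrt(1/(1 + ln k) + 1/(1 + ln n))."""
    return math.sqrt(1.0 / (1.0 + math.log(k)) + 1.0 / (1.0 + math.log(n)))

def entropy_integral(n):
    """f_n(n-1) sqrt(ln n) + sum_{i=1}^{n-2} [f_n(i) - f_n(i+1)] sqrt(ln(i+1))."""
    total = f(n, n - 1) * math.sqrt(math.log(n))
    for i in range(1, n - 1):  # i = 1, ..., n-2
        total += (f(n, i) - f(n, i + 1)) * math.sqrt(math.log(i + 1))
    return total

for n in (10, 10**2, 10**3, 10**4, 10**5, 10**6):
    print(n, round(entropy_integral(n), 4))
```

Double-precision floats are more than enough here; the loop for $n = 10^6$ takes on the order of a second in plain Python.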


Any help would be greatly appreciated! Also, replacing $||\cdot||_2$ with $||\cdot||_1$ doesn't appear to help.

user135520
  • Jesus, that last thing u get looks painful to estimate (i tried for a bit). It is possible it grows like $\log \log n$. It takes a long time for $\log\log n$ to exceed $3$! – mathworker21 Apr 21 '22 at 01:36

3 Answers

3

Using the hint on p.269, (the first $m$ vectors in $T$ form a $ 1/\sqrt{\ln(m)}$-separated set), let $v_m:= 1/\sqrt{\ln(em)}$, and $u_n:=\ln(n)/\ln(en)$. Then, assuming that $n\ge 3$, \begin{align} \int_0^{\infty}\sqrt{\ln \mathcal{N}(T,d,\varepsilon)}\,d\varepsilon&\ge \int_0^{v_{n}}\sqrt{\ln n}\,d\varepsilon +\sum_{i=1}^{n-2} \int_{v_{n-i+1}}^{v_{n-i}}\sqrt{\ln(n-i)}\,d\varepsilon \\ &=\sqrt{u_n}+\sum_{i=2}^{n-1}\left(\sqrt{\frac{\ln(i)}{\ln(ei)}}-\sqrt{\frac{\ln(i)}{\ln(e(i+1))}}\right) \\ &\ge \sqrt{u_n}+\int_2^{n-1}\left(\sqrt{\frac{\ln(x)}{\ln(ex)}}-\sqrt{\frac{\ln(x)}{\ln(e(x+1))}}\right)dx \\ &\ge \sqrt{u_n}+\int_2^{n-1}\frac{1}{e(x+1)\ln(e(x+1))}\,dx \\ &\ge e^{-1}\ln(\ln(n))\to\infty \end{align} as $n\to\infty$.

  • Consider for example $m = 10, n=10000$. Then the covering number for $\epsilon = 1/\sqrt{\log 10}$ of $\{v_1,...,v_{10}\}$ is $10$, but for $\{v_1,...,v_{10},v_{10000}\}$ it is only $8$. I believe this breaks your argument. – Matt Werenski Apr 21 '22 at 20:18
  • What are $v_1,\ldots,v_{10000}$? –  Apr 21 '22 at 20:25
  • In the original post they are defined by $v_i = \frac{e_i}{\sqrt{1 + \log i}}$ where $e_i$ is the $i$'th standard basis vector of $\mathbb{R}^n$. – Matt Werenski Apr 21 '22 at 20:34
  • You're right. I missed the "1+" term. It becomes even worse as $n$ grows. –  Apr 21 '22 at 21:47
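
The counterexample discussed in these comments can be checked numerically. This is only a sketch, with `dist` a made-up helper that computes $\|v_j - v_k\|_2$ from the definition in the question.

```python
import math

def dist(j, k):
    """||v_j - v_k||_2 for j != k, with v_i = e_i / sqrt(1 + ln i)."""
    return math.sqrt(1.0 / (1.0 + math.log(j)) + 1.0 / (1.0 + math.log(k)))

eps = 1.0 / math.sqrt(math.log(10))

# Among v_1, ..., v_10 alone: if even the closest pair is farther apart than eps,
# every point has to belong to the net, so the covering number is 10.
closest = min(dist(j, k) for j in range(1, 11) for k in range(j + 1, 11))
print(closest > eps)

# After adding v_10000: list the indices j <= 10 that v_10000 covers by itself.
covered = [j for j in range(1, 11) if dist(j, 10000) <= eps]
print(covered)
```

The points that `v_10000` covers no longer need to be in the net themselves, which is how the covering number can drop below 10.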
2

After a long think I've also come to what I feel is the cleanest solution to this problem so far.

It starts with Lemma 4.2.8, which states for any set $T$ that $$\mathcal{P}(T,2\varepsilon) \leq \mathcal{N}(T,\varepsilon) \leq \mathcal{P}(T,\varepsilon).$$ The lower bound is the one of interest. Plugging this into Dudley's integral gives \begin{align} \int_0^\infty \sqrt{\log(\mathcal{N}(T,\varepsilon))} d\varepsilon &\geq \int_0^\infty \sqrt{\log(\mathcal{P}(T,2\varepsilon))} d\varepsilon \\ &= \frac{1}{2}\int_0^\infty \sqrt{\log(\mathcal{P}(T,\varepsilon))} d\varepsilon \end{align}

Now we can lower bound $\mathcal{P}(T,\varepsilon)$ in the following way. Recall the definition $$v_k = \frac{e_k}{\sqrt{1+\log(k)}}$$ and therefore for $k > j \geq 1$ we have \begin{align} \|v_j - v_k\|_2 &= \sqrt{\frac{1}{1+\log(j)} + \frac{1}{1+\log(k)}} \\ &> \sqrt{\frac{1}{1+\log(k)} + \frac{1}{1+\log(k)}} &(k > j)\\ &= \sqrt{\frac{2}{1+\log(k)}} \end{align} Now set this equal to $\varepsilon$ and solve \begin{align} \varepsilon = \sqrt{\frac{2}{1+\log(k)}} \iff k= \exp\left ( \frac{2}{\varepsilon^2} -1\right ) \end{align} In particular this shows that with $k(\varepsilon) = \left \lfloor \exp\left ( \frac{2}{\varepsilon^2} -1\right )\right \rfloor$ the vectors $v_1,...,v_{k(\varepsilon)}$ are $\varepsilon$-separated. This establishes \begin{equation} \mathcal{P}(T,\varepsilon) \geq k(\varepsilon) = \left \lfloor \exp\left ( \frac{2}{\varepsilon^2} -1\right )\right \rfloor \end{equation}

To clean up some of the calculations I will use the bound, valid for all $\varepsilon \leq 1/2$, $$\left \lfloor \exp\left ( \frac{2}{\varepsilon^2} -1\right )\right \rfloor \geq \exp\left( \frac{1}{\varepsilon^2} \right).$$ I'll prove this at the end.
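Using the bounds we have shown above we have \begin{align} \int_0^\infty \sqrt{\log(\mathcal{P}(T,\varepsilon))} d\varepsilon &\geq \int_0^{1/2} \sqrt{\log(\mathcal{P}(T,\varepsilon))} d\varepsilon \\ &\geq \int_0^{1/2} \sqrt{\log(k(\varepsilon))} d\varepsilon \\ &\geq \int_0^{1/2} \sqrt{\log(e^{1/\varepsilon^2})} d\varepsilon \\ &= \int_0^{1/2} \frac{1}{\varepsilon} d\varepsilon = +\infty, \end{align} which shows the result for the infinite family. For finite $n$, the vectors $v_1,...,v_{k(\varepsilon)}$ all lie in $T$ only when $k(\varepsilon) \leq n$, which holds for $\varepsilon \geq \varepsilon_n := \sqrt{2/(1+\log(n))}$. Restricting the integral to $[\varepsilon_n, 1/2]$ gives the lower bound $$\int_{\varepsilon_n}^{1/2} \frac{1}{\varepsilon} d\varepsilon = \frac{1}{2}\log\left(\frac{1+\log(n)}{8}\right),$$ which tends to $+\infty$ as $n \to \infty$.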


Here's the proof of the inequality. For $\varepsilon \leq 1/2$ we have $1/\varepsilon^2 \geq 1/(1/2)^2 = 4$, so \begin{align} \left \lfloor \exp\left( \frac{2}{\varepsilon^2} - 1 \right ) \right \rfloor &\geq \left \lfloor \exp\left( \frac{1}{\varepsilon^2} + \frac{1}{(1/2)^2} - 1\right ) \right \rfloor \\ &= \left \lfloor \exp\left( \frac{1}{\varepsilon^2} + 3 \right ) \right \rfloor = \left \lfloor e^{3} \exp\left( \frac{1}{\varepsilon^2} \right ) \right \rfloor \end{align} To conclude, note that $\exp\left( \frac{1}{\varepsilon^2} \right ) \geq e^4 > 1$ and $e^3 > 2$, and use the inequality, valid for all $a > 1, b > 2$, $$\lfloor ba \rfloor \geq a.$$
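
The inequality can also be spot-checked on a grid. This is just a sketch: the grid stops at $\varepsilon = 0.06$ because `math.exp` overflows for much smaller values, and `lhs`, `rhs` are names chosen here.

```python
import math

def lhs(eps):
    """floor(exp(2/eps^2 - 1)), returned as an exact Python integer."""
    return math.floor(math.exp(2.0 / eps**2 - 1.0))

def rhs(eps):
    """exp(1/eps^2)."""
    return math.exp(1.0 / eps**2)

grid = [0.06 + 0.001 * i for i in range(441)]  # 0.06, 0.061, ..., 0.50
print(all(lhs(eps) >= rhs(eps) for eps in grid))
```
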


I really appreciate the high quality answers from ExcitedMathematician and user140541! Both inspired me to find this solution.

1

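Here's an alternative solution. Since a single point is an $\epsilon$-net whenever $\epsilon \geq diam(T)$, the integrand vanishes there and

$$\int_{0}^{\infty} \sqrt{\log{N(T,\epsilon)}} d\epsilon = \int_{0}^{diam(T)} \sqrt{\log{N(T,\epsilon)}} d\epsilon$$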

Make the change of variables $\epsilon = \frac{1}{2\sqrt{\log{t}}}$, so that $d\epsilon = -\frac{dt}{4t(\log{t})^{3/2}}$, and write $d(T) = diam(T)$. Using $N(T,\epsilon) \geq P(T,2\epsilon)$, where $P$ is the packing number, we get

$$\int_{0}^{diam(T)} \sqrt{\log{N(T,\epsilon)}} d\epsilon = \int_{e^{1/(4d(T)^{2})}}^{\infty} \sqrt{\log{N\left(T,\frac{1}{2\sqrt{\log{t}}}\right)}} \frac{dt}{4t (\log{t})^{3/2}} \geq \int_{e^{1/(4d(T)^{2})}}^{\infty} \sqrt{\log{P\left(T,\frac{1}{\sqrt{\log{t}}}\right)}} \frac{dt}{4t (\log{t})^{3/2}}$$

Now, note that $P$ is a decreasing function of its second argument, and that for $3 \leq m \leq n$ the first $m$ vectors of $T$ are $1/\sqrt{\log{m}}$-separated, so $P\left(T,\frac{1}{\sqrt{\log{t}}}\right) \geq m$ for $t \in [m, m+1]$. Since $d(T) > 1$ gives $e^{1/(4d(T)^{2})} < 3$, for $n \geq 3$ we have

$$\int_{0}^{diam(T)} \sqrt{\log{N(T,\epsilon)}} d\epsilon \geq \frac{1}{4}\sum_{m = 3}^{n} \int_{m}^{m + 1}\frac{\sqrt{\log{m}}}{t (\log{t})^{3/2}}dt \geq \frac{1}{4}\sum_{m = 3}^{n} \int_{m}^{m + 1}\frac{(\log{m})^{1/2}}{(m+1)(\log{(m+1)})^{3/2}}dt$$

Moreover, using $m + 1 \leq 2m$, $$\int_{0}^{diam(T)} \sqrt{\log{N(T,\epsilon)}} d\epsilon \geq \frac{1}{4}\sum_{m = 3}^{n} \frac{(\log{m})^{1/2}}{(m+1)(\log{(m+1)})^{3/2}} \geq \frac{1}{8} \sum_{m = 3}^{n}\frac{(\log{m})^{1/2}}{m(\log{(m+1)})^{3/2}}$$

As $n \to \infty$ the right-hand side tends to the corresponding infinite series, so it suffices to show that $\sum_{m \geq 2}\frac{(\log{m})^{1/2}}{m(\log{(m+1)})^{3/2}}$ diverges. Constant factors do not affect divergence, so we apply Cauchy's condensation test to $\frac{1}{2}$ times this series:

$$\frac{1}{2}\sum_{m \geq 1}\frac{\sqrt{m} \sqrt{\log{2}}}{(\log{(2^{m} +1)})^{3/2}} \geq \frac{1}{2}\sum_{m \geq 1}\frac{\sqrt{m} \sqrt{\log{2}}}{(\log{(2^{2m})})^{3/2}} = \frac{1}{2}\sum_{m \geq 1}\frac{\sqrt{m} \sqrt{\log{2}}}{(2m\log{2})^{3/2}} = \frac{1}{2^{5/2}\log{2}} \sum_{m\geq 1}\frac{1}{m}$$

This is a multiple of the harmonic series so it diverges! (I believe it should probably apply to your last series)
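
To get a feel for how slowly this series diverges, here is a small sketch that prints partial sums of $\sum_{m \geq 2} \frac{(\log m)^{1/2}}{m(\log(m+1))^{3/2}}$ next to $\log\log$ of the cutoff. The names `term` and `partial_sum` are made up, and the $\log\log$ column is only a heuristic comparison based on the terms behaving like $1/(m \log m)$.

```python
import math

def term(m):
    """(ln m)^(1/2) / (m (ln(m+1))^(3/2)) for m >= 2."""
    return math.sqrt(math.log(m)) / (m * math.log(m + 1) ** 1.5)

def partial_sum(upper):
    """Sum of term(m) for m = 2, ..., upper."""
    return sum(term(m) for m in range(2, upper + 1))

for upper in (10**2, 10**4, 10**6):
    print(upper, round(partial_sum(upper), 4), round(math.log(math.log(upper)), 4))
```

Squaring the cutoff should add roughly $\log 2$ to the partial sum, which matches the $\log\log n$ growth suggested in the comments on the question.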

As a result

$$ \int_{0}^{\infty} \sqrt{\log{N(T,\epsilon)}} d\epsilon \to \infty \quad \text{as } n \to \infty $$