13

Suppose $X,X_1,X_2,X_3\dots$ is a $\mathbb{P}$-i.i.d. family of $[-1,1]$-valued random variables with $\mathbb{E}[X] = 0$.

Hoeffding's inequality implies that \begin{equation*} \forall T \in \mathbb{N}, \forall \varepsilon > 0, \qquad \mathbb{P}\bigg[\frac{1}{T} \sum_{s=1}^T X_s \ge \varepsilon\bigg] \le \exp\Big(-\frac{T}{2}\varepsilon^2\Big), \end{equation*} from which it follows that \begin{equation*} \forall T \in \mathbb{N}, \forall \delta \in (0,1), \qquad \mathbb{P}\bigg[\frac{1}{T} \sum_{s=1}^T X_s \ge \sqrt{\frac{2}{T}\log\Big(\frac{1}{\delta}\Big)}\bigg] \le \delta. \end{equation*}

Basically, this result states that the trajectory of the process $Y_t := \frac{1}{t}\sum_{s=1}^t X_s$ is below the threshold $\sqrt{\frac{2}{T}\log\Big(\frac{1}{\delta}\Big)}$ with probability at least $1-\delta$, at any prescribed time $T$.

Instead, suppose that we wanted the whole trajectory $Y_1, Y_2,\dots,Y_T$ under the thresholds $\sqrt{\frac{2}{1}\log\Big(\frac{1}{\delta}\Big)}, \sqrt{\frac{2}{2}\log\Big(\frac{1}{\delta}\Big)}, \dots, \sqrt{\frac{2}{T}\log\Big(\frac{1}{\delta}\Big)}$, where $T \in \mathbb{N}$ is fixed. Can we do any better than using the following union bound

\begin{equation*} \mathbb{P}\bigg[\bigcup_{t\in\{1,\dots,T\}} \bigg\{\frac{1}{t} \sum_{s=1}^t X_s \ge \sqrt{\frac{2}{t}\log\Big(\frac{1}{\delta}\Big)}\bigg\}\bigg] \le \sum_{t=1}^T \mathbb{P}\bigg[\frac{1}{t} \sum_{s=1}^t X_s \ge \sqrt{\frac{2}{t}\log\Big(\frac{1}{\delta}\Big)}\bigg] \le \delta\cdot T ? \end{equation*}

The problem with this last inequality is that a multiplicative factor $T$ appears in the bound, and I feel that maybe, due to something in the spirit of maximal martingale inequalities, we can do much better.

Any suggestion or pointer to the literature is very welcome.

EDIT: I tried to perform some simulations. Taking $X_1,X_2,\dots$ as a $\mathbb{P}$-i.i.d. family of Rademacher random variables seems to suggest that at least something of the form \begin{equation*} \forall T \in \mathbb{N}, \forall \delta \in \Big(0,\frac{1}{10}\Big), \qquad \mathbb{P}\bigg[\bigcup_{t\in\{1,\dots,T\}} \bigg\{\frac{1}{t} \sum_{s=1}^t X_s \ge \sqrt{\frac{2}{t}\log\Big(\frac{1}{\delta}\Big)}\bigg\}\bigg] \le \delta\cdot \log(T) \end{equation*} holds, and even something better than this (maybe, even an upper bound of the form $O(\delta \cdot \log\log T)$).

Bob
  • 5,995
  • The problem with the union bound is that it is really bad when the events are highly correlated (which they are here), it performs much better for independent things. An idea could be, if your union is $\bigcup_{t=1}^T A_t$, to look at instead the union $\bigcup_{k=0}^{\log(T)} A_{2^t}$ and first to argue that if $A_{2^k}$ and $A_{2^{k+1}}$ then chances are that $A_{t}$ for $2^k\leq t < 2^{k+1}$ (and this is because they are highly correlated). And then you can use the union bound on $\bigcup_{k=0}^{\log(T)} A_{2^t}$ and here they are much less correlated. You can also use another function. – P. Quinton May 12 '22 at 06:01

2 Answers2

6

The dependence on $T$ is logarithmic, see $(4)$ below. I could not find a precise reference, so I give an argument below, making sure to get explicit constants. The argument is adapted from the standard proof of the law of the iterated logarithm, that can be found, e.g., in the textbook by Durrett, Probability Theory and examples, and in [1].

Let $S_t:=\sum_{j=1}^t X_j$ and $\lambda:=\sqrt{2\log (1/\delta)}$, so that $e^{-\lambda^2/2}=\delta$. The goal is to estimate $\mathbb{P}(A)$ , where $$A:=\bigcup_{t\in\{1,\dots,T\}} \bigg\{S_t \ge \lambda\sqrt{t} \bigg\} \,.$$ We will assume that $\lambda^2 \ge 2e$, otherwise $\mathbb{P}(A)$ will be close to $1$. We first prove an upper bound. Fix $q=e^\beta>1$ and let $$\tau_n:=\inf\left\{ t \in [q^n, q^{n+1}] :\, S_t\geq \lambda q^{n/2} \right\},$$ with the convention that the infimum of the empty set is $\infty$. Observe that $$A \subset \bigcup_{n=0}^{N-1} \{\tau_n \le q^{n+1}\}\,, \quad \text{where}\quad N=\Bigl\lceil\frac{\log T}{\log q}\Bigr\rceil \,. \tag{1}$$ Next, apply the Azuma-Hoeffding inequality to the martingale $M_t=S_{t\wedge\tau_n}$, to infer that $$\mathbb{P}(\tau_n \le q^{n+1})= \mathbb{P}\Bigl(M_{q^{n+1}} \ge \lambda q^{n/2} \Bigr) \le \exp\Bigl(-\frac{\lambda^2}{2q}\Bigr) \,.$$ Recalling that $q=e^\beta$, we deduce from $(1)$ that $$ \mathbb{P}(A) \le \Bigl\lceil\frac{\log T}{\beta}\Bigr\rceil \cdot \exp\Bigl(-\frac{\lambda^2}{2e^\beta}\Bigr) \,. \tag{2}$$ To minimize the right-hand side, we take $\beta \in (0,1]$ that satisfies $$ {e^\beta}/{\beta}=\lambda^2/2=\log(1/\delta) \,.\tag{3}$$ To get a concrete bound, we note that $(3)$ gives $$ \lambda^2/2 < {1}/{\beta}+2 \,, \quad \text{so}$$ $$\exp\bigl(- {1}/{ \beta}\bigr) < \exp( -\lambda^2/2+2) =e^2\delta \,.$$ Thus $(2)$ yields that $$ \mathbb{P}(A) \le \bigl(1+ \frac{\lambda^2}{2} \log T \bigr) \cdot \exp\Bigl(-\frac{1}{ \beta}\Bigr) < e^2\delta \log(e/\delta) \cdot \log T \,. \tag{4}$$

$ \quad$

To show that the dependence on $T$ is indeed logarithmic (and not $f(\delta)\log \log T$ ), we give a rough lower bound in the special case that $X_i$ are independent Rademacher variables (i.e., take values $\pm 1$), although it works more generally if they satisfy a CLT. We also assume that $1<\lambda<T^{1/8}$, although this can be relaxed as well.

Fix any $\epsilon>0$ and an integer $r>1$. Consider the independent events $$D_n=\left\{ S_{r^n} -S_{r^{n-1}} > \lambda r^{n/2} \right\}.$$ If $r^n >\sqrt{T}$, then by a version of the central limit theorem (That can be found e.g. in the book by Feller) and a lower tail bound for the Gaussian, $$\mathbb{P}(D_n) >\exp\Bigl(-\frac{\lambda^2 r}{2r-3}\Bigr)>\delta^{1+\epsilon} \,,$$ provided that $r$ is large enough so that $r/(r-3)<1+\epsilon$. Let $$N_0=\Bigl\lceil\frac{\log T}{2\log r}\Bigr\rceil \,.$$ If $\delta^{1+\epsilon} N_0<1$, then $$ \mathbb{P}\Bigl(\bigcup_{n=N_0}^{2N_0-2} D_n \Bigr) \ge 1-(1-\delta^{1+\epsilon})^{N_0-1} > (N_0-1)\delta^{1+\epsilon}/2 \,.$$ If $\sigma$ is the last $n \in [N_0,2N_0-2]$ such that $D_n$ holds, (which is well defined if the union in last display is nonempty), then $A$ will hold provided $S_{r^{\sigma-1}} \ge 0$, and the latter event has probability at least $1/2$ even conditional on $\sigma$. Thus we conclude that $$\mathbb{P}(A) \ge (N_0-1)\delta^{1+\epsilon}/4 \,.$$

[1] Brownian motion, by Peter Mörters and Yuval Peres. Cambridge University Press, 2010. https://yuvalperes.com/brownian-motion/

Yuval Peres
  • 22,782
1

I'm not sure whether the following counts as better, but it's easy to get a bound of $O\left(\delta\sqrt{\log(1/\delta)}\cdot \sqrt{T}\right)$.

We have, for any $1 \le t \le T$, $$\mathbb{P}\left[\frac{1}{t}\sum_{s=1}^t X_s \ge \sqrt{\frac{2}{t}\log\left(\frac{1}{10\delta}\right)}\right] \le 10\delta.$$ Note $$\sqrt{\frac{2}{t}\log\left(\frac{1}{\delta}\right)}-\sqrt{\frac{2}{t}\log\left(\frac{1}{10\delta}\right)} = \frac{\frac{2}{t}\log 10}{\sqrt{\frac{2}{t}\log(\frac{1}{\delta})}+\sqrt{\frac{2}{t}\log(\frac{1}{10\delta})}} \sim \frac{c}{\sqrt{t\log(1/\delta)}},$$ (where $c = \sqrt{2}\log 10$, I think). Therefore, if $$\frac{1}{t}\sum_{s=1}^t X_s \ge \sqrt{\frac{2}{t}\log\left(\frac{1}{10\delta}\right)}$$ for some $t \in [T]$, then $$\frac{1}{t'}\sum_{s=1}^{t'} X_s \ge \sqrt{\frac{2}{t'}\log\left(\frac{1}{\delta}\right)}$$ for all $t' \in \left[t,t+\frac{c\sqrt{t}}{\sqrt{\log(1/\delta)}}\right]$. There are $m := C\sqrt{T\log(1/\delta)}$ values of $t$, say $t_1,\dots,t_m$, for which $[T] = \cup_{j=1}^m \left[t_j,t_j+\frac{c\sqrt{t_j}}{\sqrt{\log(1/\delta)}}\right]$. Therefore, \begin{align*} \mathbb{P}\bigg[\bigcup_{t\in\{1,\dots,T\}} \bigg\{\frac{1}{t}\sum_{s=1}^t X_s \ge \sqrt{\frac{2}{t}\log\Big(\frac{1}{\delta}\Big)}\bigg\}\bigg] &\le \mathbb{P}\bigg[\bigcup_{j=1}^m \bigg\{\frac{1}{t_j}\sum_{s=1}^{t_j} X_s \ge \sqrt{\frac{2}{t_j}\log\Big(\frac{1}{10\delta}\Big)}\bigg\}\bigg] \\ &\le m\cdot 10\delta \\ &= O\left(\delta\sqrt{\log(1/\delta)}\cdot \sqrt{T}\right). \end{align*}

mathworker21
  • 35,247
  • 1
  • 34
  • 88