As a consequence of the Talagrand concentration inequality, it is well known that for a measurable space $(S,\mathcal{S})$ and an i.i.d. sample $X_1,...,X_n$ of $S$-valued random variables, if $\mathcal{F}$ is a countable set of measurable functions with $\|f\|_\infty\leq U$ and $\mathbb{E}f(X_1)=0$ for every $f\in\mathcal{F}$, then \begin{equation} \mathbb{P}\left(\sup_{f\in\mathcal{F}}|S_n(f)|\geq\mathbb{E}\sup_{f\in\mathcal{F}}|S_n(f)|+t\right)\leq\exp\left(-\frac{t^2}{2\nu_n+2tU/3}\right) \end{equation} where $\nu_n=2U\mathbb{E}\sup_{f\in\mathcal{F}}|S_n(f)|+n\sigma^2$, $\sigma^2=\mathbb{E}\sup_{f\in\mathcal{F}}f^2(X_1)$ and $S_n(f)=\sum_{i=1}^nf(X_i)$. This specific inequality is known as Bousquet’s version of the Talagrand inequality; see for example Theorem 3.3.9 on page 156 in this book.
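To get a feel for the size of the right-hand side of the inequality above, here is a minimal numerical sketch. The function name `bousquet_tail_bound` and its parameter names are my own, and `expected_sup` (the value of $\mathbb{E}\sup_{f\in\mathcal{F}}|S_n(f)|$) is an input the user has to supply or bound separately; the code only evaluates the formula and proves nothing.

```python
import math


def bousquet_tail_bound(t, n, U, sigma2, expected_sup):
    """Evaluate exp(-t^2 / (2*nu + 2*t*U/3)) with nu = 2*U*expected_sup + n*sigma2.

    t            : deviation above the expected supremum (t >= 0)
    n            : sample size
    U            : uniform bound on |f| over the class
    sigma2       : variance proxy sigma^2 from the statement above
    expected_sup : assumed value (or upper bound) of E sup_f |S_n(f)|
    """
    if t < 0:
        raise ValueError("t must be non-negative")
    # Variance factor from the statement of the inequality.
    nu = 2.0 * U * expected_sup + n * sigma2
    denom = 2.0 * nu + 2.0 * t * U / 3.0
    if denom == 0.0:
        # Degenerate case (nu = 0 and t*U = 0): report the trivial bound.
        return 1.0
    return math.exp(-t * t / denom)


if __name__ == "__main__":
    # Hypothetical values, chosen only for illustration.
    for t in (0.0, 10.0, 20.0, 40.0):
        print(t, bousquet_tail_bound(t, n=100, U=1.0, sigma2=0.25, expected_sup=5.0))
```

For small $t$ the bound behaves like a Gaussian tail with variance factor $\nu_n$, while for large $t$ the term $2tU/3$ dominates and the decay is only exponential in $t$.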
Question: In contrast to the above situation, I have a sequence $X_1,...,X_n$ of dependent random variables, in particular $\beta$-mixing random variables (whose mixing coefficients can be assumed to decay at an exponential rate). Therefore, I am asking whether there is an extension of the above inequality to $\beta$-mixing sequences. Here, we can assume that $\mathcal{F}$ is bounded and satisfies nice VC-type properties, including having a bounded uniform entropy integral.
So far, I have found results that give a Talagrand-type bound for the random variable $S_n(f)$ for a single fixed $f\in\mathcal{F}$, for example in this paper, but I am specifically looking for a result that takes the supremum over $\mathcal{F}$ into account.
Another approach would be to combine repeated applications of the Goldstein coupling lemma with a chaining argument, which, if worked out completely, gives an approximate bound of \begin{equation} \mathbb{P}\left(\sup_{f\in\mathcal{F}}|S_n(f)|\geq\mathbb{E}\sup_{f\in\mathcal{F}}|S_n(f)|+t\right)\leq C\log_2(n)\exp\left(-\frac{t^2}{2\nu_n+2tU/3}\right) \end{equation} for a constant $C>0$ if we assume geometric absolute regularity, that is, $\beta(j)=O(\exp(cj))$ for a fixed $c<0$. Please keep in mind that this result is work in progress, hence it could contain mistakes.
The idea is to employ approximations $\pi_k(f)$ of $f$ that satisfy $\|f-\pi_k(f)\|\leq2^{-k}U$ for each $k=0,...,K$ and then to use a telescoping sum to rewrite $S_n(f)$, for a $K$ which depends on the decay rate of the mixing coefficients and satisfies $K\to\infty$ as $n\to\infty$. Besides the approximations to $f$, we also need to account for the approximations made to $X_1,...,X_n$ via the Goldstein lemma, denoted $X_1^k,...,X_n^k$ for $k=1,...,K$, which yields a total chaining argument of \begin{equation} S_n(f-\pi_0(f))=S_n^K(f-\pi_K(f))+(S_n-S_n^0)(f)+\sum_{k=1}^KS_n^{k-1}(\pi_k(f)-\pi_{k-1}(f))+\sum_{k=1}^K(S_n^{k-1}-S_n^k)(f-\pi_k(f)) \end{equation} However, because of the chaining argument, I had to repeatedly split the supremum norm of the entire chain above, which results in the $\log_2(n)$ factor in front of the exponential (because of the exponential decay rate, $K$ can be chosen as $\log_2(n)$). Note here that $2U$ is the diameter of $\mathcal{F}$ in the $\|\cdot\|$-norm and so $f-\pi_0(f)$ can be taken as $f$ without loss of generality.
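As a sanity check on the decomposition above, the two sums can be verified to telescope. This is only a sketch, assuming that each $S_n^k$ is linear in its argument (I read $S_n^k(g)=\sum_{i=1}^ng(X_i^k)$, which is my interpretation of the notation) and using the normalisation $\pi_0(f)=0$ suggested by the last remark: \begin{equation} \begin{aligned} &\sum_{k=1}^KS_n^{k-1}(\pi_k(f)-\pi_{k-1}(f))+\sum_{k=1}^K(S_n^{k-1}-S_n^k)(f-\pi_k(f)) \\ &\quad=\sum_{k=1}^K\left[S_n^{k-1}(f-\pi_{k-1}(f))-S_n^k(f-\pi_k(f))\right]=S_n^0(f-\pi_0(f))-S_n^K(f-\pi_K(f)). \end{aligned} \end{equation} Adding $S_n^K(f-\pi_K(f))$ and $(S_n-S_n^0)(f)$ to both sides leaves $S_n(f)-S_n^0(\pi_0(f))$, which coincides with $S_n(f-\pi_0(f))$ when $\pi_0(f)=0$.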
The idea of using $K$ approximations of the original sequence using the Goldstein lemma comes from ideas in chapter 8.3 of the book Asymptotic Theory of Weakly Dependent Random Processes, specifically proposition 8.2 on page 138 of the same book.
Now intuitively, as $n$ grows, a stationary $\beta$-mixing sample $X_1,...,X_n$ should start to resemble an i.i.d. sample $Y_1,...,Y_n$ with the same unconditional law as $X_1$. Therefore, I hypothesise that, adjusted with a suitable variance factor, we should be able to replicate the bound obtained in the first equation above, without any factor depending on $n$ in front of the exponential. However, so far, I have not been able to find such a result.