
Suppose we know that an estimator $\hat{\theta}_n$, which is a function of a random sample $X = (X_1, \dots, X_n)$, converges in probability to some constant $\theta$, i.e., $$\forall \varepsilon > 0: \lim_{n \rightarrow \infty} P( | \hat{\theta}_n - \theta | > \varepsilon ) = 0$$

Now, we perform bootstrapping, i.e., resampling with replacement from $X$. In doing so, we treat the data as given, and thus $\hat{\theta}_n$ becomes a non-random sequence conditional on $X$.

Can we show that $\hat{\theta}_n$ given $X$ converges to $\theta$ in a deterministic sense? That is, $$\lim_{n \rightarrow \infty} \hat{\theta}_n = \theta$$

The reason for asking this question is that I want to use some properties of $\theta$ in a proof about a bootstrapped test statistic. Any help is appreciated.
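To make the premise concrete, here is a minimal Monte Carlo sketch of convergence in probability. It uses the sample mean of Uniform(0, 1) draws as a hypothetical stand-in for $\hat{\theta}_n$ (so $\theta = 0.5$); all names are illustrative. The estimated exceedance probability $P(|\hat{\theta}_n - \theta| > \varepsilon)$ shrinks as $n$ grows:

```python
# Monte Carlo illustration of convergence in probability.
# Stand-in estimator: the sample mean of Uniform(0, 1) draws, so theta = 0.5.
import random

random.seed(0)

def prob_outside(n, eps=0.05, reps=2000):
    """Estimate P(|theta_hat_n - theta| > eps) from `reps` simulated samples."""
    theta = 0.5
    hits = 0
    for _ in range(reps):
        theta_hat = sum(random.random() for _ in range(n)) / n
        if abs(theta_hat - theta) > eps:
            hits += 1
    return hits / reps

probs = [prob_outside(n) for n in (10, 100, 1000)]
print(probs)  # the exceedance probability decreases toward 0 as n grows
```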

Lime91
  • Doesn't the resampling still give you randomness? Suppose you sampled one draw with the value $0$ and one with $1$. Conditional on that, you get an iid random sequence of 0s and 1s and not something deterministic. – Michael Greinecker Aug 14 '23 at 20:18
  • @MichaelGreinecker In the resampling world, $\hat{\theta}_n$ becomes a deterministic value when computed on the full original sample. This is what I mean by "conditional on $X$". I'm sorry for causing confusion around this. – Lime91 Aug 14 '23 at 22:10
  • So $X$ should represent an infinite sequence of realizations? – Michael Greinecker Aug 14 '23 at 23:12
  • Yes, we should regard $X$ as an infinite sequence of realizations, growing in size. – Lime91 Aug 14 '23 at 23:28
  • The bootstrap thing just provides background for my question since it is common to condition on the observed data there. The real question is what happens to this probabilistic convergence when we do so. – Lime91 Aug 15 '23 at 07:56

1 Answer


I don't think you can show this in general. Bootstrapping simply replaces the true distribution $F$ with the empirical distribution $F_n$, which is a discrete distribution.

In this case, $q = \hat \theta_n(X)$ is taken to be the true underlying value of $\theta$ in the bootstrap world. If we take larger and larger bootstrap samples $B_i$ from $F_n$, we would expect the estimator $\hat \theta_n(B_i)$ to converge in probability to $q$.

What you are looking for is that the sequence $\hat \theta_n(B_i)$ converges to $q$ almost surely. I think this can be proved in some cases, but it is not implied by the convergence in probability of the estimator itself.
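For intuition, here is a small sketch of this setting (illustrative Python, with the sample mean as the estimator): the data $X$ is held fixed, $q = \hat \theta_n(X)$ is a plain number, and bootstrap estimates computed on ever larger resamples concentrate around $q$, yet each resample is still random, so the convergence is probabilistic rather than deterministic.

```python
# Sketch: hold the observed data X fixed; bootstrap estimates still vary
# randomly, but concentrate around q = theta_hat_n(X) as the resample grows.
import random

random.seed(1)
X = [random.gauss(0.0, 1.0) for _ in range(500)]  # observed data, held fixed
q = sum(X) / len(X)  # theta_hat_n(X): a plain number once X is given

def bootstrap_mean(m):
    """Sample mean of one bootstrap resample of size m (with replacement)."""
    return sum(random.choice(X) for _ in range(m)) / m

spreads = {}
for m in (10, 100, 10000):
    draws = [bootstrap_mean(m) for _ in range(200)]
    spreads[m] = max(abs(d - q) for d in draws)  # worst deviation from q
    print(m, round(spreads[m], 4))  # shrinks with m, but never exactly 0
```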

  • Thanks for your answer. Would you say that, in this situation, almost sure convergence is the closest we can get to deterministic convergence of a sequence? – Lime91 Aug 16 '23 at 09:27
  • Almost sure convergence as a probabilistic counterpart to standard deterministic convergence as it is taught in fundamental calculus courses? – Lime91 Aug 16 '23 at 09:29
  • @Lime91 Yes, the closest you can get to the calculus notion of convergence is almost sure convergence -- which means that almost every sequence of estimators converges in the classical sense. We say "almost sure" because we allow for the possibility that there is a set of sequences that do not converge, but this set has probability 0. – Aug 16 '23 at 14:12
  • @Lime91 Also, from the title, I'm not sure which sequence is supposed to be deterministic -- your sampling and bootstrap methods both involve randomness. – Aug 16 '23 at 14:12
  • The background is that I want to show that some bootstrap statistic converges (in probability) to its non-bootstrap counterpart. In my case, $\hat{\theta}_{n}(X)$ happens to be the population variance of the bootstrap statistic. Asymptotically, I need this variance to match the population variance of the non-bootstrap statistic. From a bootstrap perspective, we usually consider $X$ as given, i.e., non-random. So I thought that I needed to show deterministic convergence here, which is hard as I don't know the limit $\theta$. – Lime91 Aug 16 '23 at 17:25
  • Or, would you say that nothing is to show in my case as $\hat{\theta}_n \overset{P}{\rightarrow} \theta$ is already established? If so, why can we omit the bootstrap perspective "given the data $X$" here? – Lime91 Aug 16 '23 at 17:28
  • @Lime91 X is given in the sense that your bootstrap samples are selected from X. However, this isn't deterministic -- each sample is a random draw of n points from the data X -- so I don't know what sequence you are thinking of that is deterministic. – Aug 17 '23 at 00:20
  • @Lime91 In general, your bootstrap statistic will not equal the population statistic -- are you imagining a sequence of bootstrap statistics calculated from ever larger samples? If so, that has nothing to do with the bootstrap; you are just calculating a sequence of statistics. – Aug 17 '23 at 01:22
  • Thanks for your input, we're having a misunderstanding here :-) I'm wondering about the convergence of the population variance of the bootstrap statistic, which equals $\hat{\theta}_n(X)$ in my case. At least asymptotically, the bootstrap statistic's distribution should match the non-bootstrap statistic's distribution. Otherwise, I guess the bootstrap wouldn't be a sensible way of statistical testing. – Lime91 Aug 17 '23 at 07:34
  • So indeed, we're imagining sequences of "growing" $X$, for which we'd like to show that some convergence $\hat{\theta}_n(X) \rightarrow \theta$ holds. I guess we're done when we can show that this convergence holds for each sequence. The question remains: what type of convergence do we need? – Lime91 Aug 17 '23 at 07:46
  • @Lime91 Gotcha -- so this is just about convergence of estimators (the bootstrap really doesn't come into play). Are you asking whether convergence in probability implies deterministic convergence of each sequence of realizations? If so, the answer is no -- convergence in probability is a weaker guarantee (look up strong vs. weak LLN). The closest is almost sure convergence, which says that "almost all" such sequences do converge in the deterministic sense (we have to allow for a subset of sequences with probability zero, hence "almost sure"). – Aug 17 '23 at 16:15
  • Now we are talking on common ground :-) Yes, this was exactly my question and thanks for answering it here. However, this immediately gives rise to another question: namely, do I need deterministic convergence at all to show that the bootstrap statistic distribution asymptotically matches the non-bootstrap one? Wouldn't it be weird to say that some distribution converges in probability (or a.s.) to another one? – Lime91 Aug 17 '23 at 18:48
  • @Lime91 No, you don't. If all you want to show is that the bootstrap distribution of a statistic converges to the distribution of the statistic when sampled from the true population, then you only need to show that the sampling distribution of the bootstrapped statistic converges in distribution to the sampling distribution of the statistic from the true population. – Aug 17 '23 at 21:15
  • Interesting. I'm not used to the concept of distributions converging in distribution. I've only had random variables converging in some probabilistic sense so far. Essentially, however, empirical distributions are random variables. I guess this is the key to understanding here. Still, it's not easy to find the correct point of view, as the empirical distribution can be perceived as non-random when the data $X$ is given. I'll have to let that sink in for a while. – Lime91 Aug 18 '23 at 07:15
  • @Lime91 The difference is between a statistical view and a probabilistic view. In probability, we deal with distributions and convergence. However, after collecting data, all the probability/randomness goes away. Random variables become numbers, and empirical distributions become specific (step) functions. The connection between these two views is what drives inference.

    There is a very powerful theorem that basically justifies why samples of a population are generally a great way to understand any aspect of the population: https://en.wikipedia.org/wiki/Glivenko%E2%80%93Cantelli_theorem

    –  Aug 18 '23 at 12:51
  • @Lime91 Also, if this helps clarify, you can revise my earlier sentence from

    "you only need to show that the sampling distribution of the bootstrapped statistic converges in distribution to the sampling distribution of the statistic from the true population."

    to

    "you only need to show that the bootstrapped statistic $\hat \theta_n^*$ converges in distribution to the statistic formed from the true population, $\hat \theta_n$."

    I.e., $\hat \theta_n^* \xrightarrow{d} \hat \theta_n$

    –  Aug 18 '23 at 12:56
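The Glivenko–Cantelli theorem linked in the comments can also be illustrated numerically. A minimal sketch (using Uniform(0, 1), whose CDF is simply $F(x) = x$, purely for convenience): the sup-distance $\sup_x |F_n(x) - F(x)|$ shrinks toward 0 as the sample size grows.

```python
# Illustration of Glivenko-Cantelli: sup_x |F_n(x) - F(x)| -> 0.
# True distribution: Uniform(0, 1), whose CDF is F(x) = x.
import random

random.seed(2)

def sup_distance(n):
    """Kolmogorov-Smirnov distance between F_n and F for a sample of size n."""
    xs = sorted(random.random() for _ in range(n))
    # For sorted data, the supremum is attained at a data point, where F_n
    # jumps from i/n to (i+1)/n while F(x_i) = x_i.
    return max(max((i + 1) / n - x, x - i / n)
               for i, x in enumerate(xs))

dists = {n: sup_distance(n) for n in (10, 100, 10000)}
print(dists)  # the sup-distance shrinks toward 0 as n grows
```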