6

This is a question that I posted on stats.stackexchange.com. Since I received no satisfying answer there, but the question was upvoted by many, I want to use the opportunity to extend it and hopefully reach a larger audience. The original question can be found here:

https://stats.stackexchange.com/questions/432396/are-minx-1-ldots-x-n-and-minx-1y-1-ldots-x-ny-n-independent-for-n-to

Assume we are given two continuous iid random variables $X$ and $Y$ with support $[1,c)$, where $c$ is some constant greater than one (the exact value is probably unimportant anyway). Now assume I have iid samples $X_1, \ldots, X_n$ and $Y_1, \ldots, Y_n$ (so absolutely no dependence here).

Imagine that I know that:

$$(1): \mathbb P \left(\frac{\min(X_1,\ldots,X_n)-a_n}{b_n}\leq x_1\right) \sim F(x_1), \text{ for }n \to \infty,$$

where $F(x_1)$ is some non-degenerate cdf. Under some weak conditions, it is usually quite easy to derive the sequences $a_n$, $b_n$ and the limit distribution $F$, since this is closely connected to Extreme Value Theory. Moreover, I know that

$$(2):\mathbb P \left(\frac{\min(X_1Y_1,\ldots,X_nY_n)-\bar a_n}{\bar b_n}\leq x_2\right) \sim G(x_2), \text{ for }n \to \infty,$$

where $G(x_2)$ is again some non-degenerate cdf. Is it true that it then also follows that

$$(3):\mathbb P \left(\frac{\min(X_1,\ldots,X_n)-a_n}{b_n}\leq x_1,\frac{\min(X_1Y_1,\ldots,X_nY_n)-\bar a_n}{\bar b_n}\leq x_2\right) \sim F(x_1) G(x_2),$$

as $n \to \infty$?
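For anyone who wants to play with this numerically, here is a minimal simulation sketch. It assumes, purely for illustration, that $X, Y \sim \mathrm{Uniform}[1,2]$, and compares the joint probability of the two minima with the product of the marginals at empirical quantiles, so no knowledge of the normalizing sequences is needed:

```python
import numpy as np

rng = np.random.default_rng(0)

def joint_vs_product(n, reps=5000, p1=0.3, p2=0.7):
    """Compare P(min X <= q1, min XY <= q2) with the product of the marginal
    probabilities, where q1, q2 are empirical quantiles of the two minima.
    X, Y ~ Uniform[1, 2] iid (purely illustrative choice of distribution)."""
    X = rng.uniform(1.0, 2.0, size=(reps, n))
    Y = rng.uniform(1.0, 2.0, size=(reps, n))
    m1 = X.min(axis=1)          # min(X_1, ..., X_n)
    m2 = (X * Y).min(axis=1)    # min(X_1 Y_1, ..., X_n Y_n)
    q1, q2 = np.quantile(m1, p1), np.quantile(m2, p2)
    joint = np.mean((m1 <= q1) & (m2 <= q2))
    product = np.mean(m1 <= q1) * np.mean(m2 <= q2)
    return joint, product

for n in [10, 100, 1000]:
    joint, product = joint_vs_product(n)
    print(f"n={n:5d}  joint={joint:.3f}  product={product:.3f}")
```

If the conjectured asymptotic independence holds, the two columns should agree (up to Monte Carlo error) for large $n$; any residual dependence shows up as a discrepancy.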

At first I thought this cannot work, since $X$ and $XY$ are obviously strongly dependent. But then I thought the following:

  1. The probability that the minimum of $X_1,\ldots,X_n$ and the minimum of $X_1Y_1,\ldots,X_nY_n$ are attained in the same realization converges to zero as $n \to \infty$ (a quick simulation of this is sketched further below).
  2. Since the sample itself is iid, the two minima should then be more or less independent.

So I don't know whether this is true. Unfortunately, I cannot think of a counterexample, and I also have no idea how to prove it. The only thing I came up with to prove 1. is the following:

The probability that both minima are attained in the same realization is given by

\begin{align*} &\sum_{i=1}^n\mathbb P\big(X_i=\min(X_1,\ldots,X_n), X_iY_i=\min(X_1Y_1,\ldots,X_nY_n)\big) \\ \leq &\sum_{i=1}^n\mathbb P\big(X_i=\min(X_1,\ldots,X_n), Y_i \leq \min(X_1Y_1,\ldots,X_nY_n)\big) \\ = &n \cdot \mathbb P\big(X_i=\min(X_1,\ldots,X_n)\big) \mathbb P\big( Y_i \leq \min(X_1Y_1,\ldots,X_nY_n) \vert X_i=\min(X_1,\ldots,X_n) \big) \\ = &n \cdot 1/n \mathbb P\big( Y_i \leq \min(X_1Y_1,\ldots,X_nY_n) \vert X_i=\min(X_1,\ldots,X_n) \big) \end{align*}

where the latter probability converges to zero, since $\min(X_1Y_1,\ldots,X_nY_n)$ gets arbitrarily close to $1$ as $n \to \infty$ (and the conditioning does not seem to change that). Therefore, the probability that both minima are realized in the same observation is something like $n \cdot 1/n \cdot o(1)=o(1)$, so it converges to zero...
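Here is a minimal simulation sketch of point 1. (again assuming, purely for illustration, that $X, Y \sim \mathrm{Uniform}[1,2]$); it estimates the probability that both minima are attained at the same index:

```python
import numpy as np

rng = np.random.default_rng(1)

def coincidence_prob(n, reps=10000):
    """Estimate P(argmin_i X_i == argmin_i X_i Y_i) for X, Y ~ Uniform[1, 2] iid."""
    X = rng.uniform(1.0, 2.0, size=(reps, n))
    Y = rng.uniform(1.0, 2.0, size=(reps, n))
    return np.mean(X.argmin(axis=1) == (X * Y).argmin(axis=1))

for n in [10, 100, 1000]:
    print(f"n={n:5d}  P(same index) ~ {coincidence_prob(n):.3f}")
```

The estimate should decrease towards zero as $n$ grows, in line with point 1.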

Now the argument above is obviously not a rigorous proof. So, is there anyone smarter or more knowledgeable than me with an idea for a proof, a counterexample, or an explanation of why my idea is wrong?


Since antkam raised a question in the comments, I want to give a brief look into Extreme Value Theory:

Most people probably know the central limit theorem:

$$\frac{S_n-n\mu}{\sqrt n \sigma} \xrightarrow[]{D}N(0,1) $$

Something similar can be obtained for maxima; this is the subject of Extreme Value Theory. It is known that

$$\frac{\vee X-a_n}{b_n}$$

can only converge to one of the three extreme value distributions (or to a constant), where $\vee X= \max(X_1,\ldots,X_n)$. For minima we can use this as well and find limit distributions and sequences such that

$$\frac{\land X-a_n}{b_n}$$ converges to some distribution, where $\land X = \min(X_1,\ldots,X_n)$. Now what you, antkam, probably wanted to say is the following:

If $\frac{\land XY-\bar a_n}{\bar b_n}\leq x_2$, then $\land XY \leq x_2 \bar b_n+\bar a_n$, and therefore in particular $\land X \leq x_2 \bar b_n+\bar a_n$ (since $Y \geq 1$). Now $\frac{\land X- a_n}{ b_n}\leq x_1$ is equivalent to $\land X \leq x_1 b_n + a_n$.

So if $x_2 \bar b_n+\bar a_n \leq x_1 b_n+ a_n$, then $\frac{\land XY-\bar a_n}{\bar b_n}\leq x_2$ already implies $\frac{\land X- a_n}{ b_n}\leq x_1$.

That was, by the way, also how someone wanted to show that the conjecture is incorrect. Solving for $x_1$, we get:

$$ x_1 \geq\frac{ x_2 \bar b_n+\bar a_n- a_n}{b_n}$$

So if, for fixed $x_1$ and $x_2$ with $(1-F(x_1))G(x_2)>0$, $x_1$ is eventually greater than this term, then the conjecture is certainly incorrect. The problem is that we do not know these sequences, and the right-hand side could (and probably will) go to infinity, in which case this argument obviously does not work (a concrete instance is worked out right after the list below). So if you want to disprove it like that, there are basically just two ways I can think of:

  1. You take some specific distributions for $X$ and $Y$, calculate the corresponding sequences, and show that the right-hand side indeed does not go to infinity.

  2. You take some other sequences $a_n$, $b_n$, $\bar a_n$, $\bar b_n$ (I never said that it only works for sequences that yield a nice limit distribution), but then $\frac{\land X-a_n}{b_n}$ would only converge in probability to some constant, or go to $\infty$ or $-\infty$, or something like that; so the limiting cdf of $\frac{\land X-a_n}{b_n}$, or of $\frac{\land XY-\bar a_n}{\bar b_n}$, would only take the values 0 and 1...
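To make the issue concrete (using, just as an illustration, the normalizing sequences that come out of the uniform example in the answer below, namely $a_n=\bar a_n=1$, $b_n=1/n$ and $\bar b_n=\sqrt{2/n}$): the right-hand side of the bound becomes

$$\frac{x_2\bar b_n + \bar a_n - a_n}{b_n} = x_2\sqrt{2/n}\cdot n = x_2\sqrt{2n} \xrightarrow[n \to \infty]{} \infty$$

for every fixed $x_2>0$, so no fixed pair $(x_1,x_2)$ yields a counterexample along these lines.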

Hope this helps some people to understand the problem a bit better... :-)


@Sangchul Lee: Thank you very much for your answer; this is actually very interesting, because for the uniform distribution both $X-1$ and $\ln(X)$ are regularly varying at zero with exponent $\alpha=1$.

This is equivalent to $1/(X-1)$ or $1/\ln(X)$ being regularly varying at infinity with exponent $-\alpha=-1$. Using well-known results, we can show that $\frac{\lor (1/\ln(X))}{b_n}$ then converges to a Fréchet distribution with exponent $\alpha$, and by the close connection between maxima and minima we know that

$$\mathbb P\left(\frac{\land \ln(X)}{b_n}\leq x\right)$$

or also

$$\mathbb P\left(\frac{\land X-1}{b_n}\leq x\right)$$

converges to $(1-\phi_{\alpha}(1/x))$, where $\phi_\alpha$ denotes the Fréchet cdf; and since $XY-1$ is regularly varying at zero with exponent $2\alpha$, you get the analogous convergence with $e^{-t^2}$ for the tail.
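Just to double-check the exponents (a small side calculation, assuming $X, Y \sim \mathrm{Uniform}[1,2]$ as in your answer): for small $\epsilon>0$,

$$\mathbb P(X-1\leq \epsilon)=\epsilon, \qquad \mathbb P(XY-1\leq \epsilon)=(1+\epsilon)\ln(1+\epsilon)-\epsilon \sim \frac{\epsilon^2}{2},$$

so $X-1$ is regularly varying at zero with exponent $\alpha=1$ and $XY-1$ with exponent $2\alpha=2$, which matches the $e^{-s}$ and $e^{-t^2}$ limits.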

Which brings me back to your idea; I suspect that it might be possible to prove it more generally for all regularly varying random variables, but I need to think about it tomorrow...

But it is definitely a very interesting and smart answer, thank you so much for it... :-)


Okay, I will give it a try, although I did not manage to use the regular variation and I have no idea how I could use it. Also, I am not sure whether this is a proper proof. The only thing I am going to use is that $\mathbb P(Y \in [1,a])\to 0$ and $\mathbb P(X \in [1,a])\to 0$ as $a \downarrow 1$. (Does this follow from continuity, or can we construct some weird distribution that is continuous but still does not satisfy this?)

Anyway: we generally assume $X$ and $Y$ to be independent but not necessarily identically distributed. Suppose that

$$\lim\limits_{n \to \infty} \mathbb P \left(\frac{\land X-1}{b_n}> x_1 \right) = \exp(-x_1^{\alpha_1})$$

and

$$\lim\limits_{n \to \infty} \mathbb P \left(\frac{\land XY-1}{\bar b_n}> x_2 \right) = \exp(-x_2^{\alpha_2})$$

These are our assumptions. Then it follows that

$$\lim\limits_{n \to \infty} \mathbb P \left(\frac{\land X-1}{b_n}> x_1 \right) =\lim\limits_{n \to \infty} \mathbb P \left(\frac{ X-1}{b_n}> x_1 \right)^n= \exp(-x_1^{\alpha_1})$$

and therefore, as $n \to \infty$,

$$\mathbb P \left(\frac{ X-1}{b_n}> x_1 \right)=1-\frac{x_1^{\alpha_1}+o(1)}{n},$$

respectively

$$\mathbb P \left(\frac{ X-1}{b_n}\leq x_1 \right)=\frac{x_1^{\alpha_1}+o(1)}{n},$$

and analogously

$$\mathbb P \left(\frac{ XY-1}{\bar b_n}> x_2 \right)=1-\frac{x_2^{\alpha_2}+o(1)}{n},$$

respectively

$$\mathbb P \left(\frac{ XY-1}{\bar b_n}\leq x_2 \right)=\frac{x_2^{\alpha_2}+o(1)}{n}.$$

Now we have:

$$\lim\limits_{n \to \infty}\mathbb P \left( \frac{\land X-1}{ b_n}> x_1 \right) \mathbb P \left(\frac{\land XY-1}{\bar b_n}> x_2 \right)=\exp(-x_1^{\alpha_1})\exp(-x_2^{\alpha_2})=\exp(-x_1^{\alpha_1}-x_2^{\alpha_2})$$

and, for a single observation, by inclusion-exclusion and the asymptotics above,

$$\mathbb P \left(\frac{ X-1}{ b_n}> x_1,\frac{ XY-1}{\bar b_n}> x_2 \right)\\ = 1- \mathbb P \left(\frac{ X-1}{ b_n}\leq x_1\right)-\mathbb P \left(\frac{ XY-1}{\bar b_n}\leq x_2\right)+\mathbb P \left(\frac{ X-1}{ b_n}\leq x_1,\frac{ XY-1}{\bar b_n}\leq x_2\right)\\ =1-\frac{x_1^{\alpha_1}+o(1)}{n}-\frac{x_2^{\alpha_2}+o(1)}{n} +\mathbb P \left(\frac{ X-1}{ b_n}\leq x_1,\frac{ XY-1}{\bar b_n}\leq x_2\right)$$

Moreover, we have

$$\mathbb P \left(\frac{ X-1}{ b_n}\leq x_1,\frac{ XY-1}{\bar b_n}\leq x_2\right) = \mathbb P \left(\frac{ X-1}{ b_n}\leq x_1\right) \mathbb P \left(\frac{ XY-1}{\bar b_n}\leq x_2 \bigg \vert \frac{ X-1}{ b_n}\leq x_1\right) \\ \leq \frac{x_1^{\alpha_1}+o(1)}{n} \mathbb P \left(\frac{Y-1}{\bar b_n}\leq x_2 \bigg \vert \frac{ X-1}{ b_n}\leq x_1\right)=\frac{x_1^{\alpha_1}+o(1)}{n} \mathbb P \left(\frac{Y-1}{\bar b_n}\leq x_2\right)=\frac{x_1^{\alpha_1}+o(1)}{n}\, o(1), $$

since $\bar b_{n} \to 0$ as $n \to \infty$ and $\mathbb P(Y \in [1,a]) \to 0$ as $a \downarrow 1$; the conditioning can be dropped because $X$ and $Y$ are independent. Therefore, using that the pairs $(X_i,Y_i)$ are iid (so the joint survival probability of the two minima is the $n$-th power of the single-observation joint survival probability), we can see that

$$\lim\limits_{n \to \infty}\mathbb P \left(\frac{\land X-1}{ b_n}>x_1,\frac{\land XY-1}{\bar b_n}> x_2 \right)\\ =\lim\limits_{n \to \infty}\left(1-\frac{x_1^{\alpha_1}+o(1)}{n}-\frac{x_2^{\alpha_2}+o(1)}{n} \right)^n = \exp\left(-x_1^{\alpha_1}-x_2^{\alpha_2}\right).$$
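Not a proof, of course, but here is a quick numerical sanity check of this limit in the uniform case (assuming, as in the answer, $X,Y\sim\mathrm{Uniform}[1,2]$ with $b_n=1/n$, $\alpha_1=1$, $\bar b_n=\sqrt{2/n}$, $\alpha_2=2$):

```python
import numpy as np

rng = np.random.default_rng(2)

# Joint survival of the rescaled minima vs. the claimed limit exp(-x1 - x2^2)
# for X, Y ~ Uniform[1, 2], b_n = 1/n (alpha_1 = 1), bar_b_n = sqrt(2/n) (alpha_2 = 2).
n, reps = 1000, 10000
x1, x2 = 1.0, 0.8
X = rng.uniform(1.0, 2.0, size=(reps, n))
Y = rng.uniform(1.0, 2.0, size=(reps, n))
T1 = (X.min(axis=1) - 1) * n                       # (min X_k - 1) / b_n
T2 = ((X * Y).min(axis=1) - 1) / np.sqrt(2.0 / n)  # (min X_k Y_k - 1) / bar_b_n
print("empirical:", np.mean((T1 > x1) & (T2 > x2)))
print("limit    :", np.exp(-x1 - x2**2))
```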

Now I am curious: is the argument above a valid proof?

Mark
  • 65
  • 1
    I have never heard of Extreme Value theory, so my apologies if this question is stupid :) If your conjecture is true, then $P(\min(X_i) - a_n > x_1 b_n, \min(X_i Y_i) - \bar{a}_n \le x_2 \bar{b}_n) \sim (1- F(x_1)) G(x_2)$, right? But we can find $x_1, x_2$ so the RHS $> 0$, and yet intuitively it seems plausible that if $\min(X_i)$ is too big then $\min(X_i Y_i)$ cannot be too small, as in the prob should be $0$. Hmm, perhaps... does this impose stringent conditions on the $a_n, b_n, \bar{a}_n, \bar{b}_n$? – antkam Oct 22 '19 at 05:42
  • 1
    I added something for you, so you can understand it better :-) – Mark Oct 22 '19 at 10:04
  • 2
    thank you for the lesson! :D yeah, i suspected my idea "perhaps" "imposes" some "stringent" conditions on $a_n, b_n, \bar{a}_n, \bar{b}_n$, and you've shown that these conditions aren't very stringent at all. In my mind I had imagined $a_n, \bar{a}_n$ both $\to 1$, so I thought it might still be possible to find a "counter-example" pair $x_1, x_2$ in range, but the RHS of your inequality clearly can $\to \infty$ based on $b_n, \bar{b}_n$ alone. – antkam Oct 22 '19 at 12:48
  • Yeah, indeed it is not that easy... :-( But for all other people reading this: I would be very happy about comments on 1. Ideas on how to prove or disprove it (if it is too much work for you, just tell me an idea and I can try to make something out of it) 2. What do you think about my proof sketch? Isn't it kinda true because of that, or is there a major flaw in it? 3. Would you generally think it is true or false? – Mark Oct 22 '19 at 13:35

1 Answer

5

Here is a partial answer for the case where both $X_k$ and $Y_k$ are uniformly distributed on $[1, 2]$. I guess it generalizes to a broader class of distributions without much hassle.


We begin by defining, for each $s, t \geq 0$,

\begin{align*} \newcommand{\Area}{\operatorname{Area}} \mathcal{A}_n(s) &= \Bigl\{ (x, y) \in [1, 2]^2 : x < 1 + \frac{s}{n} \Bigr\}, \\ \mathcal{B}_n(t) &= \Bigl\{ (x, y) \in [1, 2]^2 : xy < 1 + \sqrt{\frac{2}{n}} \, t \Bigr\}. \end{align*}

Then it follows that $\Area(\mathcal{A}_n(s)) \sim \frac{s}{n}$ and $\Area(\mathcal{B}_n(t)) \sim \frac{t^2}{n}$, and so we get

\begin{align*} \mathbb{P} \biggl( \frac{\min\{X_1,\cdots,X_n\}-1}{1/n} \geq s \biggr) &= \mathbb{P}\bigl((X_k, Y_k) \notin \mathcal{A}_n(s) \text{ for all } k = 1, \cdots, n\bigr) \\ &= \biggl( 1 - \Area(\mathcal{A}_n(s)) \biggr)^n \xrightarrow[n\to\infty]{} e^{-s} = 1 - F(s) \end{align*}

and similarly

\begin{align*} \mathbb{P} \biggl( \frac{\min\{X_1 Y_1,\cdots,X_n Y_n\}-1}{\sqrt{2/n}} \geq t \biggr) &= \mathbb{P}\bigl((X_k, Y_k) \notin \mathcal{B}_n(t) \text{ for all } k = 1, \cdots, n\bigr) \\ &= \biggl( 1 - \Area(\mathcal{B}_n(t)) \biggr)^n \xrightarrow[n\to\infty]{} e^{-t^2} = 1 - G(t). \end{align*}

Finally, it follows that the probability of the joint event is

\begin{align*} &\mathbb{P} \biggl( \biggl\{ \frac{\min\{X_1,\cdots,X_n\}-1}{1/n} \geq s \biggr\} \cap \biggl\{ \frac{\min\{X_1 Y_1,\cdots,X_n Y_n\}-1}{\sqrt{2/n}} \geq t \biggr\} \biggr) \\ &= \mathbb{P}\bigl((X_k, Y_k) \notin \mathcal{A}_n(s) \cup \mathcal{B}_n(t) \text{ for all } k = 1, \cdots, n\bigr) \\ &= \biggl( 1 - \Area(\mathcal{A}_n(s)) - \Area(\mathcal{B}_n(t)) + \Area(\mathcal{A}_n(s) \cap \mathcal{B}_n(t)) \biggr)^n. \end{align*}

But it is easy to check that $\Area(\mathcal{A}_n(s) \cap \mathcal{B}_n(t)) = \mathcal{O}(n^{-3/2})$: on this intersection we have $x < 1 + \frac{s}{n}$ and, since $x \geq 1$, also $y \leq xy < 1 + \sqrt{\frac{2}{n}}\,t$, so the area is at most $\frac{s}{n} \cdot \sqrt{\frac{2}{n}}\,t$. Hence the above converges to

\begin{align*} = \biggl( 1 - \frac{s+o(1)}{n} - \frac{t^2+o(1)}{n} + \mathcal{O}(n^{-3/2}) \biggr)^n \xrightarrow[n\to\infty]{} e^{-s-t^2} = (1 - F(s))(1 - G(t)). \end{align*}

Therefore the limiting joint distribution factors into the product of the marginal limits, i.e., the two rescaled minima are asymptotically independent.
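For readers who like to see numbers, here is a minimal simulation sketch of the two marginal limits derived above (only an illustration of the computation, not part of the argument):

```python
import numpy as np

rng = np.random.default_rng(3)

n, reps = 1000, 10000
X = rng.uniform(1.0, 2.0, size=(reps, n))
Y = rng.uniform(1.0, 2.0, size=(reps, n))
T1 = (X.min(axis=1) - 1) * n                       # (min X_k - 1) / (1/n)
T2 = ((X * Y).min(axis=1) - 1) / np.sqrt(2.0 / n)  # (min X_k Y_k - 1) / sqrt(2/n)

for s in (0.5, 1.0, 2.0):
    print(f"P(T1 >= {s}) ~ {np.mean(T1 >= s):.3f}   exp(-s)   = {np.exp(-s):.3f}")
for t in (0.5, 1.0, 1.5):
    print(f"P(T2 >= {t}) ~ {np.mean(T2 >= t):.3f}   exp(-t^2) = {np.exp(-t * t):.3f}")
```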

Sangchul Lee
  • 181,930
  • 1
    +1 Under the assumption that the $X_i$ and $Y_i$ have the same distribution (unclear in the OP), it could be possible to relate the constants $\bar{a}_n$ and $\bar{b}_n$ to $a_n$ and $b_n$; then maybe the proof could be generalised(?) With $a_n = 1$ and $b_n = 1/n$ you found $\bar{a}_n = 1$ and $\bar{b}_n = \sqrt{2/n}$. However I am not sure that the result holds if the distributions of $X$ and $Y$ are not identical: if the distribution of the $Y_i$ is close to a Dirac at $y =1$, the r.v.s $\min{X_i}$ and $\min{X_i Y_i}$ are very dependent. – Yves Oct 23 '19 at 09:38
  • 1
    Sangchul Lee: Thank you so much for your answer - it is a very sophisticated way of thinking; I edited the original question with some ideas on how this could be extended - feel free to read it; @Yves: Can you go into a bit more detail about what exactly you mean? – Mark Oct 23 '19 at 22:53
  • @Yves, As long as the distribution remains continuous, I think that the 'width' only determines the speed at which the two minima decouple. I am more concerned about the regularity of the distribution (in terms of CDF, for instance) around its minimum, and OP suggested a nice idea that we may restrict our attention to regularly varying cases. – Sangchul Lee Oct 23 '19 at 22:58
  • @Mark, Glad it helped! It is indeed a fun problem to think about. – Sangchul Lee Oct 23 '19 at 23:03
  • 1
    Yes, focusing on RV can help. To fit the Extreme Value literature better, consider $X^\star:=1/X$ and $Y^\star :=1/Y$; we get r.v.s with upper end-point $\omega =1$ and we are interested in the max of $X_i^\star$ and of $X_i^\star Y_i^\star$. The r.v.s are then either in the Weibull (type III) or in the Gumbel (type I) Domain of Attraction. Ex. of type I: $F_{X^\star}(x^\star) = 1 - \exp{- x^\star/(1 - x^\star)}$. If $X^\star$ is type I and $Y^\star$ is type III, large values of $X^\star_i Y^\star_i$ are mainly "due to $X_i^\star$", making the result counter-intuitive. – Yves Oct 24 '19 at 06:50
  • @Yves Are you sure that if you take $1/X$ you are in the max domain of attraction of Gumbel or Weibull? $1/X$ would be regularly varying at upper point $x=1$ with exponent $-\alpha<0$, and if a random variable is regularly varying at infinity with exponent $-\alpha<0$, then you are in the max domain of attraction of the Fréchet distribution; that's why it surprises me... However, it would anyway be unusual to transform to $1/X$ rather than to $1/(X-1)$, because then you have regular variation at infinity, which is the standard case and you can work with it much better... – Mark Oct 25 '19 at 22:12
  • @SangchulLee I updated my question with an idea of how you can possibly prove it: Can you please please please check it and tell me if it is valid? I promise it will not take more than 3 minutes... :-) – Mark Oct 25 '19 at 22:14
  • 1
    @Mark, If I did not miss anything, your proof looks valid. Good job! – Sangchul Lee Oct 26 '19 at 08:01
  • @Mark for my example the Von Mises conditions hold: we are in the Gumbel DA. A similar example is obtained by taking the reversed Fréchet with $\omega = 0$. Anyway, with $\omega<\infty$, we cannot be in the Fréchet DA. I think it is better to stick to the inverse because we keep a product, or even a sum by taking logs. There are results on the tail of a sum of independent r.v.s compared to that of its components. Also it could be simpler to consider tail independence (of $X$ and $X+Y$) rather than (3). – Yves Oct 26 '19 at 08:53
  • Okay: the Fréchet and Weibull distributions are closely connected: $\mathbb P \left(\frac{\land X-1}{b_n}\leq x\right)$ converges to $(1-\phi_{\alpha}(1/x))=(1-\gamma_{\alpha}(-x))$, with $\phi$ and $\gamma$ the cdfs of the Fréchet and the Weibull distribution. However, I cannot see how to end up with the Gumbel distribution: if you are really interested, how about you show me how you would derive sequences $a_n$ and $b_n$ such that $\frac{\land X-a_n}{b_n}$ converges to a Gumbel distribution? I'd be very interested... But maybe put the answer under my question so we will not bother Sangchul – Mark Oct 26 '19 at 12:11
  • Consider $Z$ being reversed-(standard) Fréchet, i.e. $F_Z(z) = 1 - \exp(1/z)$ for $z < 0$, with $\omega = 0$. Showing that $\max{Z_i} / b_n$ tends to the Gumbel distribution can be done by using the Von Mises conditions: a possible $b_n$ relates to the tail-quantile function $U(t)$ and the hazard rate $h(z)$. See my answer on CV. Also try a Google search with "reversed Fréchet". – Yves Oct 26 '19 at 15:07
  • I don't get what you are doing; if you have an r.v. $Z$ and you know that $Z$ is in the max-domain of attraction (MaxDA) of some extreme value distribution, then it does not help you to say something like $Z$ being in the MinDA of the same extreme value distribution; and of course this makes sense: $Z$ might have a very heavy tail on one side but a very weak tail on the other side; however, if $Z$ is in the MaxDA of some distribution, then we can say something about the MinDA of $1/Z$ ... – Mark Oct 29 '19 at 14:09
  • @Mark I suggested that it was better to consider a max than a min because there are many more references on this side. Of course, one side tells nothing about the other. Yet, to stick to a min: if you have a finite min then the survival does not necessarily behave as you assumed, which makes it unclear what your assumptions really are. With min $=0$ instead of $=1$, you can consider the standard Fréchet, with all derivatives of the density vanishing at the min. The scaled sample min stays far away from the lower end-point, and the claimed independence is then doubtful. – Yves Nov 05 '19 at 20:00