
UPDATE. The case of geometric distributions seems to be solved: I think we have that for all $\lambda \in (0,1)$, $\mathbb{P}_{p_\lambda}(D(\mathbb{N})) = 0$. But the fundamental question that remains open is: does there exist a probability mass function $p$ of full support, such that $\mathbb{P}_p(D(\mathbb{N})) \neq 0$?


[In this post, $\mathbb{N}$ excludes $0$.]

Given a probability mass function $p$ on $\mathbb{N}$ of full support [i.e. $p(a)>0$ for all $a \in \mathbb{N}$], define the probability measure $\mathbb{P}_p$ on $\mathrm{Sym}(\mathbb{N})$ [the set of permutations of $\mathbb{N}$] by $$ \mathbb{P}_p\big( \pi \, : \, \pi(i) = a_i \ \ \forall \, i \in \{1,\ldots,n\} \big) \ = \begin{cases} \prod_{i=1}^n \frac{p(a_i)}{1 - \sum_{j=1}^{i-1}p(a_j)} & \text{if } a_1,\ldots,a_n \text{ are distinct} \\ 0 & \text{otherwise} \end{cases} $$ for all $n \in \mathbb{N}$.

[To justify that this is well-defined: the Kolmogorov extension theorem gives that this determines a unique probability measure on the set of all functions from $\mathbb{N}$ to $\mathbb{N}$, and it is clear that under this probability measure, almost every function is injective; we also have that almost every function is surjective, since for each $a \in \mathbb{N}$, the probability that $a \not\in \pi(\{1,\ldots,n\})$ is at most $(1-p(a))^n$, which tends to $0$ as $n \to \infty$.]
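[For concreteness, here is a small Python sketch of this sequential construction (purely illustrative; `draw_conditional` and `sample_prefix` are names I made up): each new value is drawn from $p$ renormalised over the values not yet used, which is exactly the product formula above.]

```python
import random

def draw_conditional(p, used, rng):
    """Draw a value a not in `used`, with probability p(a) divided by the total
    mass of the unused values, by scanning the support in increasing order and
    accepting each unused candidate with its conditional hazard (these
    acceptance probabilities telescope to exactly the renormalised p)."""
    remaining = 1.0 - sum(p(s) for s in used)
    a = 0
    while True:
        a += 1
        if a in used:
            continue
        # numerical guard: if essentially all remaining mass sits at a, take it
        if remaining <= p(a) or rng.random() < p(a) / remaining:
            return a
        remaining -= p(a)

def sample_prefix(p, n, seed=0):
    """Sample (pi(1), ..., pi(n)) under P_p."""
    rng = random.Random(seed)
    used, prefix = set(), []
    for _ in range(n):
        a = draw_conditional(p, used, rng)
        used.add(a)
        prefix.append(a)
    return prefix

# example with the geometric p_lambda(a) = (1 - lam) * lam**(a - 1):
lam = 0.5
p_lam = lambda a: (1.0 - lam) * lam ** (a - 1)
print(sample_prefix(p_lam, 10))
```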

Now let $D(\mathbb{N}) \subset \mathrm{Sym}(\mathbb{N})$ be the set of derangements of $\mathbb{N}$, i.e. the set of permutations with no fixed point.

In general, I'm interested to know what we can say about $\mathbb{P}_p(D(\mathbb{N}))$. I'm guessing there exist probability mass functions $p$ for which $\mathbb{P}_p(D(\mathbb{N})) \neq 0$? In particular, consider the geometric distributions $$ p_\lambda(a) = (1-\lambda)\lambda^{a-1} $$ for $\lambda \in (0,1)$. Do we have that $\mathbb{P}_{p_\lambda}(D(\mathbb{N})) = 0$? If not, is there a nice formula for $\mathbb{P}_{p_\lambda}(D(\mathbb{N}))$, and what does it tend towards as $\lambda$ tends to the extreme values $0$ and $1$?
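[As a crude numerical sanity check (again just an illustrative sketch, reusing the `draw_conditional` helper from the snippet above): the event $D(\mathbb{N})$ is contained in the event that $\pi$ has no fixed point among the first $N$ positions, so a Monte Carlo estimate of the latter upper-bounds $\mathbb{P}_{p_\lambda}(D(\mathbb{N}))$ up to sampling error. If $\mathbb{P}_{p_\lambda}(D(\mathbb{N}))=0$, the estimate should drift towards $0$ as $N$ grows; for large $N$ the floating-point renormalisation loses precision, so this is only a rough check.]

```python
import random

def no_fixed_point_in_prefix(p, N, rng):
    """Sample pi(1), ..., pi(N) under P_p and check pi(i) != i for all i <= N."""
    used = set()
    for i in range(1, N + 1):
        a = draw_conditional(p, used, rng)   # helper from the snippet above
        if a == i:
            return False
        used.add(a)
    return True

def upper_bound_estimate(lam, N=30, trials=2000, seed=0):
    """Monte Carlo estimate of P(no fixed point among the first N positions),
    an upper bound (up to sampling error) on the derangement probability."""
    rng = random.Random(seed)
    p = lambda a: (1.0 - lam) * lam ** (a - 1)
    return sum(no_fixed_point_in_prefix(p, N, rng) for _ in range(trials)) / trials
```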

  • You don't really need to define the probability of an empty set as $0.$ – Thomas Andrews May 18 '25 at 01:39
  • @ThomasAndrews Haha, good point - I did it to try to aid the clarity of what's going on in my construction. – Julian Newman May 18 '25 at 09:27
  • The order in which you choose terms doesn't matter, so keep drawing until you get $1$ (this will happen in finite time with probability $1$) and set those terms in a cycle. For example, you might draw $9,2,5,1$. Then set $a_1=9$, $a_9=2$, $a_2=5$, $a_5=1$. Once you get to this point, you have used up $1,2,5,9$, and then you are back to needing a derangement from $\mathbb{N}\setminus\{1,2,5,9\}$. You fail whenever, at the beginning of a cycle, you draw that same number. In the previous example, this would be if you draw $a_3=3$ next. I think the probability of getting a derangement is then always $0$. – Varun Vejalla May 23 '25 at 18:21
  • Unfortunately I don't have time to write a proper answer, but yes, you can actually arrange for any specific permutation to have positive probability. Whenever $(\sum_{j>i}p_j)/p_i$ is summable over $i$, the identity will have positive probability. For example $p(i)=2^{-2^i}$. Permuting the probabilities, you can get positive probability for any permutation you like (for example, a derangement). – Nicholas Burbank May 27 '25 at 10:06
  • By the way, I posted a follow-up question on MathOverflow: https://mathoverflow.net/q/495255/15570 – Julian Newman May 27 '25 at 13:27

1 Answer


The order in which we fill in the permutation doesn't matter. That is, $$\mathbb{P}_p\big( \pi \, : \, \pi(b_i) = a_i \ \ \forall \, i \in \{1,\ldots,n\} \big) \ = \prod_{i=1}^n \frac{p(a_i)}{1 - \sum_{j=1}^{i-1}p(a_j)}$$

when the $a_i$ are distinct and the $b_i$ are distinct (though the $a_i$ and $b_i$ may overlap with each other). The intuition behind this probability is that each value is drawn according to $p$, and once a value has been drawn, its mass is removed from the denominator for the subsequent terms of the sequence.

What we will do is choose which terms to fill in according to the terms we have already filled in. We start off with $b_1=1$. If $a_1:=\pi(b_1)=1$, then it's already not a derangement. Otherwise, we set $b_2=a_1$ so that we fill in that term next. In general, we set $b_{n+1}=a_n$ until we draw $b_1$ (which happens at some finite point with probability $1$), at which point this cycle is done. After this point, we are left needing a derangement of all the numbers we haven't drawn yet.

As an example, let's say the sequence goes $9, 2, 5, 1$. Then $a_1=9$, $a_9=2$, $a_{2}=5$, and $a_5=1$. We have used up $\{1,2,5,9\}$ both as values and as indices, so we then need a derangement of $\mathbb{N}\setminus\{1,2,5,9\}$. We would draw $a_3$ next, and the new cycle would end when we draw $3$. We fail when, at the beginning of a new cycle, we draw the number equal to the starting index (in the previous example, if we draw $a_3=3$, it won't be a derangement).
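To illustrate, here is a rough Monte Carlo sketch of this cycle-by-cycle procedure (my own code, with made-up names; it simulates the construction exactly as described). Granting the order-invariance claim at the top, the quantity it estimates, namely the probability that the first $K$ cycles avoid a fixed point, is an upper bound on the derangement probability, since only finitely many cycles can be checked.

```python
import random

def draw_conditional(p, drawn, rng):
    """Draw a value a not in `drawn`, with probability p(a) divided by the
    total mass of the undrawn values, scanning the support in increasing order."""
    remaining = 1.0 - sum(p(s) for s in drawn)
    a = 0
    while True:
        a += 1
        if a in drawn:
            continue
        # numerical guard: if essentially all remaining mass sits at a, take it
        if remaining <= p(a) or rng.random() < p(a) / remaining:
            return a
        remaining -= p(a)

def survives_first_cycles(p, K, rng):
    """Run the cycle-by-cycle construction until K cycles have closed.
    Return False as soon as a cycle's first draw equals its starting index
    (a fixed point); return True if the first K cycles all avoid that."""
    drawn = set()          # values drawn so far (values cannot repeat)
    idx = 1
    for _ in range(K):
        while idx in drawn:
            idx += 1       # the smallest unused index starts the next cycle
        start = idx
        first = True
        while True:
            a = draw_conditional(p, drawn, rng)
            if first and a == start:
                return False       # pi(start) = start
            first = False
            drawn.add(a)
            if a == start:
                break              # cycle closed; move on to the next one
    return True

def estimate(lam=0.5, K=10, trials=2000, seed=0):
    rng = random.Random(seed)
    p = lambda a: (1.0 - lam) * lam ** (a - 1)
    return sum(survives_first_cycles(p, K, rng) for _ in range(trials)) / trials
```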

Suppose some distribution $p$ satisfies $\mathbb{P}_p(D(\mathbb{N}))=q\neq 0$. Consider the finite subsets $T\subset\mathbb{N}$, and let $r=\sup_T \mathbb{P}_p(D(\mathbb{N}\setminus T))\ge q$. We then need that for every $\epsilon>0$ there exists a finite subset $T\subset\mathbb{N}$ such that $$r(1-\epsilon)<\mathbb{P}_p(D(\mathbb{N}\setminus T))\le \left(1-\frac{p(n)}{1-p(T)}\right)r$$

holds for all $n\in\mathbb{N}\setminus T$, where $p(T)=\sum_{a\in T}p(a)$. Note that it needs to hold for all $n$, not just one, because we can choose any remaining element to start the next cycle from.

If we rearrange this inequality a bit (dividing through by $r>0$), we see that $$\frac{p(n)}{1-p(T)}<\epsilon$$ for all $n\in\mathbb{N}\setminus T$.

We construct $T$ greedily, starting from the empty set. If $p(n)<\epsilon(1-p(T))$ holds for all $n\notin T$, we are done. Otherwise, we add the values whose probabilities exceed that threshold to $T$ and repeat with the updated $T$. Let's sort $\mathbb{N}$ in decreasing order of $p(n)$, and WLOG assume this ordering is just $1,2,3,\ldots$, so that $p(1)\ge p(2)\ge\cdots$.

We then need that for every $\epsilon>0$, there is some $n$ such that $$\frac{p(n+1)}{1-p([n])}<\epsilon,$$ where $[n]=\{1,\ldots,n\}$.

This condition fails for the geometric distributions: there $1-p_\lambda([n])=\lambda^n$ and $p_\lambda(n+1)=(1-\lambda)\lambda^n$, so the ratio is identically $1-\lambda$ and never falls below any $\epsilon<1-\lambda$. It does hold for distributions of the form $p(n)\propto n^{-c}$ with $c>1$, where $1-p([n])$ is of order $n^{1-c}$ while $p(n+1)$ is of order $n^{-c}$, so the ratio decays like $(c-1)/n$. That doesn't necessarily mean the probability for those distributions is nonzero, though; it just means this argument isn't enough to rule it out.
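As a quick numerical illustration of this last point (a throwaway sketch; for the power law, the normalising constant cancels and the tail sum is truncated and completed with an integral approximation):

```python
def geometric_ratio(lam, n):
    """p_lambda(n+1) / (1 - p_lambda([n])); this simplifies exactly to 1 - lam."""
    return (1.0 - lam) * lam ** n / lam ** n

def power_law_ratio(c, n, cutoff=10**5):
    """p(n+1) / (1 - p([n])) for p(k) proportional to k**(-c).  The tail sum is
    truncated after `cutoff` terms and completed with the integral
    approximation (n + cutoff)**(1 - c) / (c - 1)."""
    tail = sum(k ** (-c) for k in range(n + 1, n + cutoff))
    tail += (n + cutoff) ** (1 - c) / (c - 1)
    return (n + 1) ** (-c) / tail

for n in (10, 100, 1000):
    # the geometric ratio stays at 1 - lam; the power-law ratio decays like (c-1)/n
    print(n, geometric_ratio(0.5, n), power_law_ratio(2.0, n))
```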

  • Thank you. Unless I've misunderstood you, there is no "maximum probability $q$ of drawing a good cycle" - the supremum is an unattained supremum of $1$. But if you only had in mind the exponential case, then I suspect your solution is valid: the supremum over the set of conditional distributions after starting at $p_\lambda$ is $\lambda$. (Obviously I haven't phrased this rigorously, but I think I should be able to work out how to do so.) – Julian Newman May 23 '25 at 20:10
  • @JulianNewman That's a good point. I've edited the second half of my answer substantially and came up with a necessary condition on $p$ for the probability to be nonzero. – Varun Vejalla May 24 '25 at 06:38
  • I don't think your first equation is correct. Taking $n=1$, we get $\mathbb P( \pi(b)=a)= p(a)$ for all $a,b$. But this is impossible, as for fixed $a$ and varying $b$ these events are disjoint, giving infinitely many disjoint events with identical nonzero probability. – Will Sawin May 27 '25 at 16:07
  • @WillSawin Yes, I started thinking about this more carefully after I started reading your reply to my question, and I realise that the first equation can't be true! However, I have a feeling that the disjoint-cycles approach will still give a simple argument that $\mathbb{P}_{p_\lambda}(D(\mathbb{N}))=0$. – Julian Newman May 27 '25 at 17:40
  • @WillSawin I should've caught that earlier! I'll try to salvage it later. Hopefully, I can fix that part while keeping the rest. – Varun Vejalla May 28 '25 at 05:46