
Suppose there is a sentence consisting only of the three characters $X,Y,Z$ and nothing more. It is given that $X$ occurs $a$ times, $Y$ occurs $b$ times and $Z$ occurs $c$ times in the sentence. What is the probability that an $X$ will be followed by a $Y$ at least $2$ times?

This problem looks a bit complicated to me, so I decided to break it into parts. "At least $2$ times" means all cases $-$ (exactly $0$ times $+$ exactly $1$ time).

The total number of cases is simply $$\binom{a+b+c}{a}\binom{b+c}{b}\binom{c}{c}$$ For exactly $0$ times, I can think of some logic, but it is hard to explain. I think that we should first select a $Z$ and fix it. Then we arrange the $b$ $Y$'s and $(c-1)$ $Z$'s on one side of the fixed $Z$ and put all the $X$'s on the other side. Another option is to put all the $X$'s between two $Z$'s and then do the rest of the arrangement. Yet another is to place all the $Y$'s and then arrange the rest such that no $X$ goes behind any $Y$. So you see, I have not been able to think of all such cases.

For exactly $1$ time I have similarly incomplete cases. How can I find all possible cases in each subproblem?

Any help is greatly appreciated.

  • Even the simpler problem of counting those words with no $XY$ pairs seems tricky. There's a recursion, as such a word must begin with $Y,Z$ or with $X^kZ$ for some $k$. And at least that allows you to compute a lot of cases. Maybe there's a closed form solution? At least, I'd start there. – lulu Mar 01 '23 at 12:45
  • @lulu i have an idea...we should solve the problem for general $k$ what is the probability that a $X$ will be followed by a $Y$ $k$ times and then we should put $0$ and $1$...but that's beyond me and idk if this idea works – Blue Cat Blues Mar 01 '23 at 13:08
  • Is there any indication of inequalities, eg $a<b, b>c$ or whatever ? – true blue anil Mar 01 '23 at 13:09
  • @trueblueanil nothing – Blue Cat Blues Mar 01 '23 at 13:10
  • You could get an approximation by working out the mean number of $XY$ pairs (easy) and their variance (slightly harder but not too bad). But you are dealing with a tail here...the mean, after all, is $a\times \frac b{a+b+c}$, which is generally not near $0$, so it's a tough approximation to trust. – lulu Mar 01 '23 at 13:12

2 Answers


As this answer has now been accepted, I should mention that it’s unnecessarily complicated and the stars-and-bars argument provided by Daniel Mathias in a comment below is much more elegant.


You can proceed as in this nice answer by A.J. to the question "What is the probability of 2 named cards appearing sequentially in a randomly shuffled deck if suits are ignored?"

First, to find the probability that no $X$ is followed by a $Y$, arrange the $X$s and $Z$s in some way and then insert the $Y$s. For the first insertion, there are $a+c+1$ equiprobable slots where the $Y$ can go, and $a$ of them are behind an $X$. If we ever put a $Y$ behind an $X$, it’s over (since we can’t prevent that $X$ from being followed by a $Y$ by inserting further $Y$s behind it). The probability to survive the first insertion is $\frac{c+1}{a+c+1}$, to survive the second insertion $\frac{c+2}{a+c+2}$, and so on. Thus, the probability that no $X$ is followed by a $Y$ is

$$ \frac{c+1}{a+c+1}\cdot\frac{c+2}{a+c+2}\cdots\frac{c+b}{a+c+b}=\frac{(c+b)!(a+c)!}{c!(a+c+b)!}\;. $$
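The identity between the product and the factorial form can be checked with exact rational arithmetic. The following is a minimal sketch, not part of the original answer; the function names `p_no_xy_product` and `p_no_xy_closed` are made up for illustration.

```python
from fractions import Fraction
from math import factorial


def p_no_xy_product(a, b, c):
    """Multiply the survival probabilities (c+k)/(a+c+k) for k = 1..b."""
    p = Fraction(1)
    for k in range(1, b + 1):
        p *= Fraction(c + k, a + c + k)
    return p


def p_no_xy_closed(a, b, c):
    """Closed form (c+b)! (a+c)! / (c! (a+c+b)!)."""
    return Fraction(
        factorial(c + b) * factorial(a + c),
        factorial(c) * factorial(a + c + b),
    )


if __name__ == "__main__":
    # Example with the values asked about in the comments: a, b, c = 6, 7, 8.
    print(p_no_xy_product(6, 7, 8) == p_no_xy_closed(6, 7, 8))
```

Using `Fraction` avoids any floating-point doubt when comparing the two expressions.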

For the probability that exactly one $X$ is followed by a $Y$, we need to replace one of the factors by its complement, and the numerator of all factors after that is increased by $1$ because we can now put $Y$s behind the $X$ that we put a $Y$ behind. Thus, we get an additional factor $c+b+1$ in the numerator, and if the $k$-th $Y$ is inserted after an $X$ the factors $c+k$ and $c+k+1$ are replaced by $a$. Thus the probability that exactly one $X$ is followed by a $Y$ is

$$ \frac{(c+b+1)!(a+c)!}{c!(a+c+b)!}\cdot a\left(\frac1{(c+1)(c+2)}+\frac1{(c+2)(c+3)}+\cdots+\frac1{(c+b)(c+b+1)}\right)\;. $$

The sum in parentheses telescopes because

$$\frac1{(c+k)(c+k+1)}=\frac1{c+k}-\frac1{c+k+1}\;,$$

so the sum is $\frac1{c+1}-\frac1{c+b+1}=\frac b{(c+1)(c+b+1)}$, for a probability

$$ ab\cdot\frac{(c+b)!(a+c)!}{(c+1)!(a+c+b)!}\;. $$
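To double-check the telescoping step, one can sum the insertion probabilities term by term and compare with the closed form. This is only a sketch under the insertion model described above; `p_exactly_one_sum` and `p_exactly_one_closed` are illustrative names, not anything from the original answer.

```python
from fractions import Fraction
from math import factorial


def p_exactly_one_sum(a, b, c):
    """Sum over j, where the j-th inserted Y is the one placed right after an X."""
    total = Fraction(0)
    for j in range(1, b + 1):
        p = Fraction(1)
        for i in range(1, b + 1):
            if i < j:
                # No XY pair yet: c+i safe slots out of a+c+i.
                p *= Fraction(c + i, a + c + i)
            elif i == j:
                # This Y goes directly behind one of the a X's.
                p *= Fraction(a, a + c + i)
            else:
                # One XY pair exists, so there is one extra safe slot.
                p *= Fraction(c + i + 1, a + c + i)
        total += p
    return total


def p_exactly_one_closed(a, b, c):
    """Closed form a*b*(c+b)! (a+c)! / ((c+1)! (a+c+b)!)."""
    return a * b * Fraction(
        factorial(c + b) * factorial(a + c),
        factorial(c + 1) * factorial(a + c + b),
    )
```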

Subtracting these two from $1$ yields the probability that at least two $X$s are followed by a $Y$ as

$$ 1-\frac{(c+b)!(a+c)!}{c!(a+c+b)!}\left(1+\frac{ab}{c+1}\right)\;. $$
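For small values of $a,b,c$ the final formula can be compared against a brute-force enumeration of all distinct words. The sketch below assumes every distinct arrangement is equally likely; the function names are hypothetical, and the enumeration is only practical for short words.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial


def p_at_least_two_closed(a, b, c):
    """1 - (c+b)!(a+c)!/(c!(a+c+b)!) * (1 + ab/(c+1))."""
    base = Fraction(
        factorial(c + b) * factorial(a + c),
        factorial(c) * factorial(a + c + b),
    )
    return 1 - base * (1 + Fraction(a * b, c + 1))


def p_at_least_two_brute(a, b, c):
    """Enumerate distinct words and count those with at least two 'XY' pairs."""
    words = set(permutations("X" * a + "Y" * b + "Z" * c))
    hits = sum(1 for w in words if "".join(w).count("XY") >= 2)
    return Fraction(hits, len(words))


if __name__ == "__main__":
    print(p_at_least_two_closed(2, 2, 1), p_at_least_two_brute(2, 2, 1))
```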

joriki
  • @trueblueanil: See my comments there. – joriki Mar 01 '23 at 18:08
  • For the number of sequences with no X followed by Y, an application of stars and bars gives $\binom{a+c}{a}\binom{b+c}{b}$ in agreement with your first probability. Similarly, the number of sequences with exactly one X followed by Y is $a\binom{a+c}{a}\binom{b+c}{b-1}$. – Daniel Mathias Mar 01 '23 at 20:54
  • @DanielMathias: Thanks. There was an error in my answer – it now agrees with your result, but the derivation is much more complicated. I'd be interested in the stars and bars argument – I tried to find one but couldn't. – joriki Mar 01 '23 at 22:17
  • The $Z$s create $c+1$ bins into which we must place the $X$s and $Y$s. When we allow one $X$ to be followed by a $Y$, we then have $c+2$ bins for the remaining $b-1$ $Y$s. – Daniel Mathias Mar 02 '23 at 00:18
  • @DanielMathias: Nice, thank you. I added a note at the top of the answer to point to your more elegant solution. – joriki Mar 02 '23 at 02:44
  • @DanielMathias: I am a bit puzzled. Can you please work out how your formula and Joriki's match for the $0$ adjacent case for $a,b,c = 6,7,8$? – true blue anil Mar 02 '23 at 06:11
  • @trueblueanil: Those are counts, not probabilities; you need to divide them by $\binom{a+b+c}{a,b,c}$ to get the probabilities. – joriki Mar 02 '23 at 06:31
  • @Joriki: Of course ! Thanks ! – true blue anil Mar 02 '23 at 07:30

Reworked Answer

$\mathtt{Total\; arrangements:}$

$\dbinom{a+b+c}{a,b,c}$

$\mathtt{Arrangements\; where \; no\; b\; follows\; an\; a:}$

$\dbinom{a+c}{a}\dbinom{b+c}b$

[Lay out a line of $a$'s and $c$'s; this creates $(c+1)$ "bins" for the $b$'s: one to the right of each $c$, plus one at the start of the line]

$\mathtt{Arrangements\; where\; one\; b\; follows\; an\; a:}$

$a\dbinom{a+c}a\dbinom{b+c}{b-1}$

[ A $b$ is attached to the right of one of the $a$'s ($a$ choices); the slot right after that attached $b$ acts as one extra bin, so there are $(c+2)$ bins for the remaining $(b-1)$ $b$'s ]
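The two counts above can be checked by brute force for short words. This is a hedged sketch: it writes the three letters as X, Y, Z (as in the question) in place of this answer's $a$, $b$, $c$, and the helper names are invented for the example.

```python
from itertools import permutations
from math import comb


def count_by_xy_pairs(a, b, c, k):
    """Brute-force count of distinct words with exactly k occurrences of 'XY'."""
    words = set(permutations("X" * a + "Y" * b + "Z" * c))
    return sum(1 for w in words if "".join(w).count("XY") == k)


def none_formula(a, b, c):
    """C(a+c, a) * C(b+c, b): words in which no X is followed by a Y."""
    return comb(a + c, a) * comb(b + c, b)


def one_formula(a, b, c):
    """a * C(a+c, a) * C(b+c, b-1): words with exactly one X followed by a Y."""
    if b == 0:
        return 0  # no Y available, so no XY pair can occur
    return a * comb(a + c, a) * comb(b + c, b - 1)


if __name__ == "__main__":
    a, b, c = 2, 2, 1
    print(count_by_xy_pairs(a, b, c, 0), none_formula(a, b, c))
    print(count_by_xy_pairs(a, b, c, 1), one_formula(a, b, c))
```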

Putting the pieces together, P(at least two $b$'s follow an $a$)

$= 1 - {\dbinom{a+c}a\dbinom{b+c}b + a\dbinom{a+c}a\dbinom{b+c}{b-1}\over \dbinom{a+b+c}{a,b,c}}$

$=1-{\dbinom{b+c}b+a\dbinom{b+c}{b-1}\over\dbinom{a+b+c}{b}}$
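As a small sanity check of the counts listed above (a worked example added for illustration, not part of the original answer), take two $a$'s, two $b$'s and one $c$:

$$\dbinom{5}{2,2,1}=30,\qquad \dbinom{3}{2}\dbinom{3}{2}=9,\qquad 2\dbinom{3}{2}\dbinom{3}{1}=18.$$

That leaves $30-9-18=3$ arrangements in which a $b$ follows an $a$ at least twice, namely $cabab$, $abcab$ and $ababc$, for a probability of $\frac{3}{30}=\frac1{10}$.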

  • Doesn't this double count (and more)? If your word had exactly three incidents of $XY$, wouldn't you count it $3$ times? – lulu Mar 01 '23 at 12:42
  • @lulu: Doesn't it count all sequences with at least two of $XY$ ? – true blue anil Mar 01 '23 at 12:50
  • It overcounts. Say the string was $XYXYXY$ ... which of those are the two $XY$ pairs you inserted specially? – lulu Mar 01 '23 at 12:52
  • Why does it have to be the first $Y$ that's placed behind an $X$? – joriki Mar 01 '23 at 17:36
  • @Joriki: I am placing the $Y's$ turn by turn, all have to be behind $X's$ for none to be after an $X$ for the first case. – true blue anil Mar 01 '23 at 17:57
  • It seems we're using "after" and "behind" differently – I was using them synonymously. I was referring to the second case of exactly one $X$ being followed by a $Y$. Why does it have to be the first $Y$ that's placed in one of the $6$ ways that make it follow an $X$? (See my answer for how I think this needs to be treated, with each possible point at which a $Y$ is inserted such that it follows an $X$ leading to a different term.) – joriki Mar 01 '23 at 18:07
  • @DanielMathias: Did you get the first part $\binom{a+c}{a}\binom{b+c}{b}$ also through stars and bars ? If so, what is the argument ? – true blue anil Oct 06 '23 at 18:22
  • Yes, as explained in my comments under joriki's answer. – Daniel Mathias Oct 09 '23 at 13:38
  • @DanielMathias: That was for the second part, I am asking about the first part. – true blue anil Oct 09 '23 at 14:12
  • The $Z$s create $c+1$ bins into which we must place the $X$s and $Y$s. This is the first part. Any bin with both has the $Y$s before the $X$s. – Daniel Mathias Oct 09 '23 at 15:29
  • Not really "standard" stars and bars. – true blue anil Oct 09 '23 at 19:35