Find a recurrence relation for the number of bit strings of length $n$ by Goulden-Jackson

Question

I am working through the Goulden-Jackson method and trying it on every type of question I can find.

The following questions are from Rosen's Discrete Mathematics and Its Applications. I solved them the classical way, but I got stuck when I tried to solve them via Goulden-Jackson.

a-) Find a recurrence relation for the number of bit strings of length $n$ which do not contain three consecutive zeros

My work = The bad word is $000$, but it overlaps with itself in two ways, with overlaps $0$ and $00$. I do not know how to proceed when there are two different overlaps, so I got stuck there.

b-) How many bit strings of length $n$ contain either five consecutive zeros or five consecutive ones.

My work = I thought I could reach the solution as (all strings) minus (strings avoiding the bad words $00000$ and $11111$). This way I obtained $\frac{1}{1-2x} - \frac{1}{1-2x+2x^5}$, but when I convert it into a recurrence, it does not give the desired result.

c-) How many bit strings of length $n$ contain either three consecutive zeros or four consecutive ones.

My Work = It has the same logic as part $b$

I hope to find answers to my questions. Thanks for your help.

I just learned this method and enjoy it. Thanks for sharing. Sorry I didn't help with parts b and c. — user551774, Jun 12 '21 at 01:32
The explicit list of values is https://oeis.org/A000073 , see the comment by Deutsch of 2006 in that entry. — R. J. Mathar, Dec 07 '22 at 18:25

user551774 · Answer 1 · 2021-06-12T01:34:30.637

I will answer part (a). Parts (b) and (c) should be done similarly.

Using the Goulden Jackson method you mention, the generating function is

$$f(s) = \frac{1}{1-2s-w},$$

where $w$ is the weight of the clusters of bad words. The set of bad words in our case is just $\{000\}$, and $w$ satisfies

$$w = -s^3 - (s^2+s)w$$

since there are two overlaps of $000$ with $000$, one of length one and one of length two.
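To see where the factor $(s^2+s)$ comes from, it can help to list the self-overlaps mechanically. The following R sketch is only an illustration for a single bad word: the function names `self_overlaps` and `overlap_shifts` are made up for this example, and it assumes that each overlap of length $j$ of a word of length $k$ contributes the term $s^{k-j}$ to the cluster equation, as in the equation above.

```r
# Lengths j (1 <= j < k) for which the last j letters of `word`
# equal its first j letters, i.e. the proper self-overlaps.
# (Hypothetical helper, not part of any package.)
self_overlaps <- function(word) {
  k <- nchar(word)
  if (k < 2) return(integer(0))
  js <- seq_len(k - 1)
  is_overlap <- vapply(js, function(j) {
    substr(word, k - j + 1, k) == substr(word, 1, j)
  }, logical(1))
  js[is_overlap]
}

# Exponents of s that multiply w in the cluster equation:
# an overlap of length j shifts the word by k - j letters.
overlap_shifts <- function(word) nchar(word) - self_overlaps(word)

print(self_overlaps("000"))   # [1] 1 2
print(overlap_shifts("000"))  # [1] 2 1
```

For $000$ the shifts are $2$ and $1$, which matches the factor $s^2+s$; for $00000$ the same helper gives the shifts $4,3,2,1$.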

Solving for $w$ gives $w = \frac{-s^3}{1+s+s^2}$. Plugging this in gives the generating function

$$\frac{1}{1-2s+\frac{s^3}{1+s+s^2}} = \frac{1+s+s^2}{1-s-s^2-s^3}$$

The first few terms of the above are

$$1+2s+4s^2+7s^3+13s^4+24s^5+O(s^6),$$

which can be checked by direct enumeration for small $n$: for instance, the coefficient $7$ of $s^3$ counts all $8$ strings of length $3$ except $000$.
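Since the question asks for a recurrence, one can read it off this generating function. The following is a standard coefficient comparison, sketched under the assumption that $f(s)=\sum_{n\ge 0}a_ns^n$ with $a_n$ the number of valid strings of length $n$ (the symbol $a_n$ is not used in the answer above). Clearing the denominator gives

$$(1-s-s^2-s^3)\sum_{n\ge 0}a_ns^n = 1+s+s^2,$$

and comparing coefficients of $s^n$ on both sides yields

$$a_0=1,\qquad a_1-a_0=1,\qquad a_2-a_1-a_0=1,\qquad a_n-a_{n-1}-a_{n-2}-a_{n-3}=0\quad(n\geq 3),$$

so $a_0=1$, $a_1=2$, $a_2=4$ and $a_n=a_{n-1}+a_{n-2}+a_{n-3}$ for $n\geq 3$, which reproduces the coefficients $7, 13, 24$ listed above.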

I see what I missed, thanks for part a. If you have any work for parts b and c, I would be happy to see it, rather than just "in a similar way" :) — Not a Salmon Fish, Jun 12 '21 at 03:57

Markus Scheuer · Accepted Answer · 2021-06-21T20:09:00.243

Here we consider the cases (b) and (c). We apply the Goulden-Jackson Cluster Method following the presentation by J. Noonan and D. Zeilberger. We start with

Case (b):

We consider the set of words of length $n\geq 0$ built from the alphabet $\mathcal{V}=\{0,1\}$ and the set $B=\{00000,11111\}$ of bad words. We derive a generating function $A(x)$ with the coefficient of $x^n$ being the number of words of length $n$ avoiding bad words.

Since we are looking for the number of words of length $n$ which contain either $00000$ or $11111$, a generating function $B(x)$ for the number of wanted words is
\begin{align*}
B(x) = \frac{1}{1-2x}-A(x)\tag{1.1}
\end{align*}

According to the paper (p.7) the generating function $A(x)$ is
\begin{align*}
A(x)=\frac{1}{1-dx-\text{weight}(\mathcal{C})}\tag{1.2}
\end{align*}
with $d=|\mathcal{V}|=2$, the size of the alphabet, and $\text{weight}(\mathcal{C})$ the weight enumerator of the clusters of bad words, given by
\begin{align*}
\text{weight}(\mathcal{C})=\text{weight}(\mathcal{C}[00000])+\text{weight}(\mathcal{C}[11111])\tag{1.3}
\end{align*}

We calculate according to the paper
\begin{align*}
\text{weight}(\mathcal{C}[00000])&=-x^5-(x+x^2+x^3+x^4)\text{weight}(\mathcal{C}[00000])
\end{align*}
and get
\begin{align*}
\text{weight}(\mathcal{C}[00000])&=-\frac{x^5}{1+x+x^2+x^3+x^4}\\
&=-\frac{x^5(1-x)}{1-x^5}\\
\text{weight}(\mathcal{C}[11111])&=-\frac{x^5(1-x)}{1-x^5}
\end{align*}
From (1.3) we obtain
\begin{align*}
\text{weight}(\mathcal{C})&=\text{weight}(\mathcal{C}[00000])+\text{weight}(\mathcal{C}[11111])\\
&=-\frac{2x^5(1-x)}{1-x^5}
\end{align*}
It follows from (1.1) and (1.2)
\begin{align*}
B(x)&=\frac{1}{1-2x}-\frac{1}{1-dx-\text{weight}(\mathcal{C})}\\
&=\frac{1}{1-2x}-\frac{1}{1-2x+\frac{2x^5(1-x)}{1-x^5}}\\
&=\frac{1}{1-2x}-\frac{1-x^5}{1-2x+x^5}\\
&=2x^5+6x^6+16x^7+\color{blue}{40}x^8+96x^9+\cdots
\end{align*}

The last line was calculated with the help of Wolfram Alpha. The coefficient of $x^{8}$ shows there are $40$ words of length $8$ which do contain either $00000$ or $11111$.

The $\color{blue}{40}$ valid words of length $8$ are: \begin{align*} \begin{array}{ccccc} 00000000&00000001&00000010&00000011&00000100\\ 00000101&00000110&00000111&00011111&00100000\\ 00111110&00111111&01000000&01000001&01011111\\ 01100000&01111100&01111101&01111110&01111111\\ 10000000&10000001&10000010&10000011&10011111\\ 10100000&10111110&10111111&11000000&11000001\\ 11011111&11100000&11111000&11111001&11111010\\ 11111011&11111100&11111101&11111110&11111111 \end{array} \end{align*}
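The coefficients $2, 6, 16, 40, 96$ can also be cross-checked by brute force, in the same spirit as the R check used for case (c) further down. This is a minimal sketch, not the code used for the answer: the helper names `bit_strings` and `count_containing` are invented for this example, and the enumeration is only practical for small $n$ since it builds all $2^n$ strings.

```r
# All binary strings of length n (hypothetical helper built on expand.grid).
bit_strings <- function(n) {
  grid <- expand.grid(rep(list(c("0", "1")), n), stringsAsFactors = FALSE)
  do.call(paste0, grid)
}

# Number of length-n strings containing at least one of the given patterns.
count_containing <- function(n, patterns) {
  w <- bit_strings(n)
  hit <- Reduce(`|`, lapply(patterns, function(p) grepl(p, w, fixed = TRUE)))
  sum(hit)
}

# Counts for n = 5, ..., 9 with the bad words of case (b).
counts_b <- vapply(5:9, count_containing, numeric(1),
                   patterns = c("00000", "11111"))
print(counts_b)  # [1]  2  6 16 40 96
```

Using `fixed = TRUE` treats the patterns as plain substrings rather than regular expressions, which is all that is needed here.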

Case (c):

We consider the set of words of length $n\geq 0$ built from the alphabet $\mathcal{V}=\{0,1\}$ and the set $B=\{000,1111\}$ of bad words. We derive a generating function $C(x)$ with the coefficient of $x^n$ being the number of words of length $n$ avoiding bad words.

Since we are looking for the number of words of length $n$ which contain either $000$ or $1111$, a generating function $D(x)$ for the number of wanted words is
\begin{align*}
D(x) = \frac{1}{1-2x}-C(x)\tag{2.1}
\end{align*}

According to the paper (p.7) the generating function $C(x)$ is
\begin{align*}
C(x)=\frac{1}{1-dx-\text{weight}(\mathcal{C})}\tag{2.2}
\end{align*}
with $d=|\mathcal{V}|=2$, the size of the alphabet, and $\text{weight}(\mathcal{C})$ the weight enumerator of the clusters of bad words, given by
\begin{align*}
\text{weight}(\mathcal{C})=\text{weight}(\mathcal{C}[000])+\text{weight}(\mathcal{C}[1111])\tag{2.3}
\end{align*}

We calculate according to the paper
\begin{align*}
\text{weight}(\mathcal{C}[000])&=-x^3-(x+x^2)\text{weight}(\mathcal{C}[000])\\
\text{weight}(\mathcal{C}[1111])&=-x^4-(x+x^2+x^3)\text{weight}(\mathcal{C}[1111])
\end{align*}
and get
\begin{align*}
\text{weight}(\mathcal{C}[000])&=-\frac{x^3}{1+x+x^2}\\
&=-\frac{x^3(1-x)}{1-x^3}\\
\text{weight}(\mathcal{C}[1111])&=-\frac{x^4}{1+x+x^2+x^3}\\
&=-\frac{x^4(1-x)}{1-x^4}
\end{align*}
From (2.3) we obtain
\begin{align*}
\text{weight}(\mathcal{C})&=\text{weight}(\mathcal{C}[000])+\text{weight}(\mathcal{C}[1111])\\
&=-\frac{x^3(1-x)}{1-x^3}-\frac{x^4(1-x)}{1-x^4}
\end{align*}
It follows from (2.1) and (2.2)
\begin{align*}
D(x)&=\frac{1}{1-2x}-\frac{1}{1-dx-\text{weight}(\mathcal{C})}\\
&=\frac{1}{1-2x}-\frac{1}{1-2x+\frac{x^3(1-x)}{1-x^3}+\frac{x^4(1-x)}{1-x^4}}\\
&=\frac{1}{1-2x}-\frac{1+2x+3x^2+3x^3+2x^4+x^5}{1-x^2-2x^3-2x^4-x^5}\\
&=x^3+4x^4+11x^5+28x^6+\color{blue}{65}x^7+147x^8+\cdots
\end{align*}

The last line was calculated with the help of Wolfram Alpha. The coefficient of $x^{7}$ shows there are $65$ words of length $7$ which do contain either $000$ or $1111$.

The $\color{blue}{65}$ valid words of length $7$ are: \begin{align*} \begin{array}{ccccc} 0000000&0000001&0000010&0000011&0000100\\ 0000101&0000110&0000111&0001000&0001001\\ 0001010&0001011&0001100&0001101&0001110\\ 0001111&0010000&0010001&0011000&0011110\\ 0011111&0100000&0100001&0100010&0100011\\ 0101000&0101111&0110000&0110001&0111000\\ 0111100&0111101&0111110&0111111&1000000\\ 1000001&1000010&1000011&1000100&1000101\\ 1000110&1000111&1001000&1001111&1010000\\ 1010001&1011000&1011110&1011111&1100000\\ 1100001&1100010&1100011&1101000&1101111\\ 1110000&1110001&1111000&1111001&1111010\\ 1111011&1111100&1111101&1111110&1111111 \end{array} \end{align*}
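The series expansion for case (c) was obtained with Wolfram Alpha. As a rough cross-check, the same coefficients can be reproduced in R by long division of power series. The helper name `series_coeffs` is an assumption made for this sketch. The numerator used below is the product $(1+x+x^2)(1+x+x^2+x^3)$ and the denominator is $1-x^2-2x^3-2x^4-x^5$, both entered as coefficient vectors in increasing powers of $x$.

```r
# First n_terms coefficients of num(x) / den(x); polynomials are coefficient
# vectors in increasing powers, and den[1] must be nonzero.
# (Hypothetical helper, written for this illustration.)
series_coeffs <- function(num, den, n_terms) {
  num <- c(num, rep(0, n_terms))[seq_len(n_terms)]
  out <- numeric(n_terms)
  for (n in seq_len(n_terms)) {
    acc <- num[n]
    # subtract den[j + 1] * out[n - j] for j = 1, ..., min(n - 1, deg(den))
    for (j in seq_len(min(n - 1, length(den) - 1))) {
      acc <- acc - den[j + 1] * out[n - j]
    }
    out[n] <- acc / den[1]
  }
  out
}

# Words of length 0, ..., 8 avoiding both 000 and 1111.
avoiding <- series_coeffs(c(1, 2, 3, 3, 2, 1), c(1, 0, -1, -2, -2, -1), 9)
# Words containing 000 or 1111: all 2^n words minus the avoiding ones.
containing <- 2^(0:8) - avoiding
print(containing)  # [1]   0   0   0   1   4  11  28  65 147
```

The entry for $n=7$ is $65$, in agreement with the list of words above.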

Note: I've written some lines of code in R in order to check the coefficients and generate valid words.

```r
############################################################################
#
#  MSE 4170314
#
############################################################################
#
#  generate all combinations of length "len"
#  of elements given in "v"
#
combinations <- function(v, len) {
  cur_v <- v
  if (len > 1) {
    for (i in 1:(len - 1)) {
      next_v <- as.vector(outer(cur_v, v, paste, sep = ""))
      cur_v <- next_v
    }
  }
  return(cur_v)
}

#
#  main part
#
for (n in 1:12) {
  v <- c("0", "1")
  w <- combinations(v, n)
  w0 <- w[grepl("000", w)]    # words containing 000
  w1 <- w[grepl("1111", w)]   # words containing 1111
  w_res <- sort(unique(c(w0, w1)))
  print(paste(n, length(w_res)))
  if (n == 7) {
    print(paste(w_res, collapse = "&"))
  }
}
```

Thank you very much Markus, I realize that I made a stupid mistake when adding the generating functions to each other. By the way, when you wrote out the $65$ valid words, did you use any software or do it by hand? If you used a web page or anything else, can you share it? — Not a Salmon Fish, Jun 21 '21 at 18:53
@Bulbasaur: You're welcome! I typically write a code snippet in R to generate the valid words. — Markus Scheuer, Jun 21 '21 at 19:35
what is a possible application of these kinds of problems please? — Avv, Jun 22 '21 at 19:35
@MarkusScheuer I want to ask one last thing: what if part c were "How many bit strings of length n do not contain either three consecutive zeros or four consecutive ones"? We would take the generating function for bad word $(000)$ + the generating function for bad word $(1111)$ - the generating function for bad words $(000)$ and $(1111)$, right? — Not a Salmon Fish, Jun 22 '21 at 20:44
@Bulbasaur. Yeah, but we did not take it! Anyway, thank you. Could you please tell me where $D(x) = \frac{1}{1-2x}- C(x)$ came from? — Avv, Jun 22 '21 at 20:46
@Bulbasaur: I don't think this is correct, since we add the words which do not contain $000$ and which do not contain $1111$ twice, in each of the generating functions. — Markus Scheuer, Jun 22 '21 at 21:07
@Avra: The coefficient of $x^n$ in $\frac{1}{1-2x}=1+2x+4x^2+\cdots$ gives the number of all binary words of length $n$. The generating function $C(x)$ gives all the binary words avoiding subwords $000$ and $1111$. So, the difference $D(x)$ gives all binary words which contain either $000$ or $1111$. — Markus Scheuer, Jun 22 '21 at 21:16
@MarkusScheuer I think I have been misunderstood because of my bad English. When I said "do not contain either three consecutive zeros or four consecutive ones", it was understood as "neither ... nor ...". My actual question was about strings that "may not contain" three consecutive zeros or four consecutive ones. For example, the string can contain three consecutive zeros but not four consecutive ones, it can contain four consecutive ones but not three consecutive zeros, or it can contain neither of them. — Not a Salmon Fish, Jun 23 '21 at 07:43
@MarkusScheuer Lastly, if part c asked for strings that contain three consecutive zeros but not four consecutive ones, it would be (the generating function for strings without four consecutive ones) - (the generating function for strings with neither three consecutive zeros nor four consecutive ones). Is that right? MARKUSHELPME.COM — Not a Salmon Fish, Jun 23 '21 at 09:51
@Bulbasaur: I think, this is somewhat trickier. I propose you add a new question, so that the community can provide an appropriate answer. — Markus Scheuer, Jun 23 '21 at 17:22
@MarkusScheuer I found exactly what i meant , and i added a solution , https://math.stackexchange.com/questions/4056459/how-many-length-n-bitstrings-containing-3-consecutive-0s-and-4-consecuti/4191721#4191721 . It does not seem tricky — Not a Salmon Fish, Jul 06 '21 at 13:11

user551774 · Answer 3 · 2021-06-12T03:52:39.807


Not using the Goulden Jackson Method:

For part (a), let $a_n$ be the number of bit strings of length $n$ with no three consecutive $0$s. Assume $n\geq 4$ (for $n=1,2,3$, we'll need to calculate manually to find the initial conditions). By the sum rule, we split the cases into three possibilities for the starting bit(s).

$$a)\ 1..$$ $$b)\ 01..$$ $$c)\ 00..$$

In the first case, there are $a_{n-1}$ bit strings of length $n$ without $000$ that start with a $1$ (append any valid string of length $n-1$ to the $1$). For the second case, there are $a_{n-2}$. For the third case, the next bit must be a $1$, so there are $a_{n-3}$ ways there. The answer is thus

$$a_n = a_{n-1}+a_{n-2}+a_{n-3},$$

with initial conditions $a_1 = 2,a_2 = 4,a_3 = 7$.
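As a sanity check, the recurrence can be compared with a direct count for small $n$. The R sketch below is only illustrative: the function names `recurrence_counts` and `brute_force_count` are made up here, and the brute-force part enumerates all $2^n$ strings, so it is meant for small $n$ only.

```r
# a_1, ..., a_n_max from a_n = a_{n-1} + a_{n-2} + a_{n-3}
# with a_1 = 2, a_2 = 4, a_3 = 7 (hypothetical helper).
recurrence_counts <- function(n_max) {
  a <- c(2, 4, 7, numeric(max(n_max - 3, 0)))
  if (n_max >= 4) {
    for (n in 4:n_max) a[n] <- a[n - 1] + a[n - 2] + a[n - 3]
  }
  a[seq_len(n_max)]
}

# Direct count: enumerate all 2^n strings and keep those without "000".
brute_force_count <- function(n) {
  grid <- expand.grid(rep(list(c("0", "1")), n), stringsAsFactors = FALSE)
  w <- do.call(paste0, grid)
  sum(!grepl("000", w, fixed = TRUE))
}

rec <- recurrence_counts(10)
brute <- vapply(1:10, brute_force_count, numeric(1))
print(rbind(n = 1:10, recurrence = rec, brute_force = brute))
```

Both rows should agree, starting $2, 4, 7, 13, 24, 44, \ldots$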

edited Jun 12 '21 at 03:52

answered Jun 12 '21 at 01:31


Unfortunately, your answer is wrong; the true answer is $a_n=a_{n-1}+a_{n-2}+a_{n-3}$ by classical methods – Not a Salmon Fish Jun 12 '21 at 03:05
My solution is right, but yours is simpler. I will fix it thanks. – user551774 Jun 12 '21 at 03:48
It is not important, thanks. As I mentioned, I solved them the classical way like you did. However, if you have any suggestions for parts b and c, I will be happy – Not a Salmon Fish Jun 12 '21 at 03:50
