I'm working on a programming question that I believe can be solved with combinatorics, but my combinatorics answer appears to be incorrect.

The question is

An attendance record for a student can be represented as a sequence of characters where each character signifies whether the student was absent (A), late (L), or present (P) on that day. The record only contains the following three characters: A, L, P. Any student is eligible for an attendance award if they meet both of the following criteria: (1) The student was absent ('A') for strictly fewer than 2 days total. (2) The student was never late ('L') for 3 or more consecutive days.

For a given number of days $n$, how many attendance records make a student eligible for the award?

My approach is to recognize that we have $n$ positions to fill. There are several disjoint cases that satisfy the constraints:

(1) exactly 1 A and 0 L; $\binom{n}{n-1}$ sequences

(2) exactly 1 A and 1 L; $\binom{n}{n-2} * 2!$ sequences

(3) exactly 1 A and 2 L; $\binom{n}{n-3} * 3! / 2!$ sequences

(4) exactly 0 A and 0 L; $1$ sequence

(5) exactly 0 A and 1 L; $n$ sequences

(6) exactly 0 A and 2 L; $\binom{n}{2}$ sequences

For $n = 4$, I obtain $39$; however, the correct answer is apparently $43$.

I enumerated all the possible sequences:

(1) APPP, PAPP, PPAP, PPPA

(2) ALPP, LAPP, APLP, LPAP, PALP, PLAP, APPL, LPPA, PAPL, PLPA, PPAL, PPLA

(3) ALLP, LALP, LLAP, ALPL, LAPL, LLPA, APLL, LPAL, LPLA, PALL, PLAL, PLLA

(4) PPPP

(5) LPPP, PLPP, PPLP, PPPL

(6) LLPP, LPLP, PLLP, LPPL, PLPL, PPLL
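
As a sanity check, the enumeration can also be done by brute force. The following is only a minimal sketch: the helper names `is_eligible` and `eligible_records` are made up for illustration, and the approach is only practical for small $n$ since it walks all $3^n$ strings.

```python
from itertools import product


def is_eligible(record):
    """Award rule: fewer than 2 'A' in total and no 3 consecutive 'L'."""
    return record.count("A") < 2 and "LLL" not in record


def eligible_records(n):
    """All length-n records over A, L, P that satisfy the award rule."""
    candidates = ("".join(chars) for chars in product("ALP", repeat=n))
    return [record for record in candidates if is_eligible(record)]


records = eligible_records(4)
print(len(records))  # 43

# Eligible records that contain three L's, none of them forming a run of 3.
print(sorted(r for r in records if r.count("L") >= 3))
# ['LALL', 'LLAL', 'LLPL', 'LPLL']
```

The second print lists exactly the records that the six cases above do not cover, which accounts for the gap between $39$ and $43$.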

24n8
  • In general, the problem with your approach is that you don't take into account that the student could be late on 3 days as well, as long as the days are not consecutive. Therefore $LPLL,LLPL,LLAL,LALL$ are the four remaining possibilities. Tracking these kinds of combinations for arbitrary $n$ is also not easy – LegNaiB Jun 04 '21 at 22:10
  • @LegNaiB Ahh I overlooked the "consecutive" qualifier. I think that makes this problem a lot more difficult analytically? Can a closed form solution be obtained? – 24n8 Jun 04 '21 at 22:11
  • I suggest temporarily ignoring the absentee issue and using, as a starting point, recursion to identify the total number of sequences of length $n$ that do not contain 3 consecutive (L)'s. A satisfying sequence of length $n$ either ends in two (L)'s, or it doesn't. If it does end in two (L)'s, then the $(n+1)$-th character cannot be an (L). If it does not end in two consecutive (L)'s, then the $(n+1)$-th character can be anything. You will have to keep track of how many satisfying sequences there are, as well as the count of the subset of such sequences that end in (LL). – user2661923 Jun 04 '21 at 22:18
  • I think you should ignore the absences at first. Let $a_{n,k}$ be the number of $n$-character strings of P's and L's with $k$ L's and no $3$ consecutive L's. If you can compute that, you should be well on your way, because there are then $(n-k+1)a_{n,k}$ strings with $k$ L's and at most one A. – saulspatz Jun 04 '21 at 22:21
  • Even if the above approach works (I am referring to my previous comment, not saulspatz' comment), the difficulty gets compounded by having to re-consider how many such sequences there are with exactly $0$ or $1$ absences. – user2661923 Jun 04 '21 at 22:21
  • I (also) regard the comment of @saulspatz as indicating a method preferable to mine, if it is do-able. This is because, if you know the number of satisfying sequences, but these sequences have a variable number of (L)'s, then determining how many ways there are of converting one of the (P)'s to an A may be problematic. – user2661923 Jun 04 '21 at 22:27
  • A pseudo-alternative approach is to use a computer program (e.g. in Java, C, Python, ...) to count the number of satisfying sequences, as a function of $n$, for $n \in \{3,4,5, \cdots, 10\}.$ The idea would be to look for a pattern in the data, use the pattern to construct a hypothesis, sanity-check the hypothesis against (for example) $n \in \{11,12,13,14\}$, and then analytically prove the hypothesis. – user2661923 Jun 04 '21 at 22:32
  • @user2661923 I don't know if it's doable or not. I would split $a_{n,k}$ into $3$ cases: the last character is P, the last two characters are PL, the last two characters are LL, and then write down the recurrences. That would, at least, make it easy to extend the computer program to large values of $n$. – saulspatz Jun 04 '21 at 22:37
  • At the risk of beating a dead horse, a hybrid approach is plausible. The difficulty with the computer programming approach is in using the data to find a pattern, and then constructing a hypothesis from this pattern. The hybrid approach would be to combine the computer programming approach $\to f(n)$, with the [assume $0$ absences] analytical-recursion approach $\to g(n)$, and then attempting to analytically compare $f(n)$ with $g(n)$. – user2661923 Jun 04 '21 at 22:54
  • One can solve the problem for no three consecutive late days analytically, but it's unpleasant. The recursion is $a_n=a_{n-1}+a_{n-2}+a_{n-3}$ and the roots are complicated. I haven't tried to figure out the actual formula, because I'm sure one would end up using the recurrence anyway. I think this is a programming problem, and that my first suggestion, implemented on a computer, is the way to go. – saulspatz Jun 04 '21 at 23:10
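
The state-tracking idea from the comments (remember how many L's the record currently ends in) can be sketched as a small dynamic program. This is an illustration under assumptions, not code from any of the commenters: the function name `count_awards` and the table layout `state[a][l]` are invented here, and the absence count is tracked as an extra state rather than handled afterwards.

```python
def count_awards(n):
    """Count length-n records with at most one 'A' and no 'LLL'.

    state[a][l] = number of valid prefixes with a absences (0 or 1)
    that currently end in exactly l consecutive L's (0, 1 or 2).
    """
    state = [[1, 0, 0], [0, 0, 0]]  # only the empty record to start with
    for _ in range(n):
        new = [[0, 0, 0], [0, 0, 0]]
        for a in range(2):
            total = sum(state[a])
            new[a][0] += total          # append P: the run of L's resets
            if a == 0:
                new[1][0] += total      # append A: allowed once, run resets
            new[a][1] += state[a][0]    # append L after a non-L character
            new[a][2] += state[a][1]    # append L after exactly one L
        state = new
    return sum(map(sum, state))


print(count_awards(4))  # 43
```

Each day costs a constant amount of work, so the count for large $n$ is immediate; Python's arbitrary-precision integers avoid overflow.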

2 Answers

As I said in a comment, I think the best way to do this is with recursion, ignoring the absences, at first.

Let $a_{n,k}$ be the number of admissible $n$-bit strings, that is, $n$-bit binary strings with exactly $k$ $1$'s, but without $3$ consecutive $1$'s. We may write $$a_{n,k}=b_{n,k}+c_{n,k}+d_{n,k},$$ where $b_{n,k}$ is the number of admissible $n$-bit strings that end in $0$, $c_{n,k}$ the number that end in $01$, and $d_{n,k}$ the number that end in $011$. (For the short strings $1$ and $11$ to be counted, treat the empty string as ending in $0$, so that $b_{0,0}=1$.)

Now, $c_{n,k}=b_{n-1,k-1}$ and $d_{n,k}=b_{n-2,k-2}$, so we just need a formula for $b_{n,k}$. Obviously, we can append a $0$ to any admissible string and get another admissible string with the same number of $1$'s, so $$\begin{align} b_{n,k}&=b_{n-1,k}+c_{n-1,k}+d_{n-1,k}\\ &=b_{n-1,k}+b_{n-2,k-1}+b_{n-3,k-2} \end{align}$$
and we have our recurrence.

Once we know $a_{n,k}$ it's easy to allow for absences. Either there are no absences, or one of the $n-k$ days when the student wasn't late is replaced by an absence. That is, the number of possible ways to gain an award is $$\sum_{k=0}^n(n-k+1)a_{n,k}$$

I wrote a Python script to implement this, and it instantaneously computes that there are $20795180176044632893334206971$ ways to win an award if the school year is $100$ days.
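
The script itself is not shown here. What follows is a minimal sketch of how the recurrence could be coded, not the original script. The function names `b`, `a` and `count_awards` are chosen to mirror the notation above, and the base case $b_{0,0}=1$ (the empty string counted as "ending in $0$") is an assumption needed to make the boundary terms come out right.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def b(n, k):
    """Admissible n-bit strings with k ones that end in 0.

    Assumed convention: the empty string counts as ending in 0, so b(0, 0) = 1.
    """
    if n < 0 or k < 0 or k > n:
        return 0
    if n == 0:
        return 1  # k == 0 here
    return b(n - 1, k) + b(n - 2, k - 1) + b(n - 3, k - 2)


def a(n, k):
    """Admissible n-bit strings with k ones: end in 0, in 01, or in 011."""
    return b(n, k) + b(n - 1, k - 1) + b(n - 2, k - 2)


def count_awards(n):
    """Sum over the number of late days k; (n - k + 1) places the optional absence."""
    return sum((n - k + 1) * a(n, k) for k in range(n + 1))


print(count_awards(4))  # 43
```

Calling `count_awards(100)` should return immediately, and its result can be compared against the figure quoted above.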

As you might expect, the values are roughly exponential. I plotted the logarithms of the values for $0\leq n\leq100$ and this is what I got:

[Figure: plot of the logarithm of the number of ways to win an award against $n$, for $0\leq n\leq100$]

EDIT

I've realized it is possible, at least in theory, to get an explicit formula. Let $p_n$ be the number of admissible strings with no absences. Such a string ends in P, or in PL, or in PLL, so we have $$p_n=p_{n-1}+p_{n-2}+p_{n-3},\ n\geq3$$ and $p_0=1,\ p_1=2,\ p_2=4$. With the aid of a CAS, we can get an explicit formula for $p_n$ by the usual method, although the formula is much too complicated to reproduce here.
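
For reference, the "usual method" gives a formula of the following shape. The symbols $r_i$ and $c_i$ are introduced here for illustration only, and no attempt is made to write the constants out: $$p_n=c_1r_1^n+c_2r_2^n+c_3r_3^n,\qquad r_i^3=r_i^2+r_i+1,$$ where $r_1\approx1.8393$ is the real root of the characteristic equation, $r_2$ and $r_3$ are its complex-conjugate roots, and the constants $c_1,c_2,c_3$ are fixed by the three initial conditions $p_0=1,\ p_1=2,\ p_2=4$.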

Now let $a_n$ be the number of admissible strings with at most one absence. Then $$a_n=p_n+\sum_{k=0}^{n-1}p_kp_{n-k-1}\tag1$$ because either there are no absences or there is a single absence with $k$ days before it and $n-k-1$ days after it, where each of those two stretches has no absences and no $3$ consecutive late days.
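
Formula $(1)$ can also be evaluated exactly if $p_n$ is taken from its recurrence rather than from a closed form. The sketch below does that; it is an illustration only, and the names `no_absence_counts` and `count_awards` are made up here.

```python
def no_absence_counts(n):
    """Return [p_0, ..., p_n], where p_m = p_{m-1} + p_{m-2} + p_{m-3}."""
    p = [1, 2, 4]  # p_0, p_1, p_2
    for m in range(3, n + 1):
        p.append(p[m - 1] + p[m - 2] + p[m - 3])
    return p[: n + 1]


def count_awards(n):
    """Evaluate a_n = p_n + sum_{k=0}^{n-1} p_k * p_{n-k-1} with exact integers."""
    p = no_absence_counts(n)
    one_absence = sum(p[k] * p[n - k - 1] for k in range(n))
    return p[n] + one_absence


print(count_awards(4))  # 13 + 30 = 43
```

For $n=4$ the no-absence term is $p_4=13$ and the convolution is $7+8+8+7=30$, which gives the expected $43$.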

Given an explicit formula for $p_n$, $(1)$ is the explicit formula alluded to above. I implemented $(1)$ in sympy, but it's impossibly slow. There is a noticeable delay before it computes the value for $n=4$. I also tried computing the roots and coefficients numerically. This gives much faster computation, of course, but the results are not completely accurate. My script gives the right answers for $0\leq n\leq 52$, but after that, floating-point errors creep in. The relative error is very small, at least for $n\leq 100$, though the absolute error becomes large.

My original approach seems better to me.

saulspatz
  • @pilgrim Sorry, I had written about it in a comment, and I must have thought I wrote about it in the answer, too. I'll fix it. – saulspatz Jun 05 '21 at 14:28

This is a job for inclusion and exclusion: if the student has 2 or more 'A' (in any positions) or 3 or more consecutive 'L' (one or more times), they don't get the award. You then need to compute how many records are in both situations, and from this get the number of ways of not getting the award. It gets messy, but it is doable.

vonbrand