1

I am currently using an ILP to model events which occur on some input sequence from $1...n$. These events modify the input sequence in order to obtain a desired sequence. Each event can happen on some consecutive range $(i,j)$ and no two events can overlap. Therefore, I can represent events by some matrix $a=\{0, 1\}^{n\times n}$ where $a_{i,j} = 1$ iff there is an event that begins at $i$ and ends at $j$ ($i<=j$).

Unfortunately, this results in $O(n^2)$ variables and in some cases $O(n^3)$ or greater number of constraints. It would be ideal if instead I could represent these events as a row $a'=\{0,1\}^n$, where $a'_i = 1 \iff \exists a_{p,q}\quad \text{s.t.}\quad p\leq i\leq q \land a_{p,q} = 1$
i.e. a position $a'_i=1$ iff there is an event that modifies position $i$.

Since I am optimizing over the number of events, at the moment I am minimizing $\sum a_{p,q}$, since it represents the number of events. If I ditch $a$ and try to only use $a'$, I am left with some row, for example $a'=[1,1,0,...,1,0,1]$.

Question 1: I want to represent the total number of events given $a'$. Currently, I am thinking $\sum a_i(1-a_{i+1})$ to count the number of "ends" of events. Obviously I can deal with the edge case as well. Unfortunately, this results in a product of binary variables. I know that I can represent this with a helper variable, and just sum over that, but I was wondering if there was any more efficient ways.

Question 2: This question is more important, since my whole model revolves around this. An event will only affect the input elements after the end of the event. Think of events as duplications. If some range $(i,j)$ is duplicated, all elements after $j$ would shift to the right by $j-i+1$ slots. Following this idea of events being duplications, consider the input below:

$a,b,c,d,e$ and the desired sequence $a,b,a,b,c,d,e,d,e$. We can achieve this by duplicating $a,b$ and $d,e$ i.e. $a'=[1,1,0,1,1]$. Notice how positions $0$ and $1$ did not shift at all, positions $2,3,4,5,6$ were originally $0,1,2,3,4$, and positions $7$ and $8$ were originally $3$ and $4$.

Essentially, for every position $i$ in the resulting sequence, I would like to be able to determine which position $j$ it came from in the original sequence. Consider position $6$ in the resulting sequence, we can clearly see it came from position $4$. We notice that this shift of $2$. Similarly, position $5$ came from position $3$. A pattern unfolds such that the shift of position $i$ is always equal to the $\sum_{j<k} a'_j$ where $k$ is the largest index such that $k\leq i \land a_k=0$. This is the same as saying that we are taking the sum of all values of $a'$ up to the most recent $0$ before or at $i$ in $a'$.

How can I model such shifting in an ILP? I.e. given $a’$, how can I create a variable $h$ in the ILP such that $h_{r, l}=1$ iff $\sum_{j<k} a'_j= l$ where $k$ is the largest index such that $k\leq r \land a_k=0$.

Throckmorton
  • 1,039
  • 1
  • 8
  • 21

1 Answers1

1

I think this can be done with $9n$ variables, instead of $n^2$. Define the following integer variables for use in your linear program:

  • $a_i = 1$ if index $i$ is within an event
  • $s_i = 1$ if an event starts at index $i$
  • $e_i = 1$ if an event ends at index $i$
  • $c_i = $ the number of positions until the end of this event (so that index $i+c_i-1$ is the end of the event containing $i$), if $i$ is in an event, or $0$ if $i$ is not in an event
  • $d_i = $ is how many positions since the start of the event (so that index $i-d_i+1$ is the start of the event containing $i$), if $i$ is in an event, or $0$ if $i$ is not in an event
  • $\ell^1_i = $ the length of the event containing $i$, if $i$ is the end of an event, or $0$ if $i$ is not the end of an event
  • $\ell^2_i = $ the length of the event containing $i$, if $i$ is the start of an event, or $0$ if $i$ is not the start of an event
  • $o^1_i = $ the amount index $i$ is shifted when copying to the new array, for the first copy of that element
  • $o^2_i = $ the amount index $i$ is shifted when copying to the new array, for the second copy of that element

These can be enforced with the following equations:

  • $s_i = a_i \land \neg a_{i-1}$ (see Express boolean logic operations in zero-one integer linear programming (ILP) for how to enforce this)

  • $e_i = a_i \land \neg a_{i+1}$

  • $c_i \ge c_{i+1} + 1 - n s_{i-1}$, $c_i \le c_{i+1} + 1$, $c_i \le n a_i$, $c_i \ge e_i$, $c_i \ge 0$

  • $d_i \ge d_{i-1} + 1 - n e_{i-1}$, $d_i \le d_{i-1} + 1$, $d_i \le n a_i$, $d_i \ge s_i$, $d_i \ge 0$

  • $\ell^1_i \le n e_i$, $\ell^1_i \ge d_i - n(1-e_i)$, $\ell^1_i \ge 0$

  • $\ell^2_i \le n s_i$, $\ell^2_i \ge c_i - n (1-s_i)$, $\ell^2_i \ge 0$

  • $o^1_i = o^1_{i-1} + \ell^1_{i-1}$, $o^1_0 = 0$

  • $o^2_i = o^2_{i-1} + \ell^2_i$, $o^2_0 = 0$

Then the total number of events is $s_1+\dots+s_n$, and you can model shifting as follows: index $i$ in the original array is shifted to indices $i+o^1_i$ and $i+o^2_i$ in the new array.

Do check my work carefully. There could easily be off-by-one errors or other mistakes in the above.

Throckmorton
  • 1,039
  • 1
  • 8
  • 21
D.W.
  • 167,959
  • 22
  • 232
  • 500