I'm working on a Mixed-Integer-Programing (MIP) formulation for a flow shop scheduling problem. One of the requirements/wishes is that for each machine $i$, processing should be contiguous, or at least contiguous to a certain extent. What I mean with contiguous processing is visualized below for a single machine:
High contiguity :
Low contiguity: 
Some important details of this problem:
- Not all jobs visit all machines. For each machine $i$, the jobs that visit it are a member of the set $\mathcal{V}_i$.
- Operations are not allowed to overlap on a machine.
I would like to influence the contiguity somehow by including it as a part of my cost function. My approach so far was:
- For each operation $(i,j)$ of job $j$ on machine $i$, I do have a decision variable $c_{i,j}$ for its completion time. What's more, binary decision variable $x_{i,j,k}$ equals 1 IFF job $j$ precedes job $k$ on machine $i$ (not necessarily directly).
- I introduce a continuous variable $\gamma_{i,j,k}$ that captures the difference between the completion time of operation $(i,j)$ and the start of operation $(i,k)$. That is $$\gamma_{i,j,k} = \quad (c_{i,k} - p_{i,k}) - c_{i,j}\quad \forall i; j\neq k\in\mathcal{V}_i\times \mathcal{V}_i,$$ where $p_{i,k}$ is the processing time of operation $(i,k)$ so that the first term denotes the corresponding starting time. Note that $\gamma_{i,j,k}$ is negative when $k$ precedes $j$ on machine $i$.
- I introduce a binary variable $b_{i,j,k}$ that equals 1 IFF $\gamma_{i,j,k}\leq 0$. I model this via big-M formulations with sufficiently large constant $Q$: $$Q\cdot\big(1-b_{i,j,k}\big)-\gamma_{i,j,k} \geq \quad 0 \quad \forall i; j\neq k\in\mathcal{V}_i\times \mathcal{V}_i$$ $$Q\cdot\big(1-b_{i,j,k}\big)-\gamma_{i,j,k} \leq Q-\epsilon \quad \forall i; j\neq k\in\mathcal{V}_i\times \mathcal{V}_i$$
- I then add the following term to my cost function over which I minimize: $$-\sum_i\sum_{j\neq k\in\mathcal{V}_i\times \mathcal{V}_i} x_{i,j,k}\cdot b_{i,j,k}.$$ By multiplying $b_{i,j,k}$ with $x_{i,j,k}$ I make sure that only those $b_{i,j,k}$ values contribute where job $j$ actually precedes job $k$ on machine $i$. In other words, for each machine $i$, for each pair of jobs $j,k$ ($j$ precedes $k$ on machine $i$), I look at the time inbetween their completion and start and add $1$ as penalty whenever this time is positive, that is, whenever these jobs do not start immediately after each other.
Note: whenever I use the indices $j,k$ simultaneously, I assume $j\neq k$.
Question:
Not sure if I can ask this here, but I was wondering if I could get some feedback on this. I observe that including this in my cost function dramatically slows up the solving time. Maybe someone has an idea for improvement, or a whole different approach to obtain the same?
Edit:
One issue here is that I do not know which jobs directly succeed each other. If I knew, it would be much easier to solve the problem. Then I would only have to iterate over all pairs of directly consecutive jobs, instead of all possible pairs for every machine. However, given the mentioned decision variables, I don't see how I can determine this.