Mixed integer model predictive control for exploration and exploitation planning

Question

I have a dynamical system $\mathbf{x}_{k+1}=\mathbf{f}(\mathbf{x}_k,\mathbf{u}_k)$ tracking a pre-computed trajectory $\mathbf{x}_t = (\mathbf{x}_{t,1},\cdots,\mathbf{x}_{t,K})$. Suppose the model is a simple kinematic one with state $\mathbf{x}=\begin{bmatrix}x & y & v & \theta & \omega\end{bmatrix}^T$, where $\mathbf{p} = \begin{bmatrix}x & y\end{bmatrix}^T$ are the position coordinates, $v$ is the speed, $\theta$ is the heading and $\omega$ is the angular rate. The control inputs are the linear acceleration and the angular acceleration.
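For concreteness, here is a minimal sketch of one step of such a kinematic model. The forward-Euler discretization, the step size `dt`, and the input names `a` (linear acceleration) and `alpha` (angular acceleration) are illustrative assumptions, not necessarily the exact $\mathbf{f}$ used in my implementation.

```python
import math

def f(x, u, dt=0.1):
    """One forward-Euler step of the kinematic model (assumed discretization).

    x = (x, y, v, theta, omega), u = (a, alpha).
    """
    px, py, v, theta, omega = x
    a, alpha = u
    return (
        px + dt * v * math.cos(theta),  # x position
        py + dt * v * math.sin(theta),  # y position
        v + dt * a,                     # speed driven by linear acceleration
        theta + dt * omega,             # heading driven by angular rate
        omega + dt * alpha,             # angular rate driven by angular acceleration
    )

# Start at the origin, moving along +x at 1 m/s, and apply a small input.
x0 = (0.0, 0.0, 1.0, 0.0, 0.0)
x1 = f(x0, (0.5, 0.2))
```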

At some time step $k_m$ the system is supposed to leave the trajectory, visit a view-point $\mathbf{p}_l = \begin{bmatrix}x_l & y_l\end{bmatrix}^T$, and return to the pre-computed trajectory once it has made $T_{in}$ measurements at the view-point. We say that the system visits the view-point when $\mathbf{p} \in \{\mathbf{x}\mid(\mathbf{x}-\mathbf{p}_l)^T\mathbf{F}(\mathbf{x}-\mathbf{p}_l) \leq 1\}$, where $\mathbf{F}$ is a symmetric PSD matrix.
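The visit condition is a plain quadratic-form test on the position. A small sketch, where the function name `in_viewpoint` and the numeric values of $\mathbf{F}$ and $\mathbf{p}_l$ are made up for illustration:

```python
def in_viewpoint(p, p_l, F):
    """Return True if (p - p_l)^T F (p - p_l) <= 1 for a 2x2 matrix F."""
    dx = p[0] - p_l[0]
    dy = p[1] - p_l[1]
    q = F[0][0] * dx * dx + (F[0][1] + F[1][0]) * dx * dy + F[1][1] * dy * dy
    return q <= 1.0

# Hypothetical view-point region: ellipse centered at (2, 3)
# with semi-axes 0.5 along x and 1.0 along y.
F = [[4.0, 0.0], [0.0, 1.0]]
p_l = (2.0, 3.0)

inside = in_viewpoint((2.4, 3.0), p_l, F)   # quadratic form is about 0.64
outside = in_viewpoint((2.6, 3.0), p_l, F)  # quadratic form is about 1.44
```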

My idea was to formulate this as a mixed-integer model predictive control problem - as the title also states. I have some other formulations where I fix the time to reach the view-point and return time, but I would like to minimize the time to reach the view-point and hence time to return to the trajectory.

I have come up with the following formulation, where $N$ is the prediction horizon:

$$ \begin{split} \min_{\substack{\zeta_k\in\{0,1\},\ \mathbf{x}_k\in\mathbb{R}^{n_x}, \\ \mathbf{u}_k\in\mathbb{R}^{n_u},\ s_k\in\mathbb{R}}} \quad& \sum_{k=k_m}^{k_m+N}\left[W_{\zeta}\zeta_k s_k + (1-\zeta_k)\|\mathbf{x}_k-\mathbf{x}_{t,k}\|_{\mathbf{W}_g}^2\right] \\ \mathrm{s.t.} \quad& \mathbf{x}_{k+1} = \mathbf{f}(\mathbf{x}_k,\mathbf{u}_k) \\ \quad& T_{in} \leq \sum_{j=k_m}^{k_m+N}\zeta_j \\ \quad& \left(\mathbf{p}_k-\mathbf{p}_l\right)^T\mathbf{F}\left(\mathbf{p}_k-\mathbf{p}_l\right) - s_k\leq 1 \\ \quad& 1 - s_k \leq M\zeta_k\\ \quad& \zeta_k \leq \zeta_{k-1} \quad \forall k=k_m,\dots,k_m+N \end{split} $$

The binary decision variables $\zeta_k$ are meant to select which objective the system has at time step $k$. When $\zeta_k = 0$ it should minimize the distance to the trajectory, and when $\zeta_k = 1$ it should minimize the distance to the view-point via the slack variable $s_k$. I intended $\zeta_k$ to be driven by the constraint $1-s_k\leq M \zeta_k$, where $M$ is a large constant.
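To make the intended logic concrete, here is a single-step sketch of what the two slack constraints and the stage cost allow for each value of $\zeta_k$. The helper names `min_slack` and `stage_cost` and all numbers are illustrative assumptions; `q` stands for the quadratic form $(\mathbf{p}_k-\mathbf{p}_l)^T\mathbf{F}(\mathbf{p}_k-\mathbf{p}_l)$ and `track_err` for the weighted tracking error. This only enumerates one time step and ignores the dynamics and the coupling constraints on $\zeta_k$.

```python
def min_slack(zeta, q, M):
    """Smallest s satisfying both slack constraints at one time step.

    Ellipse constraint: q - s <= 1      ->  s >= q - 1
    Big-M constraint:   1 - s <= M*zeta ->  s >= 1 - M*zeta
    """
    return max(q - 1.0, 1.0 - M * zeta)

def stage_cost(zeta, s, track_err, W_zeta):
    """Stage cost W_zeta*zeta*s + (1 - zeta)*track_err from the formulation."""
    return W_zeta * zeta * s + (1 - zeta) * track_err

M, W_zeta = 1e3, 1.0

# Far from the view-point (q = 9) while exactly on the trajectory.
far_zeta0 = stage_cost(0, min_slack(0, 9.0, M), 0.0, W_zeta)
far_zeta1 = stage_cost(1, min_slack(1, 9.0, M), 0.0, W_zeta)

# Inside the view-point region (q = 0.5), off the trajectory (error 4).
near_zeta0 = stage_cost(0, min_slack(0, 0.5, M), 4.0, W_zeta)
near_zeta1 = stage_cost(1, min_slack(1, 0.5, M), 4.0, W_zeta)
```

Note that with $\zeta_k = 0$ the slack does not enter the cost at all, so any $s_k \geq \max(q-1,\,1)$ is feasible at no cost, and while the system is still far from the view-point the $\zeta_k = 1$ branch is the more expensive one.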

The formulation does not give the expected result, so something is missing. When solved with Bonmin via CasADi, it simply returns $\zeta_k = 0$ for all $k$, and the system just tracks the trajectory.

What am I missing? Could it be that, because of how $\zeta_k$ enters the cost, the solver sees no decrease in cost when $\zeta_k=1$? Or is mixed-integer programming not the right approach to this problem?

