Binary matrix multiplication optimization problem

Question

I am looking for pointers to and names of computational approaches to solve a binary matrix optimization problem of:

$$ minimize: ||\mathbf{X}\mathbf{Y} - \mathbf{T}||_{L1} $$ where $\mathbf{X}$ and $\mathbf{Y}$ are binary matrices and the definition of the matrix multiplication allows for integers (is not mod-2).

$\mathbf{X}$ is $\{0,1\}^{H \times M}$
$\mathbf{Y}$ is $\{0,1\}^{M \times W}$
$\mathbf{T}$ is $[\![0,N]\!]^{H \times W}$

The only known is $\mathbf{T}$. The $M$ dimension of $\mathbf{X}$, $\mathbf{Y}$ is not set and can "grow" for a better solution. Eventually, there may be a desire to find an optimal smallest size of the M dimension of $\mathbf{X}$, $\mathbf{Y}$ with a given value of the objective function.

One approach I have considered is that the dimension $M$ is set by the largest value in $\mathbf{T}$ and the corresponding elements in $\mathbf{X}$, $\mathbf{Y}$ are initialized to $1$ and then an optimization algorithm begins. In this case, would I consider a vector input to an optimization algorithm (e.g. scipy.optimize.minimize) and reshape this $1 \times (H*M+W*M)$ vector into matrices when evaluating the objective function?

I have looked at the following computational packages but am not sure of the best functions within those packages:

YALMIP: MATLAB package for optimization and linear programming.
Python package cvxopt specifically ilp from cvxopt.glpk
scipy.optimize.minimize with bounds

I have also considered solving for $\textbf{X}$ (or $\textbf{Y}$) as all positive reals (constrained by some maximum) and then converting $\textbf{X}$ to a binary matrix by growing the M dimension of $\textbf{X}$ based on a "resolution of the discretization". For example, if a matrix element was found to be 0.50 and the resolution is set to 0.1 the resulting binary A matrix would have a run of 5 1s. To use this approach I would solve for the $\textbf{X}$ after an in random guess for $\textbf{Y}$ using a conventional matrix multiplication solver from numpy such as

> Y = np.random.randint(2,size=(3,4))
> Y_pinv = np.linalg.pinv(Y) 
> X = np.dot(T, Y_pinv)

Yet, since $\textbf{Y}$ is not optimized the solution is far from optimal.

This is a quadratic integer program. This is an NP hard problem. Check this https://en.wikipedia.org/wiki/Quadratic_programming#Integer_constraints — KBS, May 21 '22 at 14:39
Thank you @KBS, I agree with your assessment. However, I am not seeing how to recast this to the problem of minimize: $\frac{1}{2} \mathbf{x^T} Q \mathbf{x} + \mathbf{c^T} \mathbf{x}$ — ljk07, May 22 '22 at 13:31
You do not. First of all, you need to replace your cost by a sum of slack/lifting variables to remove the L1-norm, and relegate the terms in the $L_1$-norm into quadratic constraints. The use of slack/lifting variables is standard; see e.g. https://math.stackexchange.com/questions/4444439/minimize-trace-of-quadratic-inverse-lasso See also https://en.wikipedia.org/wiki/Quadratically_constrained_quadratic_program — KBS, May 22 '22 at 13:37

score 0 · Accepted Answer · answered May 22 '22 at 14:29

For fixed $M$, you want to minimize $$\sum_{i=1}^H \sum_{j=1}^W \left|\sum_{k=1}^M X_{i,k} Y_{k,j} - T_{i,j}\right|. \tag1$$ You can linearize the absolute value in $(1)$ by introducing nonnegative decision variables $E^+_{i,j}$ and $E^-_{i,j}$, yielding the problem of minimizing $$\sum_{i=1}^H \sum_{j=1}^W (E^+_{i,j} + E^-_{i,j}) \tag2$$ subject to $$\sum_{k=1}^M X_{i,k} Y_{k,j} - T_{i,j} = E^+_{i,j} - E^-_{i,j} \quad \text{for all $i$ and $j$}. \tag3$$ You can linearize this quadratic constraint $(3)$ by introducing binary (or just nonnegative) decision variables $Z_{i,k,j}$ to represent $X_{i,k} Y_{k,j}$ and imposing linear constraints \begin{align} \sum_{k=1}^M Z_{i,k,j} - T_{i,j} &= E^+_{i,j} - E^-_{i,j} && \text{for all $i$ and $j$} \tag4 \\ Z_{i,k,j} &\le X_{i,k} &&\text{for all $i,k,j$} \tag5 \\ Z_{i,k,j} &\le Y_{k,j} &&\text{for all $i,k,j$} \tag6 \\ Z_{i,k,j} &\ge X_{i,k} + Y_{k,j} - 1 &&\text{for all $i,k,j$} \tag7 \end{align} The linearized problem is to minimize $(2)$ subject to $(4)$ through $(7)$.

thanks. What is the name for solving this new linearized problem? Is it a linear programming problem? Does solving the linearized problem return X and Y separately or just Z? — ljk07, May 24 '22 at 02:15
It is a mixed integer linear programming (MILP) problem. The solver returns $X,Y,Z$. — RobPratt, May 24 '22 at 02:18

Binary matrix multiplication optimization problem

1 Answers1