
I came across a problem of the form $y = Hx + z \in \mathbb{R}^m$, where $z \in \mathbb{R}^m$ is a noise vector and $x \in \mathbb{R}^N$ is partially known. $H \in \mathbb{R}^{m \times N}$ can be regarded as a measurement matrix whose design is under our control. $y \in \mathbb{R}^m$ is the observation vector, which is also known.

$x$ has exactly $K$ nonzero entries, with $K \ll m \ll N$. Moreover, we know the values of those $K$ nonzero entries, denoted $\beta_1, \dots, \beta_K$, but we do not know their indices in $x$. The goal is to find those indices.
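For concreteness, here is a minimal NumPy sketch of this data model (Gaussian $H$, as in my own experiments; all dimensions here are just placeholders):

```python
import numpy as np

# A minimal instance of the model y = H x + z with K << m << N.
rng = np.random.default_rng(0)
N, m, K = 10_000, 1_000, 5          # placeholder dimensions

H = rng.standard_normal((m, N))     # measurement matrix (we control its design)
beta = rng.standard_normal(K)       # the K known nonzero values
support = rng.choice(N, size=K, replace=False)  # the unknown indices to recover

x = np.zeros(N)
x[support] = beta                   # known values at unknown positions

z = 0.01 * rng.standard_normal(m)   # noise
y = H @ x + z                       # known observation
```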

Is there any algorithm, e.g., some extended AMP-like scheme, that solves this kind of problem?


What I tried is to convert the problem into a convex optimization problem. That is, we rewrite the problem as $y = HP\beta + z$, where $y$, $H$, and $\beta = [\beta_1, \dots, \beta_K]^T$ are all known, and we require $P$ to satisfy: (1) each column has exactly one entry equal to $1$, with all other entries zero; (2) each row has at most one entry equal to $1$, with all other entries zero.

It turns out the convex hull of such matrices $P$ is easy to describe: $\{P \in \mathbb{R}^{N \times K} \mid P1 \leq 1,\ 1^T P = 1^T,\ P \geq 0 \}$. Such a $P$ can be regarded as the submatrix of a permutation matrix obtained by keeping its first $K$ columns. We then solve for a good $P$ by minimizing $\| y - HP\beta \|_2$, which is a convex objective. (A heuristic here is that we could discard most columns of $H$, because with high probability most columns are inactive, but I do not know how to take advantage of this.)
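A minimal sketch of this relaxation, assuming CVXPY with its default solver:

```python
import cvxpy as cp
import numpy as np

def solve_relaxation(H, y, beta):
    """Minimize ||y - H P beta||_2 over the relaxed constraint set
    {P >= 0, every column sums to 1, every row sums to at most 1}."""
    N, K = H.shape[1], len(beta)
    P = cp.Variable((N, K), nonneg=True)
    constraints = [cp.sum(P, axis=0) == 1,  # 1^T P = 1^T (columns)
                   cp.sum(P, axis=1) <= 1]  # P 1 <= 1    (rows)
    prob = cp.Problem(cp.Minimize(cp.norm(y - H @ P @ beta, 2)), constraints)
    prob.solve()
    return P.value
```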

The above gives us a convex problem. However, I found that even for $H\in \mathbb{R}^{10^3 \times 10^4}$ (i.e., not a very large scale), the solver is extremely slow, and personally I do not think convex optimization is the way to go.

1 Answer

One could use accelerated projected gradient descent to solve this (maybe even a projected Newton method). To be fast, one needs an efficient method to project onto the convex hull of permutation matrices. See Orthogonal Projection onto the Convex Hull of Permutation Matrix.

Yet, when I come to think about it, I am not sure such a $\boldsymbol{P}$ will be good, as it will most likely yield something like:

$$ \boldsymbol{P} = \begin{bmatrix} 0.8 & 0.0 \\ 0.0 & 1.0 \\ 0.2 & 0.0 \end{bmatrix} $$

This requires further processing to yield a proper (sub-)permutation matrix.
One solution is to set the maximum of each column to $1$ and zero all other values.
Since we use a projected procedure, this must happen at the end.
It also means there is no need for the exact projection; we can approximate it by alternating projections and be done with it.

So, I'd do something like:

  1. Run accelerated (FISTA-like) gradient descent.
  2. Approximate the actual projection by alternating projections.
  3. Once the gradient descent converges, apply quantization to $\boldsymbol{P}$ to set the solution (see the sketch below).
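Here is a minimal NumPy sketch of that recipe, under some assumptions of mine: a squared-loss objective, a fixed number of alternating-projection sweeps as the approximate projection, and the standard sorting-based simplex projection as the building block:

```python
import numpy as np

def project_simplex(v, s=1.0):
    """Euclidean projection of v onto {w >= 0, sum(w) = s} (sorting method)."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - s
    rho = np.nonzero(u - css / np.arange(1, len(v) + 1) > 0)[0][-1]
    return np.maximum(v - css[rho] / (rho + 1.0), 0.0)

def approx_project(P, num_sweeps=5):
    """Approximate projection onto {P >= 0, 1^T P = 1^T, P 1 <= 1}
    by alternating between the column and row constraint sets."""
    for _ in range(num_sweeps):
        P = np.apply_along_axis(project_simplex, 0, P)  # columns -> simplex
        for i in range(P.shape[0]):                     # rows -> {v >= 0, sum <= 1}
            if P[i].sum() > 1.0:
                P[i] = project_simplex(P[i])
    return P

def fista(H, y, beta, num_iter=200):
    """Accelerated projected gradient for min (1/2) ||H P beta - y||_2^2."""
    N, K = H.shape[1], len(beta)
    L = (np.linalg.norm(H, 2) * np.linalg.norm(beta)) ** 2  # gradient Lipschitz constant
    P = Q = np.full((N, K), 1.0 / N)   # feasible starting point
    t = 1.0
    for _ in range(num_iter):
        r = H @ (Q @ beta) - y
        G = np.outer(H.T @ r, beta)    # gradient with respect to P
        P_new = approx_project(Q - G / L)
        t_new = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        Q = P_new + ((t - 1.0) / t_new) * (P_new - P)
        P, t = P_new, t_new
    return P
```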

The issue with the above quantization is that it might place more than one $1$ in the same row, which is invalid for a submatrix of a permutation matrix.

A way to solve this is by iterating over the rows:
if a row includes more than one column maximum, keep the dominant one and zero the rest. This must be done iteratively, since each zeroed entry changes where the column maxima lie.

The way I did it was to find the maximum of the whole matrix,
then zero out the row and column of that maximum,
keeping the index of the maximum in a list, and iterate.
Since in each iteration the maximum must be found in a different column, it is enough to iterate as many times as there are columns.
Once done, I set the entries at the recorded indices to one.
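A sketch of that greedy rounding (I use $-\infty$ rather than literal zeros when clearing a row and column, so cleared entries can never be picked as a later maximum):

```python
import numpy as np

def greedy_quantize(P):
    """Round the relaxed solution: repeatedly take the global maximum,
    record its position, and clear its row and column.  K iterations
    suffice since each maximum must come from a distinct column."""
    W = P.astype(float)
    N, K = W.shape
    Q = np.zeros((N, K))
    for _ in range(K):
        i, j = np.unravel_index(np.argmax(W), W.shape)
        Q[i, j] = 1.0
        W[i, :] = -np.inf   # this row is used up
        W[:, j] = -np.inf   # this column is assigned
    return Q
```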

The full code is available on my StackExchange Mathematics GitHub Repository (Look at the Mathematics\Q4933604 folder).

Remarks:

  • The greedy algorithm might fail in cases where all columns become zero before the iterations are exhausted.
  • To obtain the exact $ \boldsymbol{P} $, one probably must employ some Integer Programming methods.
  • Another approach could be employing an OMP / LS-OMP style method (see Wikipedia - Matching Pursuit); a rough sketch of such a variant follows below.
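For the last remark, here is a heuristic sketch of a matching-pursuit-like variant that exploits the known values; the pairing rule is my own assumption, and this is not LS-OMP proper:

```python
import numpy as np

def known_values_mp(H, y, beta):
    """Greedy matching-pursuit-style search using the known values:
    repeatedly assign an unused value beta[j] to an unused column H[:, i],
    picking the pair that most shrinks the residual."""
    N = H.shape[1]
    r = y.astype(float)
    free_cols, free_vals = set(range(N)), set(range(len(beta)))
    support = {}                      # maps value index j -> column index i
    while free_vals:
        best = None
        for j in free_vals:
            for i in free_cols:
                nrm = np.linalg.norm(r - beta[j] * H[:, i])
                if best is None or nrm < best[0]:
                    best = (nrm, i, j)
        _, i, j = best
        support[j] = i
        r = r - beta[j] * H[:, i]
        free_cols.remove(i)
        free_vals.remove(j)
    return support
```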
Royi
  • Thank you very much for your detailed answer.

    What I tried on my side is an iterative "decoding": run the optimization many times; each time the program returns a matrix $P$, convert its largest element to $1$ and zero out the other entries in the same row and column. However, in practice I don't think this works.

    – SouthChinaSeaPupil Jun 26 '24 at 07:56
  • I think the problem is that all the methods we have discussed so far do not use much of the structure of the ideal truncated permutation matrix $P$. $P$ has many crucial properties we do not yet know how to impose on the problem formulation. For example, all rows/columns of $P$ are orthogonal, which makes $P^TP$ and $PP^T$ diagonal matrices. Also, $N-K$ of the $N$ rows of $P$ are all zeros. – SouthChinaSeaPupil Jun 26 '24 at 07:58
  • @南洋小學生, Can you share some example data? – Royi Jun 26 '24 at 16:39
  • In fact, I do not have any. I just randomly generate the $H$ matrix with i.i.d. Gaussian entries. – SouthChinaSeaPupil Jun 27 '24 at 06:36