
I have a system of $m$ linear equations in $n$ variables, which I'm representing as $Ax=b$, with $A$ an $m\times n$ matrix representing the equations and $b\in\mathbb R^m$ a vector representing the constants of the equations. I'm given that there exists a solution $s\in\mathbb R^n$ (i.e., $As=b$) and that $m\ll n$.

My goal is the following:

  1. Find which variables are determined (i.e., for which no solution takes a value different from the one in $s$).
  2. For each variable in 1., find a minimal subsystem of equations which determines that variable. By minimal, I mean the following: the subsystem determines the variable as in 1., and no proper subset of the subsystem does this. I want to find any such minimal subsystem: which one I pick is irrelevant to me.

Progress:

I've solved 1. by computing the nullspace of $A$ and looking for the variables whose coordinate is zero in every basis vector of the nullspace.
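For reference, here is a minimal NumPy/SciPy sketch of that check (the function name and tolerance are my own choices, and it assumes $A$ fits in memory as a dense array):

```python
import numpy as np
from scipy.linalg import null_space

def determined_variables(A, tol=1e-12):
    """Indices i such that x_i is zero in every nullspace vector,
    i.e. x_i takes the same value in every solution of Ax = b."""
    N = null_space(A)  # columns form an orthonormal basis of ker(A)
    if N.size == 0:    # trivial nullspace: every variable is determined
        return np.arange(A.shape[1])
    # x_i is determined iff row i of the nullspace basis is (numerically) zero
    return np.where(np.linalg.norm(N, axis=1) < tol)[0]
```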

I've attempted to solve 2. by computing the pseudo-inverse of $A$ and looking at which coefficients in the row corresponding to the chosen variable are non-zero. This approach doesn't work, though, since the pseudo-inverse $P$ minimizes the Frobenius norm among all matrices $X$ such that $Xb=s$. This norm is an $L_2$ norm, but what we really want is an $L_0$ "norm", since we want to minimize the number of non-zero entries in the matrix. Of course, finding the matrix minimizing the $L_0$ norm is NP-hard, so that's not a viable approach for me.
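For concreteness, the attempt looks like this in NumPy (a toy instance of my own; the threshold is arbitrary):

```python
import numpy as np

# Toy instance: x_1 and x_2 are determined, x_3 is free.
A = np.array([[1.0,  1.0, 0.0],
              [1.0, -1.0, 0.0]])
b = np.array([2.0, 0.0])

P = np.linalg.pinv(A)   # Moore-Penrose pseudo-inverse, shape n x m
s = P @ b               # minimum-norm solution, here (1, 1, 0)

i = 0                   # chosen determined variable x_1
support = np.nonzero(np.abs(P[i]) > 1e-12)[0]
# `support` lists the equations with non-zero coefficient in row i of P;
# because P minimizes the L2 (Frobenius) norm rather than the L0 "norm",
# this set is generically not a minimal subsystem.
print(support)
```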

Are there any other approaches I'm missing? Perhaps some heuristic methods? I'm struggling to find anything on this subject.

Rushabh Mehta
  • Good lord, I haven't seen this question somehow, and thanks to that meta post I got here. I'm just letting you know I retain an interest, and I'll get back when I can. Thanks and +1. – Sarvesh Ravichandran Iyer Sep 06 '21 at 11:49
  • @TeresaLisbon If you want, you can post a dummy answer and I'll give you the bounty. It'll go to waste either way. – Rushabh Mehta Sep 06 '21 at 14:38
  • I'm extremely sorry that I don't even have enough for a dummy answer. What I will promise you is my thoughts on this problem, possibly after the question expires, although there is a grace period so I'll try to get it in before then hopefully! – Sarvesh Ravichandran Iyer Sep 06 '21 at 14:48
  • @TeresaLisbon Would it be alright to indicate on meta that if someone answers this question, I'd offer a 100 point bounty? I don't want to pay for another bounty without any attention again. – Rushabh Mehta Sep 08 '21 at 12:08
  • I will do the job, although I might take some time. I feel I'm qualified enough to research and provide an answer to you, and I've been alerted to this question thanks to your earlier bounty, so I will definitely prioritize your question over a few others. I'll try to get it done soon, you can avoid a second bounty. – Sarvesh Ravichandran Iyer Sep 08 '21 at 12:10
  • @TeresaLisbon Thank you so much! – Rushabh Mehta Sep 08 '21 at 12:11
  • Welcome! I wanted to add some initial thoughts but I haven't read the question enough to get this out of the way. The question is clear enough, so I'll be writing an answer directly. – Sarvesh Ravichandran Iyer Sep 08 '21 at 12:12
  • I might be missing something... but why wouldn't the simple approach of "keep removing one equation at a time" work? I mean, choose one single variable that is determined by the full system of equations. Remove the first equation and see if the variable is determined by the smaller system. If it is, continue with this smaller system. Otherwise, try to remove the next equation instead. After $m$ tests (each of them corresponds to solving subproblem 1 for some system of equations), you will have a system in which removal of any single equation results in the variable no longer being determined. – Peter Košinár Sep 09 '21 at 23:31 (a sketch of this procedure appears after the comments)
  • @PeterKošinár It works, but it's very computationally expensive. A rough time estimate would be $O(n\cdot m^2\cdot m^2n)=O(n^2m^4)$, which is awful. With my data, in which $m\sim10^6$ and $n\sim10^9$, it's just not possible. – Rushabh Mehta Sep 10 '21 at 01:51
  • I'm not convinced with my answer, sorry. I've been working on this for some time, but, in my opinion, I've not reached anything which satisfies me. I want to know if you will accept a partial answer, and I will try delivering what I can. – Sarvesh Ravichandran Iyer Sep 19 '21 at 07:25
  • @TeresaLisbon Yes, any work is useful. – Rushabh Mehta Sep 19 '21 at 16:04
  • Thanks, will try to write one up soon! – Sarvesh Ravichandran Iyer Sep 19 '21 at 16:04
  • @TeresaLisbon Added another bounty to keep you motivated :) – Rushabh Mehta Sep 29 '21 at 14:35
  • @DonThousand Thank you once again, and I'm sorry to keep you waiting. I always take time over answers, even when I have them in place, so I'll try to get this one wrapped up when I can. – Sarvesh Ravichandran Iyer Sep 29 '21 at 23:39
  • I give up: I've tried hard to look at the matrix part of things, but everywhere I go I'm seeing only the minimization of the $L_0$ norm for the vector $s$, i.e. picking $s$ such that it has the least number of non-zero entries. Not once do I unfortunately see the minimization over matrices. I know that matrices are vectors, but it's still quite surprising that nobody considered the same problem as you. I found some randomized algorithms and $L_0$-regularization algorithms for the vector case: these could work for the matrix case, but I'm not sure, that's the problem. – Sarvesh Ravichandran Iyer Sep 30 '21 at 04:08
  • For example, it is known that changing $L_0$ to $L_p$ for $p \approx 0$ gives a feasible problem that is a non-linear program and has methods available. One can provide other regularizations as well ($L_1$, $\frac{L_2}{L_1}$, etc.). Next, one can "Bayesian-learn" the data by fitting it into a certain distribution, for which the $L_p$-regularized NLP runs quickly and converges to an $L_0$ global minimizer as $p \to 0$. This is what I could find with about 2-3 hours of work, and I'm unhappy that I couldn't bring more to the table. – Sarvesh Ravichandran Iyer Sep 30 '21 at 04:11
  • @TeresaLisbon No problem, I appreciate your taking the time to look at it. I was surprised to not find anything in the literature about this either. I might post this to Overflow if no one shows up in the bounty period. – Rushabh Mehta Sep 30 '21 at 12:16
  • Maybe it helps to think about the problem in a slightly different manner: It can be shown that $x_i$ is uniquely determined if and only if the unit vector $e_i \in \mathbb{R}^n$ is in the row span of $A$. (I can provide a proof if you're interested, but it is relatively straightforward to see) This way, your second problem can be solved by finding a minimum set of rows whose span contains $e_i$. I am personally not aware of an efficient way to do this, but maybe someone else does! – Andreas Lenz Oct 01 '21 at 14:54
  • @AndreasLenz That seems a bit more difficult to solve than my formulation. But thanks for the help! – Rushabh Mehta Oct 01 '21 at 15:08
  • I am sorry, I have not read all the comments, but have you considered an SVD of $A$? If I get it right, you are looking for the linear combination of singular vectors corresponding to non-zero singular values. – Surb Oct 03 '21 at 21:13
  • @Surb As I indicated in my question, for part 1, I solved it via computing the null space, which I do via the SVD. So if there's a way of solving part 2 with the SVD, I'm on board! – Rushabh Mehta Oct 03 '21 at 22:51
  • Just an idea for a heuristic approximation: What if instead of solving for the explicit least-squares minimizer (given by the pseudoinverse), you instead solve the ridge regression $\min_s \|As - b\|^2 + \lambda \|s\|^2$ (to obtain a "perturbed" pseudoinverse)? The thought here is that as you scale $\lambda$ higher, the free variables in $s$ should go to 0, and then the relations you're looking for might fall out from simple row operations on $A$? – Shil B. Oct 04 '21 at 17:38 (a sketch of this heuristic also appears after the comments)
  • @ShilB. Not a bad idea. I'll try to work on it after my job ends. – Rushabh Mehta Oct 04 '21 at 17:42
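Two of the suggestions above can be made concrete. First, Peter Košinár's greedy removal procedure, combined with Andreas Lenz's criterion that $x_i$ is determined iff $e_i$ lies in the row span of the subsystem. This is a dense sketch only (the function names and tolerance are mine); as noted in the comments, its cost is prohibitive at $m\sim10^6$, $n\sim10^9$, so it serves as a correctness baseline:

```python
import numpy as np

def determines(rows, i, tol=1e-9):
    """Andreas Lenz's criterion: the subsystem `rows` determines x_i
    iff appending e_i to its rows does not increase the rank."""
    e_i = np.zeros(rows.shape[1])
    e_i[i] = 1.0
    r = np.linalg.matrix_rank(rows, tol=tol)
    return np.linalg.matrix_rank(np.vstack([rows, e_i]), tol=tol) == r

def minimal_subsystem_greedy(A, i):
    """Peter Košinár's procedure: drop equations one at a time as long
    as x_i stays determined; the survivors form an inclusion-minimal
    subsystem (removing any single remaining row breaks determination)."""
    keep = list(range(A.shape[0]))
    for row in list(keep):
        trial = [r for r in keep if r != row]
        if trial and determines(A[trial], i):
            keep = trial
    return keep
```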
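Second, a sketch of Shil B.'s ridge heuristic. I have not verified that it isolates the relations; the damping values are arbitrary, and I use `scipy.sparse.linalg.lsqr`, whose `damp` parameter solves exactly the proposed problem $\min_s \|As-b\|^2 + \lambda^2\|s\|^2$:

```python
import numpy as np
from scipy.sparse.linalg import lsqr

def ridge_path(A, b, damps=(1e-3, 1e-2, 1e-1, 1.0)):
    """Solve the damped least-squares problem for increasing damping
    and report which coordinates of s are driven toward zero; the idea
    from the comments is that the free variables should vanish first."""
    for damp in damps:
        s = lsqr(A, b, damp=damp)[0]
        print(f"damp={damp:g}  near-zero coords: {np.where(np.abs(s) < 1e-6)[0]}")
```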

1 Answer

Of course, if $s_0$ is any solution then $s$ is a solution iff $A(s-s_0)=0$, and we can pass to the question of which rows of $A$ imply that some coordinate $x_i$ of every $x$ with $Ax=0$ is zero.

Suppose $Ax=0$ implies $x_1=0$. Set $x_1=1$ in the equations given by $A$; the resulting equations are incompatible. This is still a linear system, in one fewer variable, given by some $Mv=c$. (If $A$ had columns $v_1, ..., v_n$, then $c=-v_1$ and $M$ has columns $v_2, ..., v_n$.) By the Fredholm alternative, incompatibility means that some linear combination of the equations gives $0=1$, i.e. there is a subset of rows of $M$ whose span is lower-dimensional than the span of the corresponding rows of $A$. We are looking for a minimal incompatible subsystem, i.e. a minimal such collection of rows.

Start with any such collection, and suppose the corresponding rows of $M$ span a space $V$. Select from these rows a basis of $V$, and add more rows one by one until the dimension spanned by the corresponding rows of $A$ jumps. Then pick a subset that is a basis for the space spanned by these rows of $A$. At that point, we have a minimal collection of equations implying $x_1=0$. A rank-based sketch of this procedure follows.
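Here is my transcription of the procedure (a sketch only: it starts from the full system as the incompatible collection, uses `np.linalg.matrix_rank` for the dimension counts, and takes a general column index `i` in place of $x_1$):

```python
import numpy as np

def minimal_determining_rows(A, i, tol=1e-9):
    """Sketch of the procedure above, assuming Ax = 0 implies x_i = 0.
    M is A with column i deleted (the system after fixing x_i)."""
    M = np.delete(A, i, axis=1)

    def rank(X, rows):
        return np.linalg.matrix_rank(X[rows], tol=tol) if rows else 0

    m = A.shape[0]

    # Select rows forming a basis of V, the span of the rows of M.
    basis = []
    for r in range(m):
        if rank(M, basis + [r]) > rank(M, basis):
            basis.append(r)

    # Add more rows one by one until the dimension spanned by the
    # corresponding rows of A jumps above the dimension in M.
    rows = list(basis)
    for r in range(m):
        if rank(A, rows) > rank(M, rows):
            break
        if r not in rows:
            rows.append(r)

    # Keep a subset that is a basis for the span of these rows of A:
    # a minimal collection of equations implying x_i = 0.
    final = []
    for r in rows:
        if rank(A, final + [r]) > rank(A, final):
            final.append(r)
    return final
```

This is a dense illustration; in practice an incremental QR or LU factorization would avoid recomputing ranks from scratch at every step.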

Max