6

So I have a statistical learning algorithm in which D is a diagonal matrix that changes each iteration while A stays the same. I'm looking for a fast way to invert ADA' each iteration which ends up being a .9 million by .9 million sized matrix.

A is m by n with m < n.

My thoughts have been drawn to doing an economic SVD on A to get A=SVU' (and V ends up being a square diagonal matrix) at which point I only need to worry about inverting the inner U'DU term and U'*U=I, I feel like there should be something possible but I can't figure it out.

Any ideas?

some additional notes: A is fairly sparse, preprocessing things like the SVD of A can take as long as necessary...etc

MATLAB code that i've tested to show that the suggested approach doesn't work (unsure how to format this):

rows=90; cols=120; G=rand(rows,cols); [X,Y,Z]=svd(G,'econ'); A=Z'; %the above was just to generate A s.t. A'A=I, as in second paragraph above d=rand(1,cols); D=diag(d); M=A*D*A'; Minv=M^(-1) %something to compare with %[U,E,V]=svd(A); %also tried this, it didn't work either [U,E,V]=svd(A,'econ'); Einv=E'; Einv(1:rows,1:rows)=diag(1./diag(E)); %calculate inverse of E Ap=V*Einv*U'; Minv2=Ap'*D^(-1)*Ap; max(max((Minv-Minv2).^2)) %did it work? (no)

  • 1
    Is $ADA^{T}$ positive definite? How sparse is $ADA^{T}$? Can the matrix be reordered to reduce fill-in in the factors? Do all of the entries in $D$ change in each iteration or just a few? Do you actually need the inverse of $ADA^{T}$ or just to solve systems of equations involving this matrix? Problems like these are dealt with in interior point methods for linear programming- there's a large literature on that. – Brian Borchers Jan 16 '19 at 22:29
  • @BrianBorchers its for the algorithm from here: https://arxiv.org/pdf/1702.04300.pdf so its a convex optimization problem rather than a system of linear equations though it does have a closed form solution which is why I'm trying to invert $ADA^{T}$. I don't think $ADA^{T}$ is either PSD or sparse... – Charles Hernandez Jan 16 '19 at 23:07
  • If the entries in $D$ are positive than $ADA^{T}$ will be positive definite. If entries in $D$ can be negative then the matrix might be indefinite. – Brian Borchers Jan 16 '19 at 23:18
  • Just an immediate observation, the A in your code is not mxn, but is nxn and orthogonal instead. Is that intended? – Klaas van Aarsen Jan 17 '19 at 09:36
  • @IlikeSerena I just tested again to verify, it works correctly. You'd be right if it was a normal SVD, but Its an economic SVD at the start so for an m by n matrix with m<n you get 3 matrices, (mxm)(mxm)(nxm) and i'm setting A to to transpose of that nxm matrix. note: this is just so that the columns of A are orthagonal to see if that has any effect, the code does the same thing if you just set A to be a random mxn matrix. – Charles Hernandez Jan 17 '19 at 15:51
  • My apologies, my solution does not work. I have updated my answer with an explanation. – Klaas van Aarsen Jan 18 '19 at 22:35

1 Answers1

2

How about: $$(ADA')^{-1} = {A^+}' D^{-1} A^+$$ where $A^+$ is the Moore-Penrose pseudoinverse?

Note that if $A=U\Sigma V'$ is the economic SVD, then $A^+=V\Sigma^{-1}U'$.

Btw, I'm assuming that your matrices are real, that $A$ is of rank $m$, and $D$ of rank $n$.


Edit: Turns out that this doesn't work. I don't have a working solution (yet), only the following observations.

Let $\boxed{\quad A\phantom'\quad}=\boxed{U\phantom'}\,\boxed{\Sigma\phantom'}\,\boxed{\quad V'\quad}$ be the economic SVD.

Then we can write the full SVD as $\boxed{\quad A\phantom'\quad}=\boxed{U\phantom'}\,\boxed{\Sigma\phantom'\quad 0\phantom'}\begin{array}{|cc|}\hline V'\\ \hline \quad\! N'\!\!\quad\\\hline\end{array}$.

Moreover, the columns of $U$ form the column space of $A$. And the columns of $N$ form the null space of $A$.

Consequently we can write either: $$(ADA')^{-1} = (U\Sigma V'DV\Sigma' U')^{-1}=U\Sigma^{-1}(V'DV)^{-1}\Sigma^{-1}U'$$ or: $$(ADA')^{-1} = \left(U(\Sigma\mid0)\binom{V'}{N'}D(V\mid N)\binom{\Sigma'}{0} U'\right)^{-1} = U\left((\Sigma\mid0)\binom{V'}{N'}D(V\mid N)\binom{\Sigma}{0}\right)^{-1}U'$$ In the second form we can say that: $$\left(\binom{V'}{N'}D(V\mid N)\right)^{-1} = \binom{V'}{N'}D^{-1}(V\mid N) $$ However, unfortunately we cannot leave out the $N$ submatrices and we have: $$(V'DV)^{-1}\ne V'D^{-1}V $$ This is why my original suggestion does not work.

  • Ignore that comment, didn't realize there was an edit deadline.

    So if you do that you would get

    $(ADA') ({A^+}' D^{-1} A^+)=(U \Sigma V' D V \Sigma U') \times (U \Sigma^{-1} V' D^{-1} V \Sigma^{-1} U')$

    so the U' and U cancels as do the $\Sigma$'s but I believe $V V' \neq I$ since V is n by m with n>m. i.e. you'd need V to have n orthagonal vectors of size m, no?

    – Charles Hernandez Jan 17 '19 at 01:09
  • Yes. U is unitary, $\Sigma$ is square and diagonal, and with the full unitary V we can invert VDV'. – Klaas van Aarsen Jan 17 '19 at 01:12
  • That is, consider the full SVD to complete the multiplication where V is unitary. Afterwards we can discard the part of V again that corresponds to the null space of A. – Klaas van Aarsen Jan 17 '19 at 01:18
  • Sorry if i'm making this harder than it has to be....

    I'm not sure what your last comment is implying. Is that the definition of economic SVD? or are you saying something more nuanced.

    – Charles Hernandez Jan 17 '19 at 01:24
  • U is exactly the column space of A. In a full SVD, V contains the column space of A' plus the null space of A'. In an economic SVD that null space is left out since it serves no purpose. But with it V is unitary, which facilitates a proof. – Klaas van Aarsen Jan 17 '19 at 01:44
  • Even if $A$ is sparse the SVD may not be sparse. This means to multiply by the inverse will require as much work as solving the original linear system and not be as stable (assuming the OP wants to solve a system and not literally compute the inverse) – overfull hbox Jan 17 '19 at 02:42
  • per your edit: Yeah thats what i was trying to say in my first comment, $V V' \neq I$. I've been thinking about this alot and I'm not sure that its possible. I can't even image a decomposition that would work. Because of the shape of the matrix ('wide') if you decompose A into several matrices you will always have to have one of those matrices also be 'wide'. That wide matrix ruins the piecewise inversion because it has no left pseudoinverse which is what you need for the A' term of $ADA'$ to cancel out. – Charles Hernandez Jan 19 '19 at 03:07