
I have a large block arrowhead matrix which has significant sparsity in the following pattern:

$\mathbf{M} = \left( \begin{array}{c|c} \mathbf{A} & \mathbf{B}^{\top}\\ \hline \mathbf{B} & \mathbf{D} \end{array} \right) = \left( \begin{array}{cccc|c} \mathbf{A}_1 & & & & \mathbf{B}_1^{\top} \\ & \mathbf{A}_2 & & & \mathbf{B}_2^{\top} \\ & & \ddots & & \vdots \\ & & & \mathbf{A}_n & \mathbf{B}_n^{\top} \\ \hline \mathbf{B}_1 & \mathbf{B}_2 & \dots & \mathbf{B}_n & \mathbf{D} \end{array} \right)$

where $\det(\mathbf{M}) \neq 0$, $\det(\mathbf{A}) \neq 0$, and $\det(\mathbf{D}) \neq 0$. $\mathbf{A}$ is a block-diagonal square matrix comprising the blocks $\mathbf{A}_1, \mathbf{A}_2, \dots, \mathbf{A}_n$, which are all invertible square matrices. $\mathbf{D}$ and $\mathbf{M}$ are square matrices as well.

My objective is to find an efficient technique for computing $\mathbf{M}^{-1}$, which is a well-known problem. But, for some reason, I do not obtain the correct inverse. Any help in figuring out my mistake would be hugely appreciated.

I do the following to compute the matrix inverse:

$\mathbf{M}^{-1} = \begin{pmatrix} \mathbf{A}^{-1} + \mathbf{A}^{-1}\mathbf{B}^{\top}(\mathbf{D} - \mathbf{B}\mathbf{A}^{-1}\mathbf{B}^{\top})^{-1}\mathbf{B}\mathbf{A}^{-1} & -\mathbf{A}^{-1}\mathbf{B}^{\top}(\mathbf{D} - \mathbf{B}\mathbf{A}^{-1}\mathbf{B}^{\top})^{-1} \\ -(\mathbf{D} - \mathbf{B}\mathbf{A}^{-1}\mathbf{B}^{\top})^{-1}\mathbf{B}\mathbf{A}^{-1} & (\mathbf{D} - \mathbf{B}\mathbf{A}^{-1}\mathbf{B}^{\top})^{-1} \end{pmatrix}$

since the Schur complement of $\mathbf{A}$ in $\mathbf{M}$ also appears to be invertible. For now I do not compute $\mathbf{A}^{-1}$ efficiently, although this could be done by inverting the blocks $\mathbf{A}_1, \mathbf{A}_2, \dots, \mathbf{A}_n$ individually, since $\mathbf{A}$ is block diagonal.
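For concreteness, here is a minimal NumPy sketch of this computation (the names `arrowhead_inverse`, `A_blocks`, `B`, `D` are mine, not from any particular library; the block-wise inversion of $\mathbf{A}$ is included since it is a one-liner):

```python
import numpy as np
from scipy.linalg import block_diag

def arrowhead_inverse(A_blocks, B, D):
    """Invert M = [[A, B^T], [B, D]] via the Schur complement of A,
    where A = blkdiag(A_1, ..., A_n) and B = [B_1 ... B_n]."""
    # A is block diagonal, so its inverse is just the block-wise inverses.
    A_inv = block_diag(*[np.linalg.inv(Ai) for Ai in A_blocks])
    # Schur complement of A in M, assumed invertible.
    S = D - B @ A_inv @ B.T
    S_inv = np.linalg.inv(S)
    # Assemble the four blocks of M^{-1} from the formula above.
    top_left = A_inv + A_inv @ B.T @ S_inv @ B @ A_inv
    top_right = -A_inv @ B.T @ S_inv
    bottom_left = -S_inv @ B @ A_inv
    return np.block([[top_left, top_right],
                     [bottom_left, S_inv]])
```

A sanity check is then `np.allclose(M @ M_inv, np.eye(M.shape[0]))`, which corresponds to the comparison with $\mathbf{I}$ described below.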

However, after computing the inverse in this manner, I get $\mathbf{M}\mathbf{M}^{-1} \neq \mathbf{I}$ and $\mathbf{M}^{-1}\mathbf{M} \neq \mathbf{I}$, where $\mathbf{I}$ is the identity matrix. In particular, I notice some very large values in the last few rows of $\mathbf{M}\mathbf{M}^{-1}$. Clearly, I am doing something wrong.

Please help me identify my mistake. Thanks in advance.

Audrey
  • Is there a good reason to compute the inverse? In general, the inverse of a sparse matrix is not sparse, so you are losing the benefit of working with sparse matrices. – Jean-Claude Arbaut Aug 25 '21 at 07:52
  • @Jean-ClaudeArbaut: this happens to be an approximate Hessian of an NLS problem – Audrey Aug 25 '21 at 08:13
  • It could be an ill-conditioned matrix. Check with matlab/octave `cond(M)`. https://au.mathworks.com/matlabcentral/answers/266467-when-is-a-matrix-ill-conditioned – arthur Aug 25 '21 at 08:18
  • @arthur: typical condition number of M is 1, hence well-conditioned, and appears to be full rank – Audrey Aug 25 '21 at 08:30
  • Ah, well, thanks to @arthur, I managed to figure out the problem: the matrix M is indeed ill-conditioned under certain circumstances (not always), and it is in those ill-conditioned cases that the inversion fails. Well-conditioned instances are inverted fine. Thanks for the help, that solves the issue (a minimal conditioning check is sketched below this thread). – Audrey Aug 25 '21 at 10:02
  • Note also that $(M^{-1})_{2,1}$ should be $-(\mathbf{D} - \mathbf{B}\mathbf{A}^{-1}\mathbf{B}^{\top})^{-1}\mathbf{B}\color{red}{\mathbf{A}^{-1}}$, but I guess it's only a LaTeX typo. – Jean-Claude Arbaut Aug 25 '21 at 11:25
  • @Jean-ClaudeArbaut: Yes, indeed a typo... rectified now. – Audrey Aug 25 '21 at 16:12
  • @AudreyFridean Why do you need the inverse of this approximate Hessian? This smells a bit like an XY problem. – Hyperplane Aug 25 '21 at 19:59
  • @Hyperplane: sure, I guess I should have been clearer... the Hessian appears while minimising a cost using non-linear least squares, with a Levenberg-Marquardt-like method. Would you say there's some way to completely avoid the inversion? – Audrey Aug 27 '21 at 10:03
  • @AudreyFridean The Levenberg-Marquardt method I am familiar with doesn't need the Hessian, only the Jacobian. But in any case, in optimization methods like Newton's method, which on paper reads $x' = x - [\nabla^2 f(x)]^{-1}\nabla f(x)$, in practice $[\nabla^2 f(x)]^{-1}$ is never explicitly computed. Rather, one solves the linear system $\nabla^2 f(x)\,\Delta x = -\nabla f(x)$ for $\Delta x$ instead. Here, you can take full advantage of the sparsity, since it makes Hessian-vector products fast (a solve-based sketch follows this thread). – Hyperplane Aug 27 '21 at 15:59
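Following @arthur's suggestion above, a NumPy version of that conditioning check might look like the following (a minimal sketch; `conditioning_report` and its arguments are my names, reusing the block layout from the question):

```python
import numpy as np

def conditioning_report(M, A_inv, B, D):
    """Print the condition numbers that govern inversion accuracy."""
    S = D - B @ A_inv @ B.T  # Schur complement of A in M
    # Rule of thumb: roughly log10(cond) significant digits are lost
    # when the corresponding matrix is inverted in floating point.
    print("cond(M) =", np.linalg.cond(M))
    print("cond(S) =", np.linalg.cond(S))
```

If either number is large, the explicit inverse (and hence the $\mathbf{M}\mathbf{M}^{-1}$ check) cannot be trusted, which matches the failure mode described above.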
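And to make @Hyperplane's last point concrete, here is a hedged sketch of a solve-based step that never forms $\mathbf{M}^{-1}$: block elimination with the same Schur complement, applied to a right-hand side (again, all names are mine):

```python
import numpy as np
from scipy.linalg import block_diag

def arrowhead_solve(A_blocks, B, D, rhs):
    """Solve M z = rhs by block elimination, never forming M^{-1},
    where M = [[A, B^T], [B, D]] and A = blkdiag(A_1, ..., A_n)."""
    p = sum(Ai.shape[0] for Ai in A_blocks)
    r1, r2 = rhs[:p], rhs[p:]
    # Applying A^{-1} is cheap because A is block diagonal.
    A_inv = block_diag(*[np.linalg.inv(Ai) for Ai in A_blocks])
    y = A_inv @ r1
    # Eliminate the first block row: small Schur system for z2.
    S = D - B @ A_inv @ B.T
    z2 = np.linalg.solve(S, r2 - B @ y)
    # Back-substitute for z1.
    z1 = y - A_inv @ (B.T @ z2)
    return np.concatenate([z1, z2])
```

A Newton- or Levenberg-Marquardt-style step would then be `dx = arrowhead_solve(A_blocks, B, D, -grad)`, so only the small dense Schur system is ever solved against.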
