Looking for an algo to "sorta" diagonalize a similarity matrix.

Question

I've got a big fat similarity matrix. The rows and columns represent people, and the values represent some positive measure of their closeness (0 meaning no connection at all). The n-th row and n-th column correspond to the same person, so the matrix is square.

I'm looking for an algorithm to find some permutation of rows/columns such that the resulting matrix has as much "mass" aligned towards the diagonal as possible. The goal is to find a matrix in which the average closeness of neighbors (neighboring rows/columns) is maximized.
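
A minimal sketch of how this objective could be scored, assuming the matrix is a list of lists and that the same ordering is applied to rows and columns. The names `neighbor_score` and `permute` are illustrative, not taken from any library.

```python
def neighbor_score(sim, order):
    """Average similarity between consecutive people in `order`."""
    total = sum(sim[a][b] for a, b in zip(order, order[1:]))
    return total / (len(order) - 1)


def permute(sim, order):
    """Apply the same ordering to rows and columns of `sim`."""
    return [[sim[i][j] for j in order] for i in order]


# Toy symmetric similarity matrix for four people.
sim = [
    [0, 1, 9, 0],
    [1, 0, 0, 8],
    [9, 0, 0, 2],
    [0, 8, 2, 0],
]

identity = [0, 1, 2, 3]  # original ordering
better = [0, 2, 3, 1]    # puts strongly connected people next to each other

score_identity = neighbor_score(sim, identity)
score_better = neighbor_score(sim, better)
reordered = permute(sim, better)
```

With this definition, the search problem is to find the `order` that maximizes `neighbor_score`; the reordered matrix then carries its largest entries on the first off-diagonals.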

The ultimate goal is to use this as a sort of clustering algorithm.

Look on Google for "Extremal_optimization". Your problem can be tackled with this algorithm, but extremal optimization is not deterministic, so it doesn't always lead to the best possible solution. — Frank, Sep 17 '14 at 21:38
I'm working on a stochastic algorithm for your problem, since it interested me. Are you interested in approximate but fast methods, or only in an exact solver? — Frank, Sep 17 '14 at 21:45
Nice question! Presumably the matrix is symmetric as well? More specifically, the entry in position $i,j$ is the same as that in $j,i$? — Stephen, Sep 17 '14 at 22:23
Also, is there any sort of "triangle inequality" relating the numbers appearing in positions like $(1,2)$ and $(2,3)$ with that appearing in position $(1,3)$? In real world applications there probably will be such relations. — Stephen, Sep 17 '14 at 22:25
@FrancescoAlem. An exact solution is better, but if this ends up being an NP-hard problem, then I'll settle for an algorithm that gets me close. — John Berryman, Sep 18 '14 at 02:13
@Stephen - so in other words, "can the similarity be used as a metric, so that the space of 'people' is a metric space?" I've been wondering that myself. The matrix is symmetric for now, though I would like to loosen that for the case where A follows B but B does not follow A. — John Berryman, Sep 18 '14 at 02:15
There is a caveat that shows my question is not quite ideally posed: what if cliques A, B, and C are all equally "close" to one another? In this case you can still find a matrix that is optimal according to the specifications above, but it wouldn't be useful in identifying groups, because I suspect two of these groups would be interleaved so that the optimization criterion is met. I guess it depends on what criterion you use. In any case, the above definition assumes that group members can basically be placed on a line. — John Berryman, Sep 18 '14 at 02:22
@JohnBerryman, got it :) I managed to formalize it on paper, and now I'm making a prototype in C++ to see how it performs in a live scenario! If it turns out well, I'll write an answer, hoping you like it. — Frank, Sep 18 '14 at 12:06
Unfortunately it breaks down once it reaches 1500x1500 dense matrices, and you probably need to use much, much bigger ones... Also, I tried to implement Extremal Optimization, and it doesn't work very well, so I've settled for a greedy algorithm, but I'm not sure of the quality of the permutation it gives... So if you want, I can give you a slow and 'not sure to actually work' algorithm... — Frank, Sep 20 '14 at 13:55
Ha. Thanks for trying @FrancescoAlem. I've been thinking about it and the problem might generalize to a Minimum Spanning Tree problem. As stated here - rearranging rows in a matrix implies building a minimal graph in which each node has one or two neighbors. An MST imposes a hierarchy on the members, which is more fitting since I'm hoping to group them. — John Berryman, Sep 20 '14 at 19:39
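
A rough sketch of the spanning-tree idea, under some assumptions that are not in the comment above: because the values are similarities rather than distances, the sketch keeps the strongest edges (a maximum spanning tree, equivalent to a minimum spanning tree on negated weights), and it uses Kruskal's algorithm with a union-find, stopping early once `n_groups` components remain. The function name `spanning_tree_groups` and the toy data are hypothetical.

```python
def spanning_tree_groups(sim, n_groups=1):
    """Kruskal on similarities, strongest edges first.

    Stops once `n_groups` connected components remain and returns
    (tree_edges, groups). With n_groups=1 the edges form a maximum
    spanning tree, provided the similarity graph is connected.
    """
    n = len(sim)
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Only pairs with a positive similarity are treated as edges.
    edges = sorted(
        ((sim[i][j], i, j) for i in range(n) for j in range(i + 1, n) if sim[i][j] > 0),
        reverse=True,
    )
    tree, components = [], n
    for w, i, j in edges:
        if components <= n_groups:
            break
        ri, rj = find(i), find(j)
        if ri != rj:  # edge joins two different components
            parent[ri] = rj
            tree.append((i, j, w))
            components -= 1

    groups = {}
    for v in range(n):
        groups.setdefault(find(v), []).append(v)
    return tree, sorted(groups.values())


# Toy data: two tight triangles joined by one weak link (2-3).
n = 6
sim = [[0] * n for _ in range(n)]
for i, j, w in [(0, 1, 9), (0, 2, 8), (1, 2, 7), (3, 4, 9), (3, 5, 8), (4, 5, 7), (2, 3, 1)]:
    sim[i][j] = sim[j][i] = w

full_tree, one_group = spanning_tree_groups(sim, n_groups=1)
partial_tree, two_groups = spanning_tree_groups(sim, n_groups=2)
```

Stopping early is the same as deleting the weakest edges of the full tree, so the weak 2-3 link is what separates the two groups here. Unlike a row ordering, the tree does not force each person to have at most two neighbors.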
That's good news, there are lots of fast algorithms to tackle MSTs! If something comes up, please let me know! This problem is fascinating! — Frank, Sep 20 '14 at 23:06
What do you know; I'm working on a similar question and have stumbled upon this. Have you made any progress since 2014? I started with swapping random rows/columns to increase the "neighbor similarity score", but this ends up stuck when several strong local groups form. Also, there is no particular benefit to a 1-dimensional structure here; maybe we can try 2-dimensional neighbor structures to make the result more flexible. Looks a bit like Self-Organizing Maps to me! — PA6OTA, Dec 27 '18 at 01:56
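
For reference, the random-swap approach described in the last comment might look roughly like the sketch below. It is only a guess at that procedure: the score is assumed to be the average similarity of adjacent rows, a swap is kept when the score does not decrease, and the names `random_swap_search`, `n_iter`, and `seed` are made up for illustration.

```python
import random


def neighbor_score(sim, order):
    """Average similarity between consecutive people in `order`."""
    return sum(sim[a][b] for a, b in zip(order, order[1:])) / (len(order) - 1)


def random_swap_search(sim, n_iter=10000, seed=0):
    """Hill climbing: swap two random positions, keep the swap if the score does not drop."""
    rng = random.Random(seed)
    order = list(range(len(sim)))
    best = neighbor_score(sim, order)
    for _ in range(n_iter):
        i, j = rng.sample(range(len(order)), 2)
        order[i], order[j] = order[j], order[i]
        score = neighbor_score(sim, order)
        if score >= best:
            best = score
        else:
            order[i], order[j] = order[j], order[i]  # undo the swap
    return order, best


# Toy symmetric similarity matrix for four people.
sim = [
    [0, 1, 9, 0],
    [1, 0, 0, 8],
    [9, 0, 0, 2],
    [0, 8, 2, 0],
]
start_score = neighbor_score(sim, list(range(len(sim))))
order, best = random_swap_search(sim, n_iter=500)
```

Because a single swap moves only two people, a block that is already internally well ordered cannot be relocated as a whole without first lowering the score, which is consistent with the search getting stuck once strong local groups have formed.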

0 Answers