
I'm trying to make a neural network with a dynamic architecture, meaning that weight matrices will be added or removed dynamically when they are simple permutation matrices and thus don't actually do any computation. However, I've run into an issue: I need a differentiable way to determine "how close" a weight matrix is to being a permutation matrix. This measure will then be added to the loss function as a regularization term.

I know how to find the closest permutation matrix by constructing a cost matrix, as in one of the answers here: Nearest signed permutation matrix to a given matrix $A$

However, if I use the norm of the cost matrix as a measure of "permutivity", the zero matrix will always give a lower measure (loss) than an actual permutation matrix. Whenever I try to find or come up with a measure, I always run into this same problem: there's always some other matrix that scores lower than a permutation matrix.
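For reference, finding the nearest permutation matrix in Frobenius norm reduces to a linear assignment problem (since $\|A-P\|_F^2 = \|A\|_F^2 + m - 2\langle A,P\rangle$), which SciPy's Hungarian-algorithm solver handles directly. A minimal sketch, assuming `A` is a square NumPy array:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def nearest_permutation(A):
    """Nearest permutation matrix to A in Frobenius norm.

    Minimizing ||A - P||_F^2 over permutation matrices P is equivalent
    to maximizing sum_ij A_ij * P_ij, a linear assignment problem.
    """
    rows, cols = linear_sum_assignment(-A)  # negate to maximize
    P = np.zeros_like(A, dtype=float)
    P[rows, cols] = 1.0
    return P
```

This gives the nearest matrix itself, but (as noted above) the distance to it is not directly usable as a differentiable "permutivity" loss.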

Does anyone know of a "permutation matrix measure", or do you have any ideas on how to construct one? Keep in mind that it has to be differentiable to be useful, so no max functions or anything like that.

Update: I have found one measure that does what I need, but it's too computationally heavy to use. I'll write it here anyway, since it shows what I'm looking for and might spawn new ideas:

$$ \text{"permutivity"} = \prod_{i=1}^{m!} \sum_{j=1}^{m}\sum_{k=1}^{m} \left(a_{j,k}-(p_i)_{j,k}\right)^2 $$

where $p_i$ is the $i$-th permutation matrix of size $m \times m$; so essentially the product of the squared $L^2$ distances to all the different permutation matrices. As you can imagine, this is not practical, for several reasons. First, the number of permutation matrices is $m!$, and for neural networks $m$ is usually fairly large. Second, the product would explode, causing numerical overflow when trying to calculate it. The second issue can be remedied with some sort of "cap" function around the sums, but I can't see any simple way to fix the first problem :(
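To make the cost concrete, here is a brute-force sketch of this measure (hypothetical helper name), feasible only for tiny $m$:

```python
import itertools
import numpy as np

def permutivity_bruteforce(A):
    """Product of squared Frobenius distances from A to every m x m
    permutation matrix. Zero iff A is itself a permutation matrix,
    but costs O(m! * m^2) and the product over- or underflows fast."""
    m = A.shape[0]
    total = 1.0
    for perm in itertools.permutations(range(m)):
        P = np.zeros((m, m))
        P[np.arange(m), perm] = 1.0  # build the permutation matrix
        total *= np.sum((A - P) ** 2)
    return total
```

Summing logarithms of the distances instead of multiplying them would tame the overflow, but the $m!$ loop remains.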

  • Perhaps you could replace the minimum with which you would find the nearest matrix with a smooth minimum – Ben Grossmann Mar 19 '22 at 20:18
  • @BenGrossmann Very interesting idea, though I don’t think it would work, since the Hungarian algorithm used to find the closest permutation matrix via the cost matrix relies on the subtraction of the minimum value being zero, and it only finds one of all possible matrices; so using the smooth minimum would still only go towards the closest matrix, but ”smoothly”. But it could possibly be used in other ways :) I’ve come up with one measure that actually works, but it’s too computationally heavy to use. I’ll post it in the question though, since it’s technically a solution. – Beacon of Wierd Mar 19 '22 at 22:49
  • Text in equations should be written using \text{...} so that it isn't typeset as a sequence of variables. For italicized text in equations, you can use \mathit{...}. As you can see, just writing the text as variables gives it the wrong spacing. – joriki Mar 20 '22 at 06:26

1 Answer


How about

$$ \lambda\sum_{i,j}a_{ij}^2\left(1-a_{ij}\right)^2+\mu\left(\sum_i\left(\sum_ja_{ij}-1\right)^2+\sum_j\left(\sum_ia_{ij}-1\right)^2\right) $$

This is $0$ only if all entries are $0$ or $1$ and all row and column sums are $1$, i.e. if $A$ is a permutation matrix.
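A direct NumPy sketch of this penalty (any autodiff framework expresses it the same way, since it is built purely from sums and products):

```python
import numpy as np

def permutation_penalty(A, lam=1.0, mu=1.0):
    """Regularizer that is zero exactly when A is a permutation matrix."""
    entries = np.sum(A**2 * (1.0 - A)**2)      # pushes every entry to 0 or 1
    rows = np.sum((A.sum(axis=1) - 1.0)**2)    # each row should sum to 1
    cols = np.sum((A.sum(axis=0) - 1.0)**2)    # each column should sum to 1
    return lam * entries + mu * (rows + cols)
```

The two coefficients $\lambda$ and $\mu$ trade off the entry-wise 0/1 condition against the doubly-stochastic condition.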

joriki
  • This one looks like it would work great :D I’ve been trying a lot of things like the second part (getting the row and column sums equal to 1), but I never thought about simply adding the 0/1 condition as another sum, it’s genius :D Thank you, and it’s super easy to compute :) Now I just need to find good values for the coefficients so there are no local minima in between permutation matrices and matrices that only satisfy the 0/1 or row/column-sum conditions :) – Beacon of Wierd Mar 20 '22 at 10:03