Following the excellent hints given by @Rodrigo de Azevedo, and writing $\|\cdot\|_F$ for the Frobenius norm (see the remark at the bottom), your problem is equivalent to :
$$\text{minimize} \ \|AP-B\|_F^2=\|AP\|_F^2-2\langle AP,B\rangle+\|B\|_F^2$$
$$\text{minimize} \ \|AP-B\|_F^2=\|A\|_F^2-2 \ \text{trace}(P^TA^TB)+\|B\|_F^2$$
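Here $\langle X,Y\rangle=\text{trace}(X^TY)$ is the Frobenius inner product (see https://en.wikipedia.org/wiki/Frobenius_inner_product), and $\|AP\|_F=\|A\|_F$ because a permutation matrix $P$ is orthogonal.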
As $\|A\|_F^2+\|B\|_F^2$ is constant, it remains to :
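$$\text{maximize} \ \text{trace}(P^TC) \ \text{with} \ C:=A^TB$$
on the group of permutation matrices $P$. Since $P^T$ is again a permutation matrix, this amounts to maximizing $\text{trace}(PC)$ over the same group and then transposing the maximizer.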
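As a sanity check of the expansion above, here is a minimal pure-Python sketch that verifies $\|AP-B\|_F^2=\|A\|_F^2-2\,\text{trace}(P^TA^TB)+\|B\|_F^2$ for every $3 \times 3$ permutation matrix. The matrices `A`, `B` and all helper names are illustrative assumptions, not taken from the question; integer entries are used so that the comparison is exact.

```python
from itertools import permutations

def matmul(X, Y):
    # Plain triple-loop matrix product for lists of lists.
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(col) for col in zip(*X)]

def frob2(X):
    # Squared Frobenius norm: sum of squared entries.
    return sum(x * x for row in X for x in row)

def trace(X):
    return sum(X[i][i] for i in range(len(X)))

def perm_matrix(sigma):
    # P[i][sigma[i]] = 1, so column i of A becomes column sigma[i] of AP.
    n = len(sigma)
    return [[1 if sigma[i] == j else 0 for j in range(n)] for i in range(n)]

# Hypothetical 4x3 example data.
A = [[1, 2, 0], [0, 1, 3], [4, 0, 1], [2, 2, 2]]
B = [[0, 1, 2], [3, 0, 1], [1, 4, 0], [2, 2, 1]]
C = matmul(transpose(A), B)  # C := A^T B

identity_holds = True
checked = 0
for sigma in permutations(range(3)):
    P = perm_matrix(sigma)
    AP = matmul(A, P)
    diff = [[AP[i][j] - B[i][j] for j in range(3)] for i in range(4)]
    lhs = frob2(diff)
    rhs = frob2(A) - 2 * trace(matmul(transpose(P), C)) + frob2(B)
    identity_holds = identity_holds and (lhs == rhs)
    checked += 1

print(identity_holds, checked)
```

With the convention `P[i][sigma[i]] = 1` used here, $\text{trace}(P^TC)$ is exactly the sum $c_{1\sigma(1)}+\cdots+c_{n\sigma(n)}$.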
Edit :
The operation trace($PC$) is best understood on a particular $3 \times 3$ case :
$$\text{trace} \begin{pmatrix}1&0&0\\0&0&1\\0&1&0\end{pmatrix}\begin{pmatrix}c_{11}&c_{12}&c_{13}\\c_{21}&c_{22}&c_{23}\\c_{31}&c_{32}&c_{33}\end{pmatrix}=c_{11}+c_{23}+c_{32}.$$
A straightforward generalization to the $n \times n$ case shows that the objective is to find, among all sums :
$$s_{\sigma}=c_{1\sigma(1)}+c_{2\sigma(2)}+\cdots+c_{n\sigma(n)} \ \text{,} \ \sigma \in \mathfrak{S}_n$$
the one which is maximal.
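For small $n$, the search over all sums $s_\sigma$ can be written down directly. The following is only a brute-force sketch, with a hypothetical function name and a made-up matrix `C`; it enumerates all $n!$ permutations and is meant as a reference to compare faster methods against.

```python
from itertools import permutations

def best_permutation_brute_force(C):
    """Return (sigma, s_sigma) maximizing sum_i C[i][sigma[i]] over all permutations."""
    n = len(C)
    best_sigma, best_sum = None, None
    for sigma in permutations(range(n)):
        s = sum(C[i][sigma[i]] for i in range(n))
        if best_sum is None or s > best_sum:
            best_sigma, best_sum = sigma, s
    return best_sigma, best_sum

# Hypothetical 3x3 example (0-based indices).
C = [[4, 1, 3],
     [2, 0, 5],
     [3, 2, 2]]
sigma, s = best_permutation_brute_force(C)
print(sigma, s)  # (0, 2, 1) 11, i.e. c_11 + c_23 + c_32 in 1-based notation
```

On this example the maximizer happens to be the same permutation as in the $3 \times 3$ illustration above.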
Fortunately, this can be done efficiently by the Hungarian algorithm applied to the matrix $C$ or, more exactly, since its usual form solves a minimization problem, by applying it to $-C$ (or to the matrix with entries $M-c_{ij}$, where $M=\max_{ij} c_{ij}$, if nonnegative costs are required).
Why do we say efficient ? Because this algorithm runs in polynomial time, $O(n^4)$ in its original form and $O(n^3)$ in later refinements, whereas the brute force approach has to examine all $n!$ permutations.
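To make the maximization variant concrete, here is a hedged sketch in pure Python. It uses a common potentials-based $O(n^3)$ formulation of the Hungarian algorithm, which is not necessarily the variant the references have in mind, and maximizes by negating $C$. The names `hungarian_min` and `hungarian_max` are hypothetical. If SciPy is available, `scipy.optimize.linear_sum_assignment(C, maximize=True)` should serve the same purpose.

```python
def hungarian_min(cost):
    """Minimum-cost assignment for a square matrix.
    Returns (sigma, total) with row i assigned to column sigma[i]."""
    n = len(cost)
    INF = float("inf")
    u = [0] * (n + 1)    # row potentials (1-based)
    v = [0] * (n + 1)    # column potentials (1-based)
    p = [0] * (n + 1)    # p[j] = row currently matched to column j (0 = none)
    way = [0] * (n + 1)  # back-pointers along the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [INF] * (n + 1)
        used = [False] * (n + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], INF, 0
            for j in range(1, n + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]  # reduced cost
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(n + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:  # reached a free column: augmenting path found
                break
        while True:  # flip the matching along the path
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
            if j0 == 0:
                break
    sigma = [0] * n
    for j in range(1, n + 1):
        sigma[p[j] - 1] = j - 1
    return sigma, sum(cost[i][sigma[i]] for i in range(n))

def hungarian_max(C):
    """Maximize sum_i C[i][sigma[i]] by minimizing over -C."""
    sigma, total = hungarian_min([[-c for c in row] for row in C])
    return sigma, -total

# Hypothetical example matrix.
C = [[4, 1, 3],
     [2, 0, 5],
     [3, 2, 2]]
print(hungarian_max(C))  # ([0, 2, 1], 11)
```

Given the optimal `sigma`, the permutation matrix with $P_{i\sigma(i)}=1$ attains the maximum of $\text{trace}(P^TC)$.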
Remarks :
In fact, the operation $A \to AP$, where $P$ is a permutation matrix, permutes the columns of $A$, whereas $A \to PA$ permutes its rows. The two settings are equivalent up to transposition, since $\|PA-B\|_F=\|A^TP^T-B^T\|_F$.
Connected : https://math.stackexchange.com/q/175893.