9

I have two strings, where one is a permutation of the other. I was wondering if there is an alternative to Hamming distance where instead of finding the minimum number of substitutions required, it would find the minimum number of translocations required to go from string a to string b.

My strings are always of the same size and I know there are no errors/substitutions.

Example:

1 2 3 4 5
3 2 5 4 1

This would give me two:

3 2 5 4 1 (start)
-> 3 2 1 4 5 
-> -> 1 2 3 4 5

If this is already implemented in R that would be even better.

Gilles 'SO- stop being evil'
  • 44,159
  • 8
  • 120
  • 184
user1357015
  • 205
  • 2
  • 5

2 Answers2

4

Finding the minimal distance is called the "Sorting By Translocation" problem. Part of an abstract from a paper:

"Given two signed multi-chromosomal genomes Pi and Gamma with the same gene set, the problem of sorting by translocations (SBT) is to find a shortest sequence of translocations transforming Pi to Gamma, where the length of the sequence is called the translocation distance between Pi and Gamma. In 1996, Hannenhalli gave the formula of the translocation distance for the first time, based on which an $O(n^3)$ algorithm for SBT was given. In 2005, Anne Bergeron et al. revisited this problem and gave an elementary proof for the formula of the translocation distance which leads to a new $O(n^3)$ algorithm for SBT."

What's called "translocation" here is called a transposition, i.e., a permutation of exactly two elements in a list, in traditional combinatorial language.

Bitwise
  • 316
  • 2
  • 4
0

We need to find the minimum number of transpositions that take one string $a$ to another string $b$, where $a, b$ are permutations. It looks like you are looking for the minimum distance between two given vertices $a, b \in S_n$ in the complete transposition graph, which is the Cayley graph of $S_n$ generated by the set of all transpositions.

Ashwin Ganesan
  • 1,388
  • 7
  • 10