Taking this post as a starting point:
Number of permutations of AABBBCC, taking 7 letters at a time, when repititions are allowed
My case is slightly different. If my letters are, say, AABBC, there are of course $\frac {5!} {2! 2! 1!} = 30$ non-redundant permutations:
AABBC
AABCB
AACBB
...
CBBAA
However, for my purposes, these two permutations are equivalent:
AABBC
BBAAC
and so are these:
CAABB
CBBAA
etc.
That is, I can swap any two letters that appear the same number of times.
This is because I am interested in all the possible ways of forming ordered arrangements of groups of objects, where the identity of the objects themselves does not matter: it only matters which of them are the same.
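One way to make this equivalence concrete is to map every arrangement to a canonical representative, so that two arrangements are equivalent exactly when their representatives coincide. The sketch below is in Python and is only an illustration: the function name `canonical` and the relabelling rule (within each frequency, hand out letters alphabetically in order of first appearance) are my own assumptions, not an established convention.

```python
from collections import Counter

def canonical(word):
    """Relabel letters so that, among letters of equal frequency,
    labels are handed out alphabetically in order of first appearance."""
    freq = Counter(word)
    # Labels available for each frequency, in alphabetical order.
    pool = {}
    for letter in sorted(freq):
        pool.setdefault(freq[letter], []).append(letter)
    mapping = {}      # original letter -> canonical label
    used = Counter()  # labels already handed out, per frequency
    out = []
    for ch in word:
        if ch not in mapping:
            f = freq[ch]
            mapping[ch] = pool[f][used[f]]
            used[f] += 1
        out.append(mapping[ch])
    return "".join(out)

print(canonical("BBAAC"))  # AABBC
print(canonical("CBBAA"))  # CAABB
print(canonical("BAA"))    # BAA (A and B have different frequencies)
```

With this rule, AABBC and BBAAC share the representative AABBC, while AAB, ABA and BAA all stay distinct.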
So, to be clear, if I had AAB, all $3 = \frac {3!} {2! 1!}$ non-redundant permutations would be considered distinct:
AAB
ABA
BAA
Instead, if I had AABB, the $6 = \frac {4!} {2! 2!}$ non-redundant permutations:
AABB
ABAB
ABBA
BAAB
BABA
BBAA
are, for my purposes, pairwise 'equivalent'. For example, AABB and BBAA both indicate putting 2 identical objects first and second, and the other 2 identical objects third and fourth; ABAB and BABA both indicate putting 2 identical objects first and third, and the other 2 identical objects second and fourth. And so on.
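The description in terms of positions can be checked mechanically: if each arrangement is reduced to the collection of position sets occupied by identical objects (forgetting the letters), equivalent arrangements collapse to the same collection. That collection is a partition of the positions $1, \dots, n$ into blocks whose sizes are the given frequencies. A minimal Python sketch, with `blocks` being a hypothetical helper name:

```python
from itertools import permutations

def blocks(word):
    """Return the set of position sets, forgetting which letter sits where."""
    positions = {}
    for i, ch in enumerate(word, start=1):
        positions.setdefault(ch, set()).add(i)
    return frozenset(frozenset(p) for p in positions.values())

# All non-redundant permutations of AABB, in lexicographic order.
words = sorted({"".join(p) for p in permutations("AABB")})

# Group them by their blocks of positions.
classes = {}
for w in words:
    classes.setdefault(blocks(w), []).append(w)

for members in classes.values():
    print(members)
# ['AABB', 'BBAA']
# ['ABAB', 'BABA']
# ['ABBA', 'BAAB']
```

Note that comparing `blocks` only makes sense between arrangements of the same letters with the same frequencies; the size of a block then tells which frequency it belongs to.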
I hope I am making sense.
What I do not know, and would please like to have some advice about, is:
- is there a 'formal' mathematical definition for this kind of arrangement of letters or objects, with the specific type of equivalence I described?
- how do I calculate how many I should expect in total, given the distinct letters and the number of times they must appear?
- how do I generate all of them, without generating first all non-redundant permutations and eliminating a posteriori the equivalent ones?
For the second question, my current guess is that I need to start from the frequency of each letter, and then summarise how many letters share each frequency.
E.g. if A, B, C appear $2, 2, 1$ times respectively, as in my initial example, I have:
$\text{freq} = 2$: 2 letters
$\text{freq} = 1$: 1 letter
Then I divide by the counts of letters (not by the frequencies).
So for AABBC, $\frac {5!} {2! 2! 1!} \cdot \frac 1 {2 \cdot 1} = \frac {30} 2 = 15$.
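The guess above can be tested against brute force on small cases. The Python sketch below makes one assumption that goes beyond the example: it divides by the factorial of each count, $m_j!$, where $m_j$ is the number of letters sharing frequency $j$, because that many relabellings leave an arrangement equivalent. For AABBC this gives the same 15, since $2! \cdot 1! = 2 \cdot 1$. The names `count_classes` and `brute_force` are hypothetical.

```python
from collections import Counter
from itertools import permutations
from math import factorial

def count_classes(freqs):
    """n! / (prod of f_i!) / (prod of m_j!), with m_j = number of letters of frequency j."""
    total = factorial(sum(freqs))
    for f in freqs:
        total //= factorial(f)
    for m in Counter(freqs).values():
        total //= factorial(m)
    return total

def brute_force(freqs):
    """Enumerate all permutations and count distinct collections of position sets."""
    letters = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    word = "".join(letters[i] * f for i, f in enumerate(freqs))
    seen = set()
    for p in set(permutations(word)):
        positions = {}
        for i, ch in enumerate(p):
            positions.setdefault(ch, []).append(i)
        seen.add(frozenset(frozenset(v) for v in positions.values()))
    return len(seen)

for freqs in [(2, 2, 1), (2, 1), (2, 2), (1, 1, 1)]:
    print(freqs, count_classes(freqs), brute_force(freqs))
# (2, 2, 1) 15 15
# (2, 1) 3 3
# (2, 2) 3 3
# (1, 1, 1) 1 1
```

The case $(1, 1, 1)$ is where the plain count and the factorial differ: three letters share one frequency, so the divisor is $3! = 6$ rather than $3$, and all six permutations of ABC fall into a single class.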
This seems about right. If I use package arrangements in R:
library(arrangements)
permutations(c("A","B","C"), freq = c(2,2,1))
[,1] [,2] [,3] [,4] [,5]
[1,] "A" "A" "B" "B" "C"
[2,] "A" "A" "B" "C" "B"
[3,] "A" "A" "C" "B" "B"
[4,] "A" "B" "A" "B" "C"
[5,] "A" "B" "A" "C" "B"
[6,] "A" "B" "B" "A" "C"
[7,] "A" "B" "B" "C" "A"
[8,] "A" "B" "C" "A" "B"
[9,] "A" "B" "C" "B" "A"
[10,] "A" "C" "A" "B" "B"
[11,] "A" "C" "B" "A" "B"
[12,] "A" "C" "B" "B" "A"
[13,] "B" "A" "A" "B" "C"
[14,] "B" "A" "A" "C" "B"
[15,] "B" "A" "B" "A" "C"
[16,] "B" "A" "B" "C" "A"
[17,] "B" "A" "C" "A" "B"
[18,] "B" "A" "C" "B" "A"
[19,] "B" "B" "A" "A" "C"
[20,] "B" "B" "A" "C" "A"
[21,] "B" "B" "C" "A" "A"
[22,] "B" "C" "A" "A" "B"
[23,] "B" "C" "A" "B" "A"
[24,] "B" "C" "B" "A" "A"
[25,] "C" "A" "A" "B" "B"
[26,] "C" "A" "B" "A" "B"
[27,] "C" "A" "B" "B" "A"
[28,] "C" "B" "A" "A" "B"
[29,] "C" "B" "A" "B" "A"
[30,] "C" "B" "B" "A" "A"
Row 1 is equivalent to row 19, row 2 to row 20, etc.
As this example already shows, I cannot simply keep the first 15 rows of the lexicographic enumeration, because e.g. row 6 is equivalent to row 13.
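One possible way to avoid generating all 30 rows and filtering afterwards is a backtracking generator that builds each arrangement position by position and only lets a not-yet-used letter enter once the previous letter of the same frequency has already been used. This is a sketch under my own assumptions, in Python rather than R: the function name `distinct_arrangements` is hypothetical, and all frequencies are assumed to be positive.

```python
def distinct_arrangements(letters, freqs):
    """Yield one representative per equivalence class, in lexicographic order."""
    n = sum(freqs)
    remaining = list(freqs)
    # For each letter, index of the closest earlier letter with the same frequency.
    prev_same = []
    for i, f in enumerate(freqs):
        earlier = [j for j in range(i) if freqs[j] == f]
        prev_same.append(earlier[-1] if earlier else None)
    word = []

    def extend():
        if len(word) == n:
            yield "".join(word)
            return
        for i, letter in enumerate(letters):
            if remaining[i] == 0:
                continue
            fresh = remaining[i] == freqs[i]
            p = prev_same[i]
            # Skip a fresh letter if its same-frequency predecessor is still unused.
            if fresh and p is not None and remaining[p] == freqs[p]:
                continue
            remaining[i] -= 1
            word.append(letter)
            yield from extend()
            word.pop()
            remaining[i] += 1

    yield from extend()

result = list(distinct_arrangements("ABC", [2, 2, 1]))
print(len(result))  # 15
print(list(distinct_arrangements("AB", [2, 2])))  # ['AABB', 'ABAB', 'ABBA']
```

For the AABBC example, the rule means B can never appear before the first A, so every arrangement starting with B is pruned immediately, as is everything starting with CB. No equivalent duplicates are ever built.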
Any ideas?
Am I perhaps missing something fundamental? Could I use the letters differently, transform them...?
Thanks!