How can one count all the ordered lists (of any length) that can be made from the letters of a given word? Denote this count by $f(w)$. Is there a better way than the following, which groups the lists by how many copies of each repeated character they contain?
Given a word $w$ of length $n$, let $\textbf c$ be the vector of its character counts. Let $m$ be the number of $1$'s in $\textbf c$. Form the vector $\textbf t_{tot}$ by dropping $1$'s from $\textbf c$.
$$f(w) = \sum_{\textbf t} \sum_{k=0}^{n-|\textbf t|_1} \binom{k+|\textbf t|_1}{k,\ t_1,\ t_2,\ \dots} \, (m)_k $$
where $\textbf t$ runs over all integer vectors satisfying $ \textbf 0 \leq \textbf t \leq \textbf t_{tot}$ component-wise; I call these "subtakes". Here $|\textbf t|_1$ is the sum of the components of $\textbf t$, the multinomial coefficient has lower entries $k$ and the components of $\textbf t$, and $mPk = (m)_k = m(m-1)\dots(m-k+1)$ is the falling factorial. Since $(m)_k = 0$ for $k > m$, only the terms with $k \leq m$ contribute.
Example: $w = \text{"missisippi"}$.
$\textbf c=(1, 4, 3, 2)$
$\textbf t_{tot} = (4,3,2)$
$\textbf t$ runs over [(0, 0, 0), (0, 0, 1), (0, 0, 2), (0, 1, 0), (0, 1, 1), (0, 1, 2), (0, 2, 0), (0, 2, 1), (0, 2, 2), (0, 3, 0), (0, 3, 1), (0, 3, 2), (1, 0, 0), (1, 0, 1), (1, 0, 2), (1, 1, 0), (1, 1, 1), (1, 1, 2), (1, 2, 0), (1, 2, 1), (1, 2, 2), (1, 3, 0), (1, 3, 1), (1, 3, 2), (2, 0, 0), (2, 0, 1), (2, 0, 2), (2, 1, 0), (2, 1, 1), (2, 1, 2), (2, 2, 0), (2, 2, 1), (2, 2, 2), (2, 3, 0), (2, 3, 1), (2, 3, 2), (3, 0, 0), (3, 0, 1), (3, 0, 2), (3, 1, 0), (3, 1, 1), (3, 1, 2), (3, 2, 0), (3, 2, 1), (3, 2, 2), (3, 3, 0), (3, 3, 1), (3, 3, 2), (4, 0, 0), (4, 0, 1), (4, 0, 2), (4, 1, 0), (4, 1, 1), (4, 1, 2), (4, 2, 0), (4, 2, 1), (4, 2, 2), (4, 3, 0), (4, 3, 1), (4, 3, 2)]
$f(w) = 38848$.
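For a smaller hand check of the formula, take the hypothetical word $w = \text{"aab"}$ (chosen here for illustration, it is not part of the example above). Then $\textbf c = (2, 1)$, $m = 1$, $\textbf t_{tot} = (2)$, and only $k \in \{0, 1\}$ gives nonzero terms:

$$\begin{aligned}
t=0:&\quad \binom{0}{0,0}(1)_0 + \binom{1}{1,0}(1)_1 = 1 + 1 = 2 \\
t=1:&\quad \binom{1}{0,1}(1)_0 + \binom{2}{1,1}(1)_1 = 1 + 2 = 3 \\
t=2:&\quad \binom{2}{0,2}(1)_0 + \binom{3}{1,2}(1)_1 = 1 + 3 = 4
\end{aligned}$$

So $f(w) = 2 + 3 + 4 = 9$, which matches direct enumeration: the empty list, a, b, aa, ab, ba, aab, aba, baa.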
I have coded this in SageMath:

    import itertools
    from collections import Counter

    def subtakes(a):
        # all integer vectors t with 0 <= t <= a component-wise
        yield from itertools.product(*[range(v+1) for v in a])

    def countOrdLists(w):
        lC = list(Counter(w).values())    # char counts
        m = sum(1 for x in lC if x==1)    # number of single chars
        ret = 0
        for t in subtakes([v for v in lC if v>1]):
            ret += sum(multinomial([k]+list(t))*falling_factorial(m,k) for k in range(len(w)-sum(t)+1))
        return ret

    def countOrdListsCheckWithBruteForce(w):
        ret = 0
        for r in range(len(w)+1):
            for p in Permutations(w, r):
                ret += 1
                #print(p)
        return ret

    word = "missisippi"
    print(countOrdLists(word))
    print(countOrdListsCheckWithBruteForce(word))
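For readers without SageMath, the same brute-force check can be sketched in plain Python. This is an illustration rather than the code used above: the name `count_ordered_lists_brute` is made up here, and the function simply collects the distinct tuples produced by `itertools.permutations` for every length. It is only practical for short words, since it materialises every arrangement.

```python
from itertools import permutations

def count_ordered_lists_brute(word):
    """Count distinct ordered lists of any length drawn from the letters of word."""
    distinct = set()
    for r in range(len(word) + 1):
        # permutations() treats equal letters at different positions as different,
        # so duplicates are removed by storing the tuples in a set
        distinct.update(permutations(word, r))
    return len(distinct)

print(count_ordered_lists_brute("aab"))
```

For "aab" this counts the empty list, 2 lists of length one, 3 of length two and 3 of length three.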
Idea: The summand depends only on the multiset of entries of $\textbf t$, not on their order, so we could restrict the first sum to non-decreasing $\textbf t$ (if we also first sort the vector $\textbf t_{tot}$). But how do we count how many unsorted $\textbf t$ correspond to a particular sorted one?
The code below generates all sorted subtakes, but the coefficient that goes with each one would still have to be computed as the subtake is generated.

    def subtakesD(a):
        a = sorted(a)
        def make(b):
            if len(b)==len(a):
                yield tuple(b)
                return
            first = 0 if len(b)==0 else b[-1]    # ensure non-decreasing
            last = a[len(b)]
            for v in range(first, last+1):
                yield from make(b+[v])
        yield from make([])
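One way to see what the missing coefficient looks like is to brute-force it: enumerate every subtake, sort it, and tally how often each sorted tuple occurs. The sketch below does this in plain Python. The name `sorted_subtake_multiplicities` is made up for illustration, and because it still walks all subtakes it is only a reference for checking a smarter method, not a speed-up.

```python
from collections import Counter
from itertools import product

def sorted_subtake_multiplicities(t_tot):
    """Map each sorted subtake to the number of unsorted subtakes that sort to it."""
    all_subtakes = product(*(range(v + 1) for v in t_tot))
    return Counter(tuple(sorted(t)) for t in all_subtakes)

mult = sorted_subtake_multiplicities((4, 3, 2))
print(mult[(0, 0, 1)], mult[(0, 0, 3)], mult[(0, 0, 4)])
```

For $\textbf t_{tot} = (4,3,2)$ the multiplicities are not simply the number of distinct rearrangements: $(0,0,3)$ has three rearrangements but only two of them satisfy $\textbf t \leq \textbf t_{tot}$, which is what makes the coefficient awkward to write in closed form.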
EDIT
The generating function solution coded in SageMath:

    from collections import Counter

    def countOrdListsGF(w):
        c = Counter(w).values()
        R.<z> = QQ[]
        # product of truncated exponential series, one factor per distinct character
        f = prod(sum(1/factorial(k)*z^k for k in range(m+1)) for m in c)
        #return integral(e^(-x)*f(x), x, 0, infinity)
        return sum(a*factorial(j) for j,a in enumerate(f.list()))
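The idea behind this version is that a character occurring $c$ times contributes the factor $\sum_{k=0}^{c} z^k/k!$, and $j!$ times the coefficient of $z^j$ in the product counts the lists of length $j$. A plain-Python sketch of the same computation, using exact `Fraction` arithmetic in place of the polynomial ring `QQ[]`, might look as follows; the name `count_ordered_lists_egf` is an assumption made here and not part of the code above.

```python
from collections import Counter
from fractions import Fraction
from math import factorial

def count_ordered_lists_egf(word):
    """Count ordered lists via a product of truncated exponential series."""
    poly = [Fraction(1)]  # coefficients of the running product, lowest degree first
    for count in Counter(word).values():
        # factor for one character: sum_{k=0}^{count} z^k / k!
        factor = [Fraction(1, factorial(k)) for k in range(count + 1)]
        new = [Fraction(0)] * (len(poly) + count)
        for i, a in enumerate(poly):
            for k, b in enumerate(factor):
                new[i + k] += a * b
        poly = new
    # j! * [z^j] is the number of lists of length j; the total is an integer
    return int(sum(a * factorial(j) for j, a in enumerate(poly)))

print(count_ordered_lists_egf("missisippi"))  # 38848
```

The work here is polynomial in the word length, since each of the distinct characters triggers one polynomial multiplication, whereas the subtake sum iterates over $\prod (t_{tot,i}+1)$ vectors.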