
I've developed the following backtracking algorithm, and I'm trying to work out its time complexity.

A set of $K$ integers defines a multiset of modular distances between all pairs of them. In this algorithm, I consider the inverse problem of reconstructing all integer sets that realize a given distance multiset:


Input: $D=\{p_i-p_j \bmod N : i\neq j\}$ and $K$
Output: $P=\{p_1,p_2,\dots,p_K\}$, with $p_i \in \{0,1,2,\dots,N-1\}$ and $p_i > p_j$ for $i>j$

Simply put, the algorithm lays out $K$ blanks to be filled. Initially, it puts 1 in the first blank. For the second blank, it looks for the first integer that, if added to $P$, doesn't produce any difference beyond those remaining in $D$. It then does the same for the following blanks. If, while filling a blank, it has checked all possible integers and found no suitable one, it backtracks to the previous blank and looks for the next suitable integer there. If all blanks are filled, the algorithm has finished its job; otherwise, there is no possible $P$ for this $D$.
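The scheme above can be sketched roughly as follows in Python. This is only an illustrative reconstruction, not my exact code: I assume here that $D$ is given as a multiset (`Counter`) of all ordered-pair differences $(p_i-p_j) \bmod N$, and I normalize the first point to 0 rather than 1 (a harmless shift).

```python
from collections import Counter

def reconstruct(D, K, N):
    """Backtracking search: return the first point set P realizing the
    modular distance multiset D, or None if no such set exists.
    D is a Counter of all ordered-pair differences (p_i - p_j) mod N."""
    remaining = Counter(D)

    def place(P, next_start):
        if len(P) == K:
            return list(P)
        # Try candidates in increasing order (blanks are filled increasingly).
        for c in range(next_start, N):
            new_diffs = Counter()
            for p in P:
                new_diffs[(c - p) % N] += 1
                new_diffs[(p - c) % N] += 1
            # Accept c only if every new difference is still available in D.
            if all(remaining[d] >= cnt for d, cnt in new_diffs.items()):
                remaining.subtract(new_diffs)
                P.append(c)
                result = place(P, c + 1)
                if result is not None:
                    return result
                # Backtrack: undo the placement and restore the multiset.
                P.pop()
                remaining.update(new_diffs)
        return None

    return place([0], 1)  # first blank fixed (normalized to 0 here)
```

For example, for $P=\{0,1,3\}$ modulo $N=7$ the ordered differences are $\{1,2,3,4,5,6\}$, and the search recovers $[0,1,3]$ from that multiset.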

Here's my analysis so far. Since the algorithm checks at most all members of $\{2,\dots,N\}$ for each blank, there are at most $N-1$ candidates per blank (an upper bound). If each visited blank were filled on the first visit, the complexity would be $O((K-1)(N-1))$, since there are $K-1$ blanks (assuming the first one is filled with 1). But the algorithm is more complex than that: for some blanks it goes backward, and some blanks may be visited more than once. I'm looking for the worst-case complexity, i.e., the case in which all blanks are visited and no solution is found.

Mahdi Khosravi

2 Answers


The running time of your algorithm is at most $N (N-1) (N-2) \cdots (N-K+1)$, i.e., $N!/(N-K)!$. This is $O(N^K)$, i.e., exponential in $K$.

Justification: There are $N$ possible choices for what you put into the first blank, and in the worst case you might have to explore each. There are $N-1$ choices for the second blank, and so on. You can draw a tree of the choices made: the first level shows the choice of what to put in the first blank, the second level shows the choice of what to put in the second blank, and so on. The degree of the root is $N$; the degree of the nodes at the second level is $N-1$; and so on. The number of leaves is the product of the degrees at each level, i.e., $N (N-1) (N-2) \cdots (N-K+1)$. In the worst case, your algorithm might have to explore every possible node in this tree (if it is not able to stop early before reaching the $K$th level and backtrack from a higher-up node). Therefore, this is a valid upper bound for the running time of your algorithm.
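The product of the level degrees can be checked numerically against the closed form (an illustrative sketch; the function name is mine):

```python
import math

def search_tree_leaves(N, K):
    """Upper bound on the number of leaves of the search tree:
    N * (N-1) * ... * (N-K+1), the falling factorial."""
    leaves = 1
    for level in range(K):
        leaves *= N - level
    return leaves

# This matches the closed form N! / (N-K)!.
```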

If you want a tighter analysis, here is the exact worst-case running time (not just an upper bound). The number of leaves in your search tree, in the worst case, is the number of strictly increasing sequences of length $K$ over $\{0,\dots,N-1\}$ that start with 0. (As you point out, the first blank can be fixed without loss of generality; taking it to be 0 restricts us to sequences that start with 0.) That number is exactly $C(N-1,K-1) = (N-1)!/((K-1)!(N-K)!)$.

This is a tighter analysis, but it doesn't save us from exponential running time. When $N\gg K$, $C(N-1,K-1)$ is still $O(N^K)$, i.e., exponential in $K$.
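A quick numerical comparison of the two counts (again just a sketch; the function name is mine):

```python
import math

def leaf_counts(N, K):
    """Exact worst-case leaf count C(N-1, K-1) vs. the cruder
    falling-factorial bound N!/(N-K)!."""
    tight = math.comb(N - 1, K - 1)
    loose = math.factorial(N) // math.factorial(N - K)
    return tight, loose
```

For $N=50$, $K=5$, the tight count $C(49,4)$ is far below $50!/45!$, but it is still bounded only by something on the order of $N^{K-1}/(K-1)!$, which grows exponentially in $K$.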

That said, evaluating your algorithm experimentally (by testing it on some real data sets) would probably be more informative than deriving a worst-case running time. You might want to compare its performance against translating your problem into a SAT instance and using an off-the-shelf SAT solver. Depending on the values of $N$ and $K$, there might be other, better alternatives as well.

See also the following question for a closely related problem, and for algorithms to solve it:

D.W.

My first piece of advice is to eliminate your GOTOs. They make the analysis of algorithms quite difficult, and they are generally considered bad practice in programming.

Here are some overall strategies:

Can you find a bound on the number of times each loop runs? Try to prove these bounds, possibly by induction on your input size.

To determine the complexity of a loop, this formula generally holds:

loopTime = (times loop was run) * (complexity of loop body).

Note that this doesn't hold for your code because of the GOTOs, which is why refactoring is highly recommended.
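Once the control flow is structured, the formula applies directly. A hypothetical illustration (not your code), where the total cost is the product of the iteration counts and the constant body cost:

```python
def count_pairs(n):
    """Counts the operations of a doubly nested loop:
    outer runs n times, inner runs n times, body is O(1),
    so loopTime = n * n * O(1) = O(n^2)."""
    ops = 0
    for i in range(n):        # runs n times
        for j in range(n):    # runs n times per outer iteration
            ops += 1          # O(1) body
    return ops                # total = n * n
```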

Joey Eremondi