
Let there be "N" bits.

We want to rank and unrank a specific subset of bit combinations, defined by the following criterion:

The combination must contain at least one run of "k" or more consecutive 0s (k<N).

How can we rank (and unrank) using only the given bit combination, without iterating over the full range of 2^N combinations? This matters for efficiency when N is large.

I tried the following approach (example with N=5 and k=3) -

Start from the LSB (position 1) and, for each position, count the combinations of that many bits that contain >=3 consecutive 0s:

Position 1 - 0 >=3 consecutive 0s
Position 2 - 0 >=3 consecutive 0s
Position 3 - 1 >=3 consecutive 0s
Position 4 - 3 >=3 consecutive 0s
Position 5 - 8 >=3 consecutive 0s
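
The counts listed above (0, 0, 1, 3, 8) can be reproduced without enumerating all 2^N combinations. The following is only a sketch, not part of the original attempt: it counts the strings that avoid a run of k zeros with a small dynamic program and subtracts that from 2^n. The function name `count_with_zero_run` is made up for illustration.

```python
def count_with_zero_run(n: int, k: int) -> int:
    """Count n-bit strings containing at least one run of k or more 0s (k >= 1)."""
    # avoid[j] = number of strings built so far that contain no run of k zeros
    # and whose current trailing run of zeros has length j (0 <= j < k).
    avoid = [0] * k
    avoid[0] = 1  # the empty string
    for _ in range(n):
        nxt = [0] * k
        nxt[0] = sum(avoid)        # append a 1: the trailing run resets to 0
        for j in range(k - 1):
            nxt[j + 1] = avoid[j]  # append a 0: the run grows but stays below k
        avoid = nxt
    # Everything that does not avoid a run of k zeros contains one.
    return 2 ** n - sum(avoid)


print([count_with_zero_run(n, 3) for n in range(1, 6)])  # [0, 0, 1, 3, 8]
```

The work is O(n*k) big-integer additions, so it stays cheap for large N.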

How can this information be used, along with bit-shifting patterns, to rank and unrank a given bit combination?

A related question, "Order in a subset", covers ranking based on "<=2" consecutive 0s; what I want now is the opposite case, i.e. >=3 consecutive 0s.


Example -

N: 5 bits
k: >=3 consecutive 0s

00000 - k>=3 - rank - 00001 - unrank - 00000
00001 - k>=3 - rank - 00010 - unrank - 00001
00010 - k>=3 - rank - 00011 - unrank - 00010
00011 - k>=3 - rank - 00100 - unrank - 00011
00100
00101
00110
00111
01000 - k>=3 - rank - 00101 - unrank - 01000
01001
01010
01011
01100
01101
01110
01111
10000 - k>=3 - rank - 00110 - unrank - 10000
10001 - k>=3 - rank - 00111 - unrank - 10001
10010
10011
10100
10101
10110
10111
11000 - k>=3 - rank - 01000 - unrank - 11000
11001
11010
11011
11100
11101
11110
11111

Thanks!

Dave

1 Answer

Conceptually, you can apply the standard generic approach. Ranking: to compute the rank of a string $x$, you compute the number of valid strings $s$ such that $s \preceq x$, i.e., $s$ lexicographically precedes $x$ (or is equal to $x$).

Unranking: to construct the string of rank $r$, use binary search over the strings in lexicographic order. Given any string $s$ (valid or not), compute its rank, i.e., the number of valid strings $\preceq s$; this count is nondecreasing in $s$. If it is $<r$, the target lies lexicographically later than $s$; if it is $\ge r$, the target is $s$ itself or lies earlier. Halving the candidate range at each step needs only about $N$ rank computations, and the search ends at the smallest $s$ whose rank is $r$, which is necessarily a valid string. So you obtain an efficient procedure for unranking, once you know how to do ranking.
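
To make the binary search concrete, here is a rough sketch. It treats an $N$-bit string as an integer (for fixed-length bit strings, lexicographic and numeric order coincide) and takes the ranking function as a parameter. The names `unrank` and `make_brute_rank` are made up for this illustration, and the brute-force ranking is only a stand-in so the example runs; a real implementation would plug in an efficient ranking routine instead.

```python
from typing import Callable


def unrank(r: int, n: int, rank_of: Callable[[int], int]) -> int:
    """Return the smallest n-bit value v with rank_of(v) >= r (ranks are 1-based)."""
    lo, hi = 0, 2 ** n - 1
    if r < 1 or rank_of(hi) < r:
        raise ValueError("rank out of range")
    while lo < hi:
        mid = (lo + hi) // 2
        if rank_of(mid) >= r:
            hi = mid       # target is mid or something earlier
        else:
            lo = mid + 1   # rank too small: target is strictly later
    return lo


def make_brute_rank(n: int, k: int) -> Callable[[int], int]:
    """Stand-in ranking: counts valid values <= v by enumeration (small n only)."""
    def rank_of(v: int) -> int:
        return sum(1 for u in range(v + 1) if "0" * k in format(u, f"0{n}b"))
    return rank_of


rank_of = make_brute_rank(5, 3)
print(format(unrank(5, 5, rank_of), "05b"))  # 01000
```

The loop runs at most $N$ times, so the total cost is about $N$ times the cost of one ranking call.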

So all that remains is to apply this to your particular case, and specifically, to come up with a ranking algorithm. This is conceptually easy. Build a deterministic finite-state machine $M$ that recognizes strings that follow your desired pattern (here, strings containing a run of at least $k$ zeros). Build another machine $M'$ that recognizes strings $s$ with $s \preceq x$. Use the product construction to compute a deterministic finite-state machine $M''$ that accepts the intersection of those two languages, i.e., $L(M'')=L(M) \cap L(M')$. Finally, count the number of strings of length $N$ accepted by $M''$; see "Why isn't it simple to count the number of words in a regular language?" and https://cstheory.stackexchange.com/q/8200/5038 for an algorithm to do that.
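
For this particular pattern the product automaton can be unrolled by hand, and the result might look like the sketch below. This is one possible realization under my own assumptions, not the only way to work out the details: the state of $M$ is the length of the current trailing run of zeros, capped at $k$ (state $k$ is accepting and absorbing), and the comparison with $x$ is handled by walking along $x$ and, at every 1 bit, counting the valid strings that agree with $x$ so far but have a 0 there. The function name `rank` and the table `comp` are made up for illustration.

```python
def rank(x: str, k: int) -> int:
    """Number of bit strings s of length len(x) with s <= x that contain
    a run of at least k zeros (so valid strings get 1-based ranks)."""
    n = len(x)
    # comp[m][j]: number of ways to append m more bits, starting from state j,
    # so that the finished string is accepted (contains a run of k zeros).
    comp = [[0] * (k + 1) for _ in range(n + 1)]
    for m in range(n + 1):
        comp[m][k] = 2 ** m  # already accepted: every suffix works
    for m in range(1, n + 1):
        for j in range(k):
            # next bit 0 extends the run, next bit 1 resets it
            comp[m][j] = comp[m - 1][j + 1] + comp[m - 1][0]

    def step(state: int, bit: str) -> int:
        if state == k:
            return k
        return state + 1 if bit == "0" else 0

    total, state = 0, 0
    for i, bit in enumerate(x):
        if bit == "1":
            # strings equal to x before position i, 0 at i, free afterwards
            total += comp[n - i - 1][step(state, "0")]
        state = step(state, bit)
    if state == k:
        total += 1  # x itself is valid
    return total


print(rank("01000", 3))  # 5
```

Building the table costs O(N*k) additions and the walk along $x$ is O(N), so nothing here iterates over the 2^N strings.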

If you work through the details of this, you will obtain an algorithm that solves your problem, and scales well for large $N$.

D.W.