14

I'm currently writing some code to generate binary data. I specifically need to generate 64-bit numbers with a given number of set bits; more precisely, the procedure should take some $0 < n < 64$ and return a pseudo-random 64-bit number with exactly $n$ bits set to $1$, and the rest set to 0.

My current approach involves something like this:

  1. Generate a pseudorandom 64-bit number $k$.
  2. Count the bits in $k$, storing the result in $b$.
  3. If $b = n$, output $k$; otherwise go to 1.
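For reference, the three steps above can be sketched in C as follows. The names `next_u64`, `popcount64` and `random_n_bits_rejection` are made up for this illustration, and the SplitMix64 generator is only a stand-in for whatever 64-bit PRNG is actually in use.

```c
#include <stdint.h>

/* Stand-in PRNG (SplitMix64); an assumption, any uniform 64-bit source works. */
static uint64_t prng_state = 0x0123456789ABCDEFULL;

static uint64_t next_u64(void)
{
    uint64_t z = (prng_state += 0x9E3779B97F4A7C15ULL);
    z = (z ^ (z >> 30)) * 0xBF58476D1CE4E5B9ULL;
    z = (z ^ (z >> 27)) * 0x94D049BB133111EBULL;
    return z ^ (z >> 31);
}

/* Portable population count: clears the lowest set bit until none remain. */
static int popcount64(uint64_t x)
{
    int count = 0;
    while (x) {
        x &= x - 1;
        ++count;
    }
    return count;
}

/* Steps 1-3: draw words until one happens to have exactly n set bits. */
uint64_t random_n_bits_rejection(int n)
{
    for (;;) {
        uint64_t k = next_u64();
        if (popcount64(k) == n)
            return k;
    }
}
```

The expected number of draws is $2^{64} / \binom{64}{n}$: about 10 for $n = 32$, but impractically large when $n$ is close to 0 or 64.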

This works, but it seems inelegant. Is there some kind of PRNG algorithm which can generate numbers with $n$ set bits more elegantly than this?

Pseudonym
Koz Ross

6 Answers

12

What you need is a random number between 0 and ${ 64 \choose n } - 1$. The problem then is to turn this into the bit pattern.
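Drawing that random number without bias takes a little care, because reducing a 64-bit value modulo the bound favours small results. A minimal sketch in C, assuming a uniform 64-bit source (`next_u64` here is a stand-in SplitMix64, and `uniform_below` is a name invented for this example):

```c
#include <stdint.h>

/* Stand-in PRNG (SplitMix64); an assumption, any uniform 64-bit source works. */
static uint64_t prng_state = 0x0123456789ABCDEFULL;

static uint64_t next_u64(void)
{
    uint64_t z = (prng_state += 0x9E3779B97F4A7C15ULL);
    z = (z ^ (z >> 30)) * 0xBF58476D1CE4E5B9ULL;
    z = (z ^ (z >> 27)) * 0x94D049BB133111EBULL;
    return z ^ (z >> 31);
}

/* Uniform integer in [0, bound); bound must be nonzero.
 * Draws below 2^64 mod bound are rejected so that the accepted range
 * has a size that is an exact multiple of bound. */
uint64_t uniform_below(uint64_t bound)
{
    uint64_t threshold = (0 - bound) % bound;   /* equals 2^64 mod bound */
    for (;;) {
        uint64_t r = next_u64();
        if (r >= threshold)
            return r % bound;
    }
}
```

The ordinal would then be `uniform_below(c)` with $c = \binom{64}{n}$, which fits in 64 bits for every $n$.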

This is known as enumerative coding, and it's one of the oldest deployed compression algorithms. Probably the simplest algorithm is from Thomas Cover. It's based on the simple observation that if you have a word of some fixed length whose $k$ set bits are at positions $x_k > \ldots > x_1$ (numbered from $0$ at the least-significant bit), then the position of this word in the lexicographic ordering of all words with $k$ set bits is:

$$\sum_{1 \le i \le k} { x_i \choose i}$$

So, for example, for a 7-bit word:

$$i(0000111) = { 2 \choose 3 } + {1 \choose 2 } + {0 \choose 1} = 0$$

$$i(0001011) = { 3 \choose 3 } + {1 \choose 2 } + {0 \choose 1} = 1$$

$$i(0001101) = { 3 \choose 3 } + {2 \choose 2 } + {0 \choose 1} = 2$$

...and so on.
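The ranking formula can be checked with a few lines of C. This is only a sketch: `rank_of` and the `binom` table are names chosen for the example, and bit positions are counted from 0 at the least-significant bit.

```c
#include <stdint.h>

/* binom[a][b] = C(a, b); entries with b > a stay 0 (static storage). */
static uint64_t binom[65][65];

static void fill_binom(void)
{
    for (int a = 0; a <= 64; ++a) {
        binom[a][0] = 1;
        for (int b = 1; b <= a; ++b)
            binom[a][b] = binom[a - 1][b - 1] + binom[a - 1][b];
    }
}

/* Sum of C(x_i, i), where x_i is the position of the i-th set bit
 * counted from the least-significant end (i starts at 1). */
uint64_t rank_of(uint64_t word)
{
    if (binom[0][0] == 0)
        fill_binom();

    uint64_t rank = 0;
    int i = 1;
    for (int pos = 0; pos < 64; ++pos) {
        if ((word >> pos) & 1) {
            rank += binom[pos][i];
            ++i;
        }
    }
    return rank;
}
```

For the three 7-bit examples, `rank_of(0x07)`, `rank_of(0x0B)` and `rank_of(0x0D)` give 0, 1 and 2.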

To get the bit pattern from the ordinal, you just decode each bit in turn. Something like this, in a C-like language:

uint64_t decode(uint64_t ones, uint64_t ordinal)
{
    uint64_t bits = 0;
    for (uint64_t bit = 63; ones > 0; --bit)
    {
        uint64_t nCk = choose(bit, ones);
        if (ordinal >= nCk)
        {
            ordinal -= nCk;
            bits |= (uint64_t)1 << bit;  // shift a 64-bit 1; a plain int would overflow
            --ones;
        }
    }
    return bits;
}

Note that since you only need binomial coefficients up to 64, you can precompute them.
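One possible way to do the precomputation is to fill a table with Pascal's rule on first use. The sketch below assumes the `choose(n, k)` signature used by `decode` above; the table layout and the `binom_ready` flag are illustrative choices.

```c
#include <stdint.h>

/* binom_table[n][k] = C(n, k) for 0 <= k <= n <= 64; other entries stay 0. */
static uint64_t binom_table[65][65];
static int binom_ready = 0;

/* Returns C(n, k), or 0 when k > n or n > 64.
 * Every C(n, k) with n <= 64 fits in a uint64_t, so no addition overflows. */
uint64_t choose(uint64_t n, uint64_t k)
{
    if (!binom_ready) {
        for (int a = 0; a <= 64; ++a) {
            binom_table[a][0] = 1;
            for (int b = 1; b <= a; ++b)
                binom_table[a][b] = binom_table[a - 1][b - 1]
                                  + binom_table[a - 1][b];
        }
        binom_ready = 1;
    }
    if (n > 64 || k > n)
        return 0;
    return binom_table[n][k];
}
```

Building the table by addition avoids the intermediate overflow that a multiply-then-divide formula can hit near $\binom{64}{32}$.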


  • Cover, T., Enumerative Source Encoding. IEEE Transactions on Information Theory, Vol IT-19, No 1, Jan 1973.
Pseudonym
3

Very similar to Pseudonym's answer, obtained by other means.

The total number of valid outputs can be counted with the stars and bars method, which gives $c=\binom{64}{n}$. The set of all $2^{64}$ 64-bit numbers that you are currently sampling from is much larger than that.

What you need then is a function that can lead you from a pseudorandom number $k$, ranging from $1$ to $c$, to the corresponding 64-bit combination.

Pascal's triangle can help you with that, because every node's value represents exactly the number of paths from that node to the root of the triangle, and every path can be made to represent one of the strings you are looking for, if all left turns are labeled with a $1$, and every right turn with a $0$.

So let $x$ be the number of bits left to determine (initially $64$), let $y$ be the number of ones left to use (initially $n$), and let $s$ be the output string (initially empty).

We know that $\binom{x}{y}=\binom{x-1}{y}+\binom{x-1}{y-1}$, and we can use it to properly determine the next bit of the number at each step:

$\mathtt{while}\;\;\; x>0$

$\quad \mathtt{if}\;\;\; x>y$

$\qquad \mathtt{if}\;\;\;k>\binom{x-1}{y}: \;\;\;s \leftarrow s\; + \mathtt{"1"}, \;k\leftarrow k-\binom{x-1}{y}, \;y \leftarrow y-1$

$\qquad \mathtt{else}:\; \;s \leftarrow s\; + \mathtt{"0"}$

$\quad \mathtt{else}: \;\;s \leftarrow s\; + \mathtt{"1"}, \;y \leftarrow y-1$

$\quad x \leftarrow x-1$
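The loop above translates fairly directly into C. In this sketch the string $s$ is built as an integer, most-significant bit first; `unrank`, `pascal` and the explicit `width` parameter are assumptions made so that small cases can be tested.

```c
#include <stdint.h>

/* pascal[a][b] = C(a, b); entries with b > a stay 0 (static storage). */
static uint64_t pascal[65][65];

static void fill_pascal(void)
{
    for (int a = 0; a <= 64; ++a) {
        pascal[a][0] = 1;
        for (int b = 1; b <= a; ++b)
            pascal[a][b] = pascal[a - 1][b - 1] + pascal[a - 1][b];
    }
}

/* Returns the k-th word (1 <= k <= C(width, ones)) of the given width
 * with exactly `ones` set bits. Requires 0 <= ones <= width <= 64. */
uint64_t unrank(int width, int ones, uint64_t k)
{
    if (pascal[0][0] == 0)
        fill_pascal();

    uint64_t s = 0;
    int x = width;   /* bits left to determine */
    int y = ones;    /* ones left to place */
    while (x > 0) {
        s <<= 1;                          /* make room for the next bit */
        if (x > y) {
            uint64_t c = pascal[x - 1][y];
            if (k > c) {                  /* skip the c words starting with 0 */
                s |= 1;
                k -= c;
                --y;
            }
        } else {                          /* only ones fit in what is left */
            s |= 1;
            --y;
        }
        --x;
    }
    return s;
}
```

With `width = 5` and `ones = 2`, the ranks 1, 2 and 10 map to `00011`, `00101` and `11000`.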

André Souza Lemos
2

Another quite elegant method is to use bisection as described in this stackoverflow answer. The idea is to keep two words, one known to have at most k bits set and one known to have at least k bits set, and use randomness to move one of these towards having exactly k bits. Here is some source code to illustrate it:

word randomKBits(int k) {
    word min = 0;
    word max = word(~word(0)); // all 1s
    int n = 0;
    while (n != k) {
        word x = randomWord();
        x = min | (x & max);
        n = popcount(x);
        if (n > k)
            max = x;
        else
            min = x;
    }
    return min;
}

I made a performance comparison of various methods and this one is typically the fastest unless k is known to be very small.

Falk Hüffner
1
  1. Take any fixed 64-bit number with $n$ bits set, such as $2^n-1$.

  2. Apply the Fisher–Yates shuffle: for each $i=63,\dots,0$, swap the $i$th bit with the $j$th bit, where $j$ is a uniformly chosen number $0\le j\le i$.
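A possible C rendering of these two steps, treating the word as an array of 64 bits. The helper names (`next_u64`, `uniform_below`, `random_n_bits_shuffle`) are invented here, and the SplitMix64 generator is just a placeholder PRNG.

```c
#include <stdint.h>

/* Stand-in PRNG (SplitMix64); an assumption, any uniform 64-bit source works. */
static uint64_t prng_state = 0x0123456789ABCDEFULL;

static uint64_t next_u64(void)
{
    uint64_t z = (prng_state += 0x9E3779B97F4A7C15ULL);
    z = (z ^ (z >> 30)) * 0xBF58476D1CE4E5B9ULL;
    z = (z ^ (z >> 27)) * 0x94D049BB133111EBULL;
    return z ^ (z >> 31);
}

/* Uniform integer in [0, bound), bound nonzero, without modulo bias. */
static uint64_t uniform_below(uint64_t bound)
{
    uint64_t threshold = (0 - bound) % bound;   /* equals 2^64 mod bound */
    for (;;) {
        uint64_t r = next_u64();
        if (r >= threshold)
            return r % bound;
    }
}

/* Start from 2^n - 1 and shuffle the 64 bit positions (0 <= n <= 64). */
uint64_t random_n_bits_shuffle(int n)
{
    uint64_t w = (n >= 64) ? ~(uint64_t)0 : (((uint64_t)1 << n) - 1);
    for (int i = 63; i > 0; --i) {              /* i = 0 would be a no-op */
        int j = (int)uniform_below((uint64_t)i + 1);
        /* Swap bits i and j: if they differ, flip both. */
        uint64_t diff = ((w >> i) ^ (w >> j)) & 1;
        w ^= (diff << i) | (diff << j);
    }
    return w;
}
```

This always uses 63 bounded draws regardless of $n$.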

Emil Jeřábek
1

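You can do the following:

1) Generate a random number $k$ between $1$ and the number of bits that are still $0$ (that is, $65-i$ on the $i$th pass).

2) Set the $k$th $0$ bit to $1$.

3) Repeat steps 1 and 2 $n$ times.

$A[]$ is a $64$-bit array, indexed from $1$, with all bits initially $0$.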

for(i=1 to n)
{
    k=ran(1,65-i) // uniform random integer between 1 and 65-i, the number of 0s left
    for(x=1;x<65;x++)
    {
        if(A[x]==0)k--;
        if(k==0)break;
    }
    A[x]=1;
}
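
The same procedure can be written against a single 64-bit word instead of an array. This is a sketch only: `uniform_below` plays the role of `ran`, bit positions run from 0 to 63, and the SplitMix64 generator is a placeholder.

```c
#include <stdint.h>

/* Stand-in PRNG (SplitMix64); an assumption, any uniform 64-bit source works. */
static uint64_t prng_state = 0x0123456789ABCDEFULL;

static uint64_t next_u64(void)
{
    uint64_t z = (prng_state += 0x9E3779B97F4A7C15ULL);
    z = (z ^ (z >> 30)) * 0xBF58476D1CE4E5B9ULL;
    z = (z ^ (z >> 27)) * 0x94D049BB133111EBULL;
    return z ^ (z >> 31);
}

/* Uniform integer in [0, bound), bound nonzero, without modulo bias. */
static uint64_t uniform_below(uint64_t bound)
{
    uint64_t threshold = (0 - bound) % bound;   /* equals 2^64 mod bound */
    for (;;) {
        uint64_t r = next_u64();
        if (r >= threshold)
            return r % bound;
    }
}

/* n times: pick k uniformly among the remaining zeros, set the k-th one. */
uint64_t random_n_bits_kth_zero(int n)   /* 0 <= n <= 64 */
{
    uint64_t w = 0;
    for (int i = 0; i < n; ++i) {
        uint64_t k = uniform_below((uint64_t)(64 - i)) + 1;  /* 1..zeros left */
        int x = 0;
        for (;; ++x) {
            /* Count down on every zero bit; stop at the k-th one. */
            if (!((w >> x) & 1) && --k == 0)
                break;
        }
        w |= (uint64_t)1 << x;
    }
    return w;
}
```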
User Not Found
0

How about repeatedly picking a random bit position and setting that bit if it is not already set, until $n$ bits are set?
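
A minimal C sketch of that idea, assuming a uniform 64-bit generator (`next_u64` is a stand-in SplitMix64, and `random_n_bits_incremental` is a name chosen for this example):

```c
#include <stdint.h>

/* Stand-in PRNG (SplitMix64); an assumption, any uniform 64-bit source works. */
static uint64_t prng_state = 0x0123456789ABCDEFULL;

static uint64_t next_u64(void)
{
    uint64_t z = (prng_state += 0x9E3779B97F4A7C15ULL);
    z = (z ^ (z >> 30)) * 0xBF58476D1CE4E5B9ULL;
    z = (z ^ (z >> 27)) * 0x94D049BB133111EBULL;
    return z ^ (z >> 31);
}

/* Keep picking random positions until n distinct bits are set (0 <= n <= 64). */
uint64_t random_n_bits_incremental(int n)
{
    uint64_t w = 0;
    int set = 0;
    while (set < n) {
        /* The top 6 bits of a uniform word give a uniform position in 0..63. */
        uint64_t mask = (uint64_t)1 << (next_u64() >> 58);
        if (!(w & mask)) {        /* ignore positions that are already set */
            w |= mask;
            ++set;
        }
    }
    return w;
}
```

Collisions become frequent as the word fills up, so for $n > 32$ it is likely cheaper to generate a word with $64 - n$ set bits and complement it.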