Generating a number using a coin

Question

What is the most practical procedure to generate a random (decimal) $n$-digit integer using a coin?

By "the most practical procedure" I mean a procedure that is quick and as simple and easy to carry out in real life as possible.

There is another question I asked that is more general and abstract. This time, I feel that the answer is going to be completely different, and again – after hours of thinking, I haven't been able to come up with a solution. Googling doesn't yield a satisfactory answer, either.

score 1 · Answer 1 · 2016-11-25T09:07:25.333

Flip the coin $4n$ times, form a binary integer from the outcomes, multiply it by $10^n/2^{4n}$ and round down.

There will be a little bias in the distribution of the last digit, I guess. You can reduce it by using more than $4n$ flips.
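To see the bias concretely, here is a small sketch that enumerates every equally likely flip sequence instead of flipping a coin. It assumes the scaled value is rounded down to an integer; `scaled_counts` is a hypothetical helper name, not something from the answer.

```python
from collections import Counter

def scaled_counts(n, bits_per_digit=4):
    """Count how often each result occurs over all 2^(4n) flip sequences.

    Assumption: the scaled value v * 10^n / 2^(4n) is rounded down.
    """
    total_bits = bits_per_digit * n
    # The right shift is an exact floor division by 2^total_bits.
    return Counter((v * 10**n) >> total_bits for v in range(2**total_bits))

counts = scaled_counts(1)
print(sorted(counts.items()))
```

For $n=1$ some digits are hit by two of the 16 outcomes and others by only one. Raising `bits_per_digit` pushes that ratio toward 1 but never removes the bias entirely.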

The now deleted suggestion by @5xum is better. Generate an $m$-bit binary number and retry if it exceeds $10^n-1$. For efficiency, take the smallest $m$ that fits,

$$m=\left\lceil n\frac{\log10}{\log2}\right\rceil.$$

On average you will need $m$ drawings times the average number of attempts, equal to $1/(1-p)$ where $p$ is the probability of exceeding $10^n-1$ (this is a geometric law).

As $$p=1-\frac{10^n}{2^m},$$ generating a number takes

$$\frac{m2^m}{10^n}$$ drawings.

For instance, with $n=8$, $m=27$ we have $p=0.255$ and close to $36.24$ drawings ($4.53$ per digit).
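The retry scheme and the expected-cost formula can be sketched as follows. This is only an illustration: the function names are hypothetical, the coin is simulated with `random.getrandbits(1)`, and leading zeros are allowed in the result (values range over $0,\dots,10^n-1$).

```python
import random

def bits_needed(n):
    """Smallest m with 2^m >= 10^n, computed in exact integer arithmetic."""
    return (10**n - 1).bit_length()

def random_n_digit(n, flip=lambda: random.getrandbits(1)):
    """Return (value, flips_used) with value uniform on 0 .. 10^n - 1."""
    m = bits_needed(n)
    flips = 0
    while True:
        value = 0
        for _ in range(m):
            value = (value << 1) | flip()   # append one coin flip as a bit
            flips += 1
        if value < 10**n:                   # accept; otherwise retry from scratch
            return value, flips

def expected_flips(n):
    """Average number of flips, m * 2^m / 10^n."""
    m = bits_needed(n)
    return m * 2**m / 10**n

print(bits_needed(8), round(expected_flips(8), 2))
print(random_n_digit(8))
```

Passing a different `flip` callable (for example one that asks for real coin results) leaves the rest of the sketch unchanged.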

I guess that you can avoid the retries (and hence achieve the optimal $\log10/\log2\approx3.322$ drawings per digit on average) by considering a very long sequence of $km$ bits and converting it to base $10^n$ (giving $kn$ digits). I wouldn't be surprised if this could be implemented by storing only the last $m$ generated bits.

score 0 · Answer 2 · answered Nov 25 '16 at 08:28

One possibility is to generate a sequence of 0s and 1s (after first finding an upper bound on the number of binary digits needed) and then deal with the outcomes that are too large. How to handle them depends on the distribution you want. The easiest way is to try again, but this can be inefficient in some cases.

score 0 · Answer 3 · answered Nov 25 '16 at 08:32

This isn't very efficient, but a simple way to do this would be to assign each decimal digit to a unique string of heads and tails of length 4 (for example, 0=HHHH, 1=HHHT, 2=HHTH, etc.). Then, to generate each digit, flip the coin 4 times. If the result does not correspond to a digit, flip the coin another 4 times, and repeat until the result is achieved. This takes $1.6 \cdot 4 = 6.4$ flips to generate each digit on average.
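A minimal sketch of this digit-by-digit scheme, assuming heads is read as 0 and tails as 1 (so HHHH is 0, HHHT is 1, HHTH is 2, and so on). The name `random_digit` and the simulated coin are illustrative choices only.

```python
import random

def random_digit(flip=lambda: random.getrandbits(1)):
    """Return (digit, flips_used); four flips per attempt, retry on 10..15."""
    flips = 0
    while True:
        value = 0
        for _ in range(4):
            value = (value << 1) | flip()   # H = 0, T = 1
        flips += 4
        if value < 10:                      # patterns 10..15 map to no digit
            return value, flips

# Build a 5-digit number one digit at a time.
digits = [random_digit()[0] for _ in range(5)]
print("".join(map(str, digits)))
```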

score 0 · Answer 4 · answered Nov 25 '16 at 09:38

Throw the coin ten times and convert the resulting $0$-$1$ string to decimal. If the result is $>999$, repeat the process, which will be necessary in about $2.3\%$ of the cases ($24$ out of $1024$). In this way you obtain three decimal digits per $10$ throws of the coin with very little waste.
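As a quick check of how little is wasted, the usual geometric-retry argument gives the expected cost, since each block of ten throws is accepted with probability $1000/1024$:

$$\frac{24}{1024}\approx 0.0234,\qquad \frac{10}{1000/1024}=10.24\ \text{throws per block},\qquad \frac{10.24}{3}\approx 3.41\ \text{throws per digit}.$$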

Dominik · Answer 5 · 2016-11-25T13:15:30.767

I have the following idea:

Assume $X_1, X_2, \ldots$ is an i.i.d. sequence of random variables with $P(X_1 = 0) = P(X_1 = 1) = \frac{1}{2}$. Then the random variable $X = \sum \limits_{i = 1}^\infty X_i 2^{-i}$ is distributed uniformly on the interval $[0, 1]$. This follows from $P(X \le \frac{k}{2^n}) = \frac{k}{2^n}$ for any positive integers $n$ and $k \le 2^n$.

Now for a random variable $X \sim U([0, 1])$ we know that $Y := (b - a)X + a \sim U([a, b])$ and we know $\lfloor Y \rfloor \sim U(\{a, \ldots, b - 1\})$. This means we have

$$Z = \left\lfloor 10^m \sum \limits_{i = 1}^\infty X_i 2^{-i}\right\rfloor \sim U(\{0, \ldots, 10^m - 1\}).$$

Now it is obviously not very efficient to throw infinitely many coins. But for a "typical" sequence of coin flips we will know the value of $Z$ already after having thrown only finitely many coins.

For example, consider $m = 1$ and assume our first four random variables are zero. We then know $$10 \sum \limits_{i = 1}^\infty X_i 2^{-i} = 10 \sum \limits_{i = 5}^\infty X_i2^{-i} \le 10 \sum \limits_{i = 5}^\infty 2^{-i} = \frac{10}{16} < 1,$$ from which we can deduce that $Z = 0$.

Now the difficult part is to determine after how many throws (on average) we know which value $Z$ assumes. I couldn't quite determine this, but maybe someone else has a good idea. My guess is that this method is (on average) slightly more efficient than the other methods that have been presented as answers.

Edit: Let $K$ be the number of throws this method needs. After $k$ throws the value of $Z$ is still undetermined, i.e. $K > k$, iff there is an integer $1 \le r < 10^m$ that satisfies $$10^m \left(\sum \limits_{i = 1}^k X_i 2^{-i} + 2^{-k}\right) > r > 10^m \left(\sum \limits_{i = 1}^k X_i 2^{-i}\right)$$

Now this is equivalent to $$\begin{align*} &\sum \limits_{i = 1}^k X_i 2^{k - i} + 1 > \frac{2^k r}{10^m} > \sum \limits_{i = 1}^k X_i 2^{k - i} \\ \iff{}& \sum \limits_{i = 0}^{k - 1} X_{k - i} 2^i + 1 > \frac{2^k r}{10^m} > \sum \limits_{i = 0}^{k - 1} X_{k - i} 2^i \end{align*}$$

Now this is the case iff $\frac{2^k r}{10^m}$ is not an integer and also $$\left\lfloor\frac{2^k r}{10^m}\right\rfloor = \sum \limits_{i = 0}^{k - 1} X_{k - i} 2^i \qquad (\star)$$ holds.

Now to get an upper estimate on $P(K > k)$ we can ignore the first condition (keeping it would only make things more difficult for a minuscule improvement of our bound). Let us take a look at the event $(\star)$. The sum on the right-hand side is simply a binary representation of a (uniform) random number from $0$ to $2^k - 1$. This means the probability of $(\star)$ is equal to $$P_k := 2^{-k} \sum \limits_{i = 0}^{2^k - 1} I\left\{i = \left\lfloor \frac{2^kr}{10^m}\right\rfloor \text{ for some } 0 < r < 10^m\right\}.$$

So how many numbers from $0$ to $2^k - 1$ can be written in this form? If $\frac{2^k}{10^m} \le 1$, we can write all numbers. However, for $\frac{2^k}{10^m} > 1$ each $r$ yields a different number. This means we get $P_k = 1$ for $k \le m \log_2(10)$ and $P_k = \frac{10^m - 1}{2^k}$ for $k > m \log_2(10)$.

Writing $c = \lfloor m \log_2(10)\rfloor$ and using $P(K > 0) = 1$ we get the following: $$\begin{align*} E[K] &= \sum \limits_{k = 0}^\infty P(K > k) \le 1 + \sum \limits_{k = 1}^\infty P_k = 1 + \sum \limits_{k = 1}^{c}1 + \sum \limits_{k = c + 1}^\infty \frac{10^m - 1}{2^k} \\ &= 1 + c + (10^m - 1) 2^{-c} < \lfloor m \log_2(10)\rfloor + 1 + 10^m 2^{-(m \log_2(10) - 1)} = \lfloor m \log_2(10) \rfloor + 3 \end{align*}$$

Summarizing this method:

Let $X_n$ be the result of the $n$-th coin flip (either a $0$ or a $1$). After each coin flip, calculate the number $Z_n = 10^m\sum \limits_{i = 1}^n X_i 2^{-i}$. If $\lceil Z_n + 10^m2^{-n}\rceil = \lfloor Z_n\rfloor + 1$ holds, then you are done and your number is given by $\lfloor Z_n\rfloor$. Otherwise, continue flipping until the condition holds.

The average number of coin flips of this procedure is less than $\lfloor m \log_2(10)\rfloor + 3$.

To put this into perspective: If we flip $r$ coins, we can generate a maximum of $2^r$ different values. This means that for a uniform distribution on $\{0, \ldots, 10^m - 1\}$ we need $2^r \ge 10^m$, i.e. $r \ge m \log_2(10)$. Since $\log_2(10)$ is irrational and $r$ needs to be an integer, we will always need at least $\lceil m \log_2(10) \rceil = \lfloor m \log_2(10) \rfloor + 1$ coin flips. This method will on average use fewer than two more coin flips than this (theoretical) limit.
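The stopping rule of this method can be sketched in exact integer arithmetic, which avoids rounding problems with the powers of two. The sketch tracks the integer $s = \sum_{i=1}^n X_i 2^{n-i}$, so that $X$ lies in $[s/2^n, (s+1)/2^n]$, and stops as soon as no integer lies strictly inside the scaled interval. The name `interval_method` and the simulated coin are assumptions made for illustration.

```python
import random

def interval_method(m, flip=lambda: random.getrandbits(1)):
    """Return (z, flips_used) with z in 0 .. 10^m - 1."""
    scale = 10**m
    s = 0   # integer formed by the bits flipped so far
    n = 0   # number of flips so far
    while True:
        s = (s << 1) | flip()
        n += 1
        lo = (scale * s) >> n              # floor(10^m * s / 2^n)
        hi = (scale * (s + 1) - 1) >> n    # ceil(10^m * (s + 1) / 2^n) - 1
        if lo == hi:                       # no integer strictly inside the interval
            return lo, n

print(interval_method(3))
```

With $m = 1$ and four zeros in a row, the sketch stops after the fourth flip with the value $0$, matching the worked example for $m = 1$.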
