Suppose we have a probability space $(\Omega,{\mathscr F},P)$ consisting of:
- An arbitrary nonempty set $\Omega$
- A $\sigma$-algebra ${\mathscr F}$ of subsets of $\Omega$
- A probability measure $P: {\mathscr F} \rightarrow [0,1]$
I'm reading a text which shows how to generate a sequence of numbers $X_1,X_2,\ldots,X_n$ such that the sequence is distributed as $U(0,1)$ and is "random" in some sense. Those are two distinct properties:
1. The sequence has the right distribution. This can be checked with a convergence-in-distribution test, $\lim_{n\rightarrow\infty}\hat F_n(x) = F_X(x)$, where $\hat F_n$ is the empirical CDF of the first $n$ terms of the sequence and $F_X$ is the theoretical CDF of the target distribution.
2. The sequence is "random". There is a large literature on the notion of randomness, and a much smaller, largely ad hoc literature on practical testing of sequences for randomness. Maybe what I am looking for is Martin-Löf randomness.
One version of the Martin-Löf randomness definition is that "A sequence is Martin-Löf random if and only if no constructive martingale succeeds on it."
Q1. What is considered the "best practice" to formally state property 2? See for example this paper.
Q2. The Martin-Löf definition in Wikipedia is independent of the probability space. Is this correct? In general, is the randomness of a sequence independent of the probability distribution it converges to?
One possible answer, adapted from Wikipedia. It's not as constructive as I'd like, and it doesn't reference a particular probability distribution, which also seems undesirable:
- A martingale is a function $d: \{0,1\}^\ast \rightarrow [0,\infty)$ satisfying the fairness condition $d(w) = \frac{1}{2}\left(d(w0) + d(w1)\right)$ for every finite binary string $w$, where $w0$ and $w1$ denote $w$ extended by one more bit. A martingale is said to succeed on a sequence $S$ if $\limsup_{n\rightarrow\infty} d(S_1 \ldots S_n) = \infty$, where $S_1 \ldots S_n$ is the prefix consisting of the first $n$ bits of $S$. A martingale is said to be constructive if there exists a computable function ${\hat d}: \{0,1\}^\ast \times {\mathbb N} \rightarrow {\mathbb Q}$ such that, for all strings $w$ and all $t>0$, ${\hat d}(w,t) \leq {\hat d}(w,t+1) < d(w)$ and $\lim_{t\rightarrow\infty} {\hat d}(w,t) = d(w)$. A sequence is Martin-Löf random if and only if no constructive martingale succeeds on it.
That is, the above is not constructive in the sense that it doesn't supply an explicit test for randomness of a particular realized sequence with respect to a particular probability distribution. The Wikipedia page on tests of randomness lists a number of fairly ad hoc criteria, none of which looks mathematically rigorous.
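To make the betting-strategy reading of that definition concrete, here is a toy sketch (my own illustration, not from any of the references above; the majority-bit strategy is made up for the example). A strategy that repeatedly bets half its capital on the majority bit seen so far satisfies the fairness condition, and its capital diverges on a highly regular sequence such as all zeros, while on typical fair-coin output it stays bounded (in fact it usually decays toward zero):

```python
import random

def betting_capital(bits, fraction=0.5):
    """Capital of a toy martingale that bets `fraction` of its capital
    that the next bit equals the majority bit seen so far.
    Fairness holds because a correct guess multiplies capital by
    (1 + fraction) and a wrong guess by (1 - fraction), so
    d(w0) + d(w1) = 2 * d(w)."""
    capital = 1.0
    zeros = ones = 0
    for b in bits:
        guess = 0 if zeros >= ones else 1          # bet on the majority bit so far
        capital *= (1 + fraction) if b == guess else (1 - fraction)
        zeros += (b == 0)
        ones += (b == 1)
    return capital

regular = [0] * 200                                 # highly non-random sequence
fair = [random.getrandbits(1) for _ in range(200)]  # stand-in for RNG output

print("all-zeros capital:", betting_capital(regular))  # ~1.5**200: the martingale succeeds
print("fair-coin capital:", betting_capital(fair))     # stays bounded, typically near 0
```

Of course this single strategy only catches one kind of regularity; Martin-Löf randomness quantifies over all constructive martingales at once, which is exactly what makes it hard to turn into a finite test.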
Also I just found an equivalent question here.
Another related criterion is whether the sequence is incompressible. This test, however, would reject a single output that happens to be compressible even when the generator is fine. The implication is that tests of randomness would need to work on multiple samples from a random sequence generator to reach a conclusion, not on a single output. In that view, if the outputs are highly incompressible on average, then the random sequence generator is good (one can imagine a distribution of compressibility where a certain shape of the realized distribution corresponds to "random"). This program uses compressibility tests to assess the randomness of a sequence.
In the end, what I'm concerned about in Q2 is tests of randomization. The most satisfying test that occurred to me is to generate many sequences from the RNG, then plot the empirical PDF of the compression ratio of the sequences, i.e. take a good compression algorithm and divide the original sequence size by the compressed size. If that distribution clusters around 1, then you have good randomization.
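A minimal sketch of that check, under the assumption that zlib is an acceptable stand-in for "a good compression algorithm" and with Python's built-in random module playing the role of the RNG under test (both are just placeholders for the example):

```python
import random
import zlib

def compression_ratio(data: bytes) -> float:
    """Original size divided by compressed size; close to 1 for incompressible data."""
    return len(data) / len(zlib.compress(data, 9))

def sample_ratios(generate, n_sequences=200, length=4096):
    """Empirical distribution of compression ratios over many generated sequences."""
    return [compression_ratio(generate(length)) for _ in range(n_sequences)]

# Generator under test: Python's Mersenne Twister (random.randbytes needs Python 3.9+).
good = sample_ratios(random.randbytes)
# A deliberately bad "generator" for contrast: a repeating byte pattern.
bad = sample_ratios(lambda n: bytes(i % 7 for i in range(n)))

print("RNG ratios    : min %.3f  max %.3f" % (min(good), max(good)))  # clustered near 1
print("pattern ratios: min %.3f  max %.3f" % (min(bad), max(bad)))    # far above 1
```

In practice the ratio for incompressible data lands slightly below 1 because of the compressor's fixed overhead, so "clusters just under 1" is the pattern to look for.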
Property 1, distribution fit, has many accepted tests; it is property 2, the randomness of the sequence, that is less talked about.
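For completeness on the distribution-fit side, one widely accepted check is the one-sample Kolmogorov-Smirnov test; a sketch, assuming SciPy is available and again using Python's random module as the source being tested:

```python
import random
from scipy import stats

# One-sample Kolmogorov-Smirnov test of generated values against the U(0,1) CDF.
sample = [random.random() for _ in range(10_000)]
statistic, p_value = stats.kstest(sample, "uniform")  # "uniform" is U(0, 1) by default
print(f"KS statistic = {statistic:.4f}, p-value = {p_value:.4f}")
# A small p-value indicates the empirical CDF deviates significantly from U(0,1);
# note this addresses distribution fit only, not the "randomness" of the ordering.
```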
