Find the longest repeated pattern in a string

Question

I'm looking for an efficient algorithm to find the longest repeated pattern in a string.

For example, consider the following string of numbers:

5431428571428571428571428571427623874534.

As you can see, 142857142857 is the longest pattern which is repeated for a couple of times (at least twice) in this string.

The repeated string should not contain any re any idea rather than brute-force?

score 16 · Answer 1 · edited Feb 12 '19 at 14:16

The problem is surprisingly non-trivial. First, two brute force algorithms. A square ("repeated pattern") is given by its length $\ell$ and position $p$, and takes time $O(\ell)$ to verify. If we go over all $\ell$ and $p$, we obtain an $O(n^3)$ algorithm. We can improve on that by first looping over $\ell$, and then scanning the string with two running pointers at a distance of $\ell$. In this way, one can verify whether a square of length $2\ell$ exists in linear time, giving a total running time of $O(n^2)$.

Kolpakov and Kucherov developed an algorithm for finding all maximal repeats in a word in time $O(n)$ [1], and their algorithm can be used to find all maximal squares in time $O(n)$. A repeat is a subword of the form $w^kx$, where $k \geq 2$ and $x$ is a proper prefix of $w$. The largest square contained in that repeat is $(w^{\lfloor k/2 \rfloor})^2$. Using this formula, given all maximal repeats in a word (of which there are only $O(n)$ many), one can find the largest square.

[1] Kolpakov, R., & Kucherov, G. (1999). Finding maximal repetitions in a word in linear time. In Foundations of Computer Science, 1999. 40th Annual Symposium on (pp. 596-604). IEEE.

Find the longest repeated pattern in a string

1 Answers1

Linked