language-icon Old Web
English
Sign In

De Bruijn sequence

In combinatorial mathematics, a de Bruijn sequence of order n on a size-k alphabet A is a cyclic sequence in which every possible length-n string on A occurs exactly once as a substring (i.e., as a contiguous subsequence). Such a sequence is denoted by B(k, n) and has length kn, which is also the number of distinct substrings of length n on A; de Bruijn sequences are therefore optimally short. There are ( k ! ) k n − 1 k n {displaystyle {dfrac {left(k! ight)^{k^{n-1}}}{k^{n}}}} distinct de Bruijn sequences B(k, n). In combinatorial mathematics, a de Bruijn sequence of order n on a size-k alphabet A is a cyclic sequence in which every possible length-n string on A occurs exactly once as a substring (i.e., as a contiguous subsequence). Such a sequence is denoted by B(k, n) and has length kn, which is also the number of distinct substrings of length n on A; de Bruijn sequences are therefore optimally short. There are ( k ! ) k n − 1 k n {displaystyle {dfrac {left(k! ight)^{k^{n-1}}}{k^{n}}}} distinct de Bruijn sequences B(k, n). The sequences are named after the Dutch mathematician Nicolaas Govert de Bruijn. According to him, the existence of de Bruijn sequences for each order together with the above properties were first proved, for the case of alphabets with two elements, by Camille Flye Sainte-Marie in 1894, whereas the generalization to larger alphabets is originally due to Tatyana van Aardenne-Ehrenfest and himself. In most applications, A = {0,1}. The earliest known example of a de Bruijn sequence comes from Sanskrit prosody where, since the work of Pingala, each possible three-syllable pattern of long and short syllables is given a name, such as 'y' for short–long–long and 'm' for long–long–long. To remember these names, the mnemonic yamātārājabhānasalagām is used, in which each three-syllable pattern occurs starting at its name: 'yamātā' has a short–long–long pattern, 'mātārā' has a long–long–long pattern, and so on, until 'salagām' which has a short–short–long pattern. This mnemonic, equivalent to a de Bruijn sequence on binary 3-tuples, is of unknown antiquity, but is at least as old as Charles Philip Brown's 1869 book on Sanskrit prosody that mentions it and considers it 'an ancient line, written by Pāṇini'. In 1894, A. de Rivière raised the question in an issue of the French problem journal L'Intermédiaire des Mathématiciens, of the existence of a circular arrangement of zeroes and ones of size 2 n {displaystyle 2^{n}} that contains all 2 n {displaystyle 2^{n}} binary sequences of length n {displaystyle n} . The problem was solved (in the affirmative), along with the count of 2 2 n − 1 − n {displaystyle 2^{2^{n-1}-n}} distinct solutions, by Camille Flye Sainte-Marie in the same year. This was largely forgotten, and Martin (1934) proved the existence of such cycles for general alphabet size in place of 2, with an algorithm for constructing them. Finally, when in 1944 Kees Posthumus conjectured the count 2 2 n − 1 − n {displaystyle 2^{2^{n-1}-n}} for binary sequences, de Bruijn proved the conjecture in 1946, through which the problem became well-known. Karl Popper independently describes these objects in his The Logic of Scientific Discovery (1934), calling them 'shortest random-like sequences'. The de Bruijn sequences can be constructed by taking a Hamiltonian path of an n-dimensional de Bruijn graph over k symbols (or equivalently, an Eulerian cycle of an (n − 1)-dimensional de Bruijn graph). An alternative construction involves concatenating together, in lexicographic order, all the Lyndon words whose length divides n. An inverse Burrows—Wheeler transform can be used to generate the required Lyndon words in lexicographic order.

[ "Graph", "Combinatorics", "Discrete mathematics", "Algebra", "De Bruijn graph", "De Bruijn–Newman constant", "POPLmark challenge", "de bruijn erdős theorem", "de bruijn digraph" ]
Parent Topic
Child Topic
    No Parent Topic