Thursday, April 03, 2008

Lexicographic permutations using Algorithm L (STL next_permutation in Python)

One of the more useful functions in the C++ Standard Library is next_permutation in <algorithm>. The STL function has a desirable property that almost every other permutation generating functions I’ve seen lack, namely lexicographic awareness of the elements being permuted.

A typical function will, given a sequence of elements such as (1, 1, 2, 2), permute on indices only. This will in our case give 4! permutations, which is often not what we want. The STL implementation will “correctly” generate only unique permutations, in our case 4! / 2!2!, and also generate them in the right order.

What’s special about most STL implementations is the use of a fairly unknown algorithm for finding permutations in lexicographic order. A canonical templated implementation is usually about 25 lines of code. It is also non-recursive and very fast.

Here’s a Python implementation of next_permutation with user-defined comparison. Use freely.

def next_permutation(seq, pred=cmp):
    """Like C++ std::next_permutation() but implemented as
    generator. Yields copies of seq."""

    def reverse(seq, start, end):
        # seq = seq[:start] + reversed(seq[start:end]) + \
        #       seq[end:]
        end -= 1
        if end <= start:
            return
        while True:
            seq[start], seq[end] = seq[end], seq[start]
            if start == end or start+1 == end:
                return
            start += 1
            end -= 1
    
    if not seq:
        raise StopIteration

    try:
        seq[0]
    except TypeError:
        raise TypeError("seq must allow random access.")

    first = 0
    last = len(seq)
    seq = seq[:]

    # Yield input sequence as the STL version is often
    # used inside do {} while.
    yield seq
    
    if last == 1:
        raise StopIteration

    while True:
        next = last - 1

        while True:
            # Step 1.
            next1 = next
            next -= 1
            
            if pred(seq[next], seq[next1]) < 0:
                # Step 2.
                mid = last - 1
                while not (pred(seq[next], seq[mid]) < 0):
                    mid -= 1
                seq[next], seq[mid] = seq[mid], seq[next]
                
                # Step 3.
                reverse(seq, next1, last)

                # Change to yield references to get rid of
                # (at worst) |seq|! copy operations.
                yield seq[:]
                break
            if next == first:
                raise StopIteration
    raise StopIteration

Example:

>>> for p in next_permutation([1, 1, 2, 2]):
 print p,

 
[1, 1, 2, 2] [1, 2, 1, 2] [1, 2, 2, 1] [2, 1, 1, 2] [2, 1, 2, 1] [2, 2, 1, 1]

An important question is: how does the code work, and why? Not surprisingly, the man with the answers is Donald Knuth. This algorithm doesn’t appear to have a proper name, so Knuth simply calls it Algorithm L. This algorithm is described in The Art of Computer Programming, Volume 4, Fascicle 2: Generating All Tuples and Permutations. Algorithm L works like this, using a slightly different explanation than Knuth that I hope will be easier to understand.

Algorithm L: Given a sequence of n elements a0, a1, …, an-1 generate all permutations of the sequence in lexicographically correct order.

Step 1 (Step L2 in Knuth): Partition the sequence into two sequences a0, a1, …, aj and aj+1, aj+2, …, an-1 such that we have already generated all permutations beginning with a0, a1, …, aj. This can by done by decreasing j from n-2 until aj < aj+1. If j = 0 we are done.

For example, the input sequence 1, 4, 3, 2 is split into the sequence 1 and the sequence 4, 3, 2. Obviously, there are no more lexicographic permutations beginning with 1 when the second sequence is in decreasing order.

Step 2a (Step L3 in Knuth): In the second sequence aj+1, aj+2, …, an-1 working backwards, find am, the first value larger than aj. We find am by setting m to n-1 and decreasing until aj < am.

Step 2b: Swap aj and am.

For example, our two sequences are 1 and 4, 3, 2. As the second sequence is decreasing because of the first step, am is the smallest element greater than aj that can legitimately follow a0, a1, …, aj-1 in a permutation.

Step 3 (Step L4 in Knuth): Reverse aj+1, aj+2, …, an-1.

Here are some examples of the steps on (1, 2, 3, 4) that should clarify step 2a, 2b and 3:

(1, 2, 3, 4) >> (1, 2, 3), (4) >> (1, 2, 4), (3) >> (1, 2, 4), (3) >> (1, 2, 4, 3)
(1, 2, 4, 3) >> (1, 2), (4, 3) >> (1, 3), (4, 2) >> (1, 3), (2, 4) >> (1, 3, 2, 4)
(1, 3, 2, 4) >> (1, 3, 2), (4) >> (1, 3, 4), (2) >> (1, 3, 4), (2) >> (1, 3, 4, 2)
(1, 3, 4, 2) >> (1, 3), (4, 2) >> (1, 4), (3, 2) >> (1, 4), (2, 3) >> (1, 4, 2, 3)

Here are some examples on the sequence (1, 2, 2, 3):

(1, 2, 2, 3) >> (1, 2, 2), (3) >> (1, 2, 3), (2) >> (1, 2, 3), (2) >> (1, 2, 3, 2)
(1, 2, 3, 2) >> (1, 2), (3, 2) >> (1, 3), (2, 2) >> (1, 3), (2, 2) >> (1, 3, 2, 2)
(1, 3, 2, 2) >> (1), (3, 2, 2) >> (2), (3, 2, 1) >> (2), (1, 2, 3) >> (2, 1, 2, 3)

Algorithm L is a fairly simple algorithm and it's also easy to understand and implement. Permutation generation is sometimes used as an interview question because it's difficult to get right even though the underlying problem is easy to grasp. It can thus be useful to know even for those not interested in combinatorics.