Lazy Catalan Numbers in Haskell

How might I go about efficiently generating an infinite list of Catalan numbers? What I have now works reasonably quickly, but it seems to me that there should be a better way.

c 1 = [1]
c n = sum (zipWith (*) xs (reverse xs)) : xs
    where xs = c (n-1)

catalan = map (head . c) [1..]

I made an attempt at using fix instead, but the lambda isn't lazy enough for the computation to terminate:

catalan = fix (\xs -> xs ++ [sum (zipWith (*) xs (reverse xs))])

I realize (++) isn't ideal.

Does such a better way exist? Can that function be made sufficiently lazy? There's an explicit formula for the nth, I know, but I'd rather avoid it.



Solution 1:[1]

The Catalan numbers [wiki] can be defined inductively with:

C_0 = 1 and C_{n+1} = (4n + 2) × C_n / (n + 2).

So we can implement this as:

catalan :: Integral i => [i]
catalan = xs
    where xs = 1 : zipWith f [0..] xs
          f n cn = div ((4*n+2) * cn) (n+2)

For example:

Prelude> take 10 catalan
[1,1,2,5,14,42,132,429,1430,4862]
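
As a sanity check on the recurrence above, the sketch below compares it with the convolution recurrence C_{n+1} = sum of C_i * C_{n-i} for i from 0 to n. The names catalanRatio, convolutionAt and agreesUpTo are hypothetical helpers introduced only for this illustration, and the element type is fixed to Integer on the assumption that arbitrary precision is wanted.

```haskell
-- Ratio recurrence from the answer above, fixed to Integer so that
-- large terms cannot overflow.
catalanRatio :: [Integer]
catalanRatio = 1 : zipWith step [0 ..] catalanRatio
  where
    -- The division is exact because the result is always an integer.
    step n cn = ((4 * n + 2) * cn) `div` (n + 2)

-- Hypothetical helper: C_{n+1} computed from the convolution
-- C_{n+1} = sum [ C_i * C_{n-i} | i <- [0 .. n] ].
convolutionAt :: Int -> Integer
convolutionAt n = sum (zipWith (*) prefix (reverse prefix))
  where
    prefix = take (n + 1) catalanRatio

-- True when both definitions agree on C_1 .. C_k.
agreesUpTo :: Int -> Bool
agreesUpTo k =
  and [ catalanRatio !! (n + 1) == convolutionAt n | n <- [0 .. k - 1] ]
```

In GHCi, agreesUpTo 20 evaluates to True.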

Solution 2:[2]

I'm guessing you're looking for a lazy, infinite, self-referential list of all the Catalan numbers using one of the basic recurrence relations. That's a common thing to do with the Fibonacci numbers after all. But it would help to specify the recurrence relation you mean, if you want answers to your specific question. I'm guessing this is the one you mean:

cat :: Integer -> Integer
cat 1 = 1
cat n = sum [ cat i * cat (n - i) | i <- [1 .. n - 1] ]

If so, the conversion to a self-referential form looks like this:

import Data.List (inits)

cats :: [Integer]
cats = 1 : [ sum (zipWith (*) pre (reverse pre)) | pre <- tail (inits cats) ]

This is quite a lot more complex than the Fibonacci examples, because the recurrence refers to all previous entries in the list, not just a fixed small number of the most recent. Using inits from Data.List is the easiest way to get the prefix at each position. I used tail there because its first result is the empty list, and that's not helpful here. The rest is a straightforward rewrite of the recurrence relation that I don't have much to say about. Except...
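
To make the role of inits concrete, here is a small sketch that lists the non-empty prefixes of a finite list together with the sum each one contributes. The names nonEmptyPrefixes and prefixSums are hypothetical and exist only for this illustration.

```haskell
import Data.List (inits)

-- Non-empty prefixes of a list, shortest first. drop 1 gives the same
-- result as tail here, since inits always returns at least [[]].
nonEmptyPrefixes :: [a] -> [[a]]
nonEmptyPrefixes = drop 1 . inits

-- Each prefix paired with the dot product of itself and its reverse,
-- which is the next term of the recurrence.
prefixSums :: [Integer] -> [([Integer], Integer)]
prefixSums xs =
  [ (pre, sum (zipWith (*) pre (reverse pre))) | pre <- nonEmptyPrefixes xs ]
```

For example, prefixSums [1,1,2,5] yields [([1],1),([1,1],2),([1,1,2],5),([1,1,2,5],14)], so each prefix produces the term that follows it.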

It's going to perform pretty badly. I mean, it's better than the exponential recursive calls of my cat function, but there's a lot of list manipulation going on that allocates and then throws away a lot of memory cells. That recurrence relation is not a very good fit for the recursive structure of the list data type. You can explore a lot of ways to make it more efficient, but they'll all be pretty bad in the end. For this particular case, a closed-form solution is the way to go if you want performance.
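
The closed form is not spelled out in this answer. A minimal sketch, assuming the standard formula C_n = (2n)! / (n! (n+1)!) and indexing from 0 (so cat n above corresponds to catalanClosed (n - 1)), might look like this. The names catalanClosed and catalansClosed are hypothetical.

```haskell
-- Closed form C_n = (2n)! / (n! * (n+1)!), with the common factor
-- (n+1)! cancelled up front: (2n)! / (n+1)! = product [n+2 .. 2n].
catalanClosed :: Integer -> Integer
catalanClosed n = product [n + 2 .. 2 * n] `div` product [1 .. n]

-- The whole sequence as a lazy infinite list, indexed from 0.
catalansClosed :: [Integer]
catalansClosed = map catalanClosed [0 ..]
```

Each term is computed independently here, so asking for a single large index does not force the earlier ones.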

Solution 3:[3]

Apparently, what you wanted is

> import Data.List (unfoldr)
> cats = 1 : unfoldr (\ fx -> let x = sum $ zipWith (*) fx cats in Just (x, x:fx)) [1]

> take 10 cats
[1,1,2,5,14,42,132,429,1430,4862]

This avoids repeatedly reversing the prefixes (as the linked answer does) by unfolding with a reversed prefix as the state: each step produces the next element and also conses it onto the state.

We don't have to maintain the non-reversed prefix: zipWith stops at the end of the shorter list, so zipping the reversed prefix with the cats list itself uses only as much of cats as the prefix is long.
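
The same definition can be written as a standalone module with the step function named, which makes the state easier to follow. This is only a restatement of the one-liner above. The names step and revPrefix and the helper states are hypothetical, and the Integer type is an assumption.

```haskell
import Data.List (unfoldr)

cats :: [Integer]
cats = 1 : unfoldr step [1]
  where
    -- revPrefix holds the terms produced so far, newest first.
    step revPrefix =
      let next = sum (zipWith (*) revPrefix cats)
      in  Just (next, next : revPrefix)

-- Hypothetical helper for inspection: the first k states seen by step.
states :: Int -> [[Integer]]
states k = take k (iterate grow [1])
  where
    grow revPrefix = sum (zipWith (*) revPrefix cats) : revPrefix
```

Here states 4 evaluates to [[1],[1,1],[2,1,1],[5,2,1,1]]: each state is the previous one with the newly produced term consed on the front.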

You did say you wanted to avoid the direct formula.

Solution 4:[4]

The Catalan numbers are best understood by their generating function, which satisfies the relation

f(t) = 1 + t f(t)^2

This can be expressed in Haskell as

f :: [Int]
f = 1 : convolve f f

for a suitable definition of convolve. It is helpful to factor out convolve, for many other counting problems take this form. For example, a generalized Catalan number enumerates ternary trees, and its generating function satisfies the relation

g(t) = 1 + t g(t)^3

which can be expressed in Haskell as

g :: [Int]
g = 1 : convolve g (convolve g g)

convolve can be written using Haskell primitives as

convolve :: [Int] -> [Int] -> [Int]
convolve xs = map (sum . zipWith (*) xs) . tail . scanl (flip (:)) []

For these two examples and many other special cases, there are formulas that are quicker to evaluate. convolve is however more general, and cognitively more efficient. In a typical scenario, one has understood a counting problem in terms of a polynomial relation on its generating function, and now wants to compute some numbers in order to look them up in The On-Line Encyclopedia of Integer Sequences. One wants to get in and out, indifferent to complexity. What language will be least fuss?

If one has seen the iconic Haskell definition for the Fibonacci numbers

fibs :: [Int]
fibs = 0 : 1 : zipWith (+) fibs (tail fibs)

then one imagines there must be a similar idiom for products of generating functions. That search is what brought me here.
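
Putting the pieces of this answer together, here is a self-contained sketch. It generalises convolve from Int to any Num type, on the assumption that Integer is preferable because a fixed-width Int silently wraps around once the terms grow large. The names catalans and ternary are hypothetical stand-ins for f and g.

```haskell
-- Cauchy product of two coefficient lists, as in the answer above but
-- polymorphic in the coefficient type. scanl (flip (:)) [] builds the
-- reversed prefixes of the second list; drop 1 skips the empty one
-- (it gives the same result as tail here).
convolve :: Num a => [a] -> [a] -> [a]
convolve xs = map (sum . zipWith (*) xs) . drop 1 . scanl (flip (:)) []

-- f(t) = 1 + t f(t)^2
catalans :: [Integer]
catalans = 1 : convolve catalans catalans

-- g(t) = 1 + t g(t)^3
ternary :: [Integer]
ternary = 1 : convolve ternary (convolve ternary ternary)
```

In GHCi, take 6 ternary gives [1,1,3,12,55,273], and take 10 catalans matches the list shown in the earlier answers.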

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution      Source
Solution 1    Willem Van Onsem
Solution 2    Carl
Solution 3    Will Ness
Solution 4    Syzygies