A Short Cut to Deforestation
Andrew Gill John Launchbury Simon L Peyton Jones
Department of Computing Science, University of Glasgow G12 8QQ
This paper is to appear in FPCA 1993.
Lists are often used as glue" to connect separate parts of a program together. We propose an automatic technique for improving the efficiency of such programs, by removing many of these intermediate lists, based on a single, simple, local transformation. We have implemented the method in the Glasgow Haskell compiler.
Functional programs are often constructed by combining together smaller programs, using an intermediate list to communicate between the pieces. For example, the function all, which tests whether all the elements of a list xs satisfy a given predicate p may be defined as follows (we use the language Haskell throughout this paper (Hudak et al. )):
all p xs = and (map p xs)
Here, p is applied to all the elements of the list xs, producing an intermediate list of booleans. These booleans are anded" together by the function and, producing a single boolean result. The intermediate list is discarded, and eventually recovered by the garbage collector.
This compositional style of programming is one reason why lists tend to be so pervasive, despite the availability of userdefined types. Functional languages support this tendency by supplying a large library of pre-defined list-manipulating functions, and by supporting special syntax for lists, such as list comprehensions. Then, because so many functions are already available to manipulate lists, it is easy to define new functions which work on lists, and so on.
Unfortunately, all these intermediate lists give rise to an efficiency problem. Each of their cons cells has to be allocated, filled, taken apart, and finally deallocated, all of which consumes resources. There are more efficient versions of all
which do not use intermediate lists. For example, it can be re-written like this:
all' p xs = h xs
where h  = True
h (x:xs) = p x && h xs
Now no intermediate list is used, but this has been achieved at the cost of clarity and conciseness compared with the original definition.
We want to eat our cake and have it too. That is, we would like to write our programs in the style of all, but have the compiler automatically transform this into the more efficient version all'.
One example of just such a transformation is deforestation (Wadler ). Deforestation removes arbitrary intermediate data structures (including lists), but suffers from some major drawbacks (Section 2). As a result, apart from a prototype incorporated into a local version of the Chalmers LML compiler (Davis ), we know of no mature compiler that uses deforestation as part of its regular optimisations.
In this paper we present a cheap and easy way of eliminating many intermediate lists that does not suffer from these drawbacks. Our optimisation scheme is based on a single, simple, local transformation, and is practical for inclusion in a real compiler. It has the following characteristics:
ffl The technique applies to most of the standard list processing functions. Examples are functions that consume lists, such as and and sum, expressions that create lists, such as [x..y], and functions that both consume and create lists, such as map, filter, ++ and the like. In general, the technique handles any compositional list-consuming function, that is, one which can be written using foldr.
ffl The technique extends straightforwardly to improve
the state of the art in compiling list comprehensions
(Section 4). Standard compilation techniques for list
comprehensions build an intermediate list in expressions
[ f x | x <- map g xs, odd x ]
Our technique automatically transforms this expression to a form that uses no intermediate lists. Furthermore, we are able to use the same technology to