Example: Mergesort on Lists

Mergesort is a simple sorting algorithm that is guaranteed to be log-linear. It is stable in the sense that the order of two equal elements always stay the same after sorting. I give as follows a typical functional style of implementation of mergesort on lists.

First, let us introduce abbreviations for the list constructors list0_nil and list0_cons:

#define :: list0_cons // writing [::] for list0_cons #define cons0 list0_cons // writing [cons0] for list0_cons #define nil0 list0_nil // writing [nil0] for list0_nil

Note that the operator :: is already given the infix status. For instance, the list consisting of the first 5 natural numbers can be constructed as follows:

cons0(0, cons0(1, 2 :: 3 :: 4 :: nil0((*void*))))

In practice, there is of course no point in mixing cons0 with ::.

We next implement a function template merge to merge two given ordered lists into a single ordered one:

// typedef lte (a:t@ype) = (a, a) -> bool // fun{ a:t@ype } merge ( xs: list0 a, ys: list0 a, lte: lte a ) : list0 a = ( case+ xs of | cons0 (x, xs1) => ( case+ ys of | cons0 (y, ys1) => if x lte y then cons0{a}(x, merge<a> (xs1, ys, lte)) else cons0{a}(y, merge<a> (xs, ys1, lte)) // end of [if] | nil0 () => xs ) // end of [cons0] | nil0 () => ys ) (* end of [merge] *) //

For instance, suppose that the two given lists are (1, 3, 4, 8) and (2, 5, 6, 7, 9), and the comparison function (the third argument of merge) is the standard less-than-or-equal-to function on integers. Then the list returned by merge is (1, 2, 3, 4, 5, 6, 7, 8, 9). The syntax lte means that the particular occurrence of lte following the backslash symbol () is given the infix status, and thus the expression x lte y means the same as lte(x, y).

The following function template mergesort implements the standard mergesort algorithm:

fun{ a:t@ype } mergesort ( xs: list0 a, lte: lte a ) : list0 a = let // val n = list0_length<a> (xs) // fun msort ( xs: list0 a, n: int, lte: lte a ) : list0 a = if n >= 2 then split (xs, n, lte, n/2, nil0) else xs // and split ( xs: list0 a, n: int, lte: lte a, i: int, xsf: list0 a ) : list0 a = if i > 0 then let val-cons0 (x, xs) = xs in split (xs, n, lte, i-1, cons0{a}(x, xsf)) end else let val xsf = list0_reverse<a> (xsf) // make sorting stable! val xsf = msort (xsf, n/2, lte) and xs = msort (xs, n-n/2, lte) in merge<a> (xsf, xs, lte) end // end of [if] // in msort (xs, n, lte) end // end of [mergesort]

Suppose we want to sort the list (8, 3, 4, 1, 2, 7, 6, 5, 9); we first divide it into two lists: (8, 3, 4, 1) and (2, 7, 6, 5, 9); by performing mergesort on each of them, we obtain two ordered lists: (1, 3, 4, 8) and (2, 5, 6, 7, 9); by merging these two ordered list, we obtain the ordered list (1, 2, 3, 4, 5, 6, 7, 8, 9), which is a permutation of the originally given list (8, 3, 4, 1, 2, 7, 6, 5, 9).

Note that the function template merge is not tail-recursive as the call to merge in its body is not a tail-call. This can be a serious problem in practice: It is almost certain that a stack overflow is to occur if the above implementation of mergesort is employed to sort a list that is very long (e.g., containing 1,000,000 elements or more). I will later give a tail-recursive implementation of the merge function in ATS that makes use of linear types.

Please find the entire code in this section plus some additional code for testing on-line.