Example: Merge-Sort on Linear Lists

When merge-sort is employed to sort an array of elements, it requires additional memory proportionate to the size of the array in order to move the elements around, which is considered a significant weakness of merge-sort. However, merge-sort does not have this requirement when it operates on a linear list. I present as follows an implementation of merge-sort on linear lists that can readily rival its counterpart in C in terms of both time-efficiency and memory-efficiency. The invariants captured in this implementation and the easiness to capture them should provide strong evidence that attests to ATS being a programming language capable of enforcing great precision in practical programming.

Let us first introduce a type definition and an interface for a function template that compares elements in lists to be sorted:

// typedef cmp (a:t@ype) = (&a, &a) -> int // fun{a:t@ype} compare (x: &a, y: &a, cmp: cmp(a)): int //

The interface for merge-sort is given as follows:

fun{ a:t@ype } mergeSort{n:nat} (xs: list_vt (a, n), cmp: cmp a): list_vt (a, n) // end of [mergeSort]

The first argument of mergeSort is a linear list (to be sorted) and the second one a function for comparing the elements in the linear list. Clearly, the interface of mergeSort indicates that mergeSort consumes its first argument and then returns a linear list that is of the same length as its first argument. As is to become clear, the returned linear list is constructed with the nodes of the consumed one. In particular, the implementation of mergeSort given as follows does not involve any memory allocation or deallocation.

The function template for merging two sorted lists into one is given as follows:

fun{ a:t@ype } merge{m,n:nat} .<m+n>. ( xs: list_vt (a, m), ys: list_vt (a, n) , res: &List_vt(a)? >> list_vt (a, m+n) , cmp: cmp a ) : void = case+ xs of | @list_vt_cons (x, xs1) => ( case+ ys of | @list_vt_cons (y, ys1) => let val sgn = compare<a> (x, y, cmp) in if sgn <= 0 then let // stable sorting val () = res := xs val xs1_ = xs1 prval () = fold@ (ys) val () = merge<a> (xs1_, ys, xs1, cmp) in fold@ (res) end else let val () = res := ys val ys1_ = ys1 prval () = fold@ (xs) val () = merge<a> (xs, ys1_, ys1, cmp) in fold@ (res) end // end of [if] end (* end of [list_vt_cons] *) | ~list_vt_nil () => (fold@ (xs); res := xs) ) // end of [list_vt_cons] | ~list_vt_nil () => (res := ys) // end of [merge]

Unlike the one given in a previous functional implementation, this implementation of merge is tail-recursive and thus is guaranteed to be translated into a loop in C by the ATS compiler. This means that the concern of merge being unable to handle very long lists (e.g., containing 1 million elements) due to potential stack overflow is eliminated.

The next function template is for splitting a given linear lists into two:

fun{ a:t@ype } split{n,k:nat | k <= n} .<n-k>. ( xs: &list_vt (a, n) >> list_vt (a, n-k), nk: int (n-k) ) : list_vt (a, k) = if nk > 0 then let val+@list_vt_cons(_, xs1) = xs val res = split<a> (xs1, nk-1); prval () = fold@(xs) in res end else let val res = xs; val () = xs := list_vt_nil () in res end // end of [if] // end of [split]

Note that the implementation of split is also tail-recursive.

The following function template msort takes a linear list, its length and a comparsion function, and it returns a sorted version of the given linear list:

fun{ a:t@ype } msort{n:nat} .<n>. ( xs: list_vt (a, n), n: int n, cmp: cmp(a) ) : list_vt (a, n) = if n >= 2 then let val n2 = half(n) val n3 = n - n2 var xs = xs // lvalue for [xs] val ys = split<a> (xs(*cbr*), n3) val xs = msort<a> (xs, n3, cmp) val ys = msort<a> (ys, n2, cmp) var res: List_vt (a) // uninitialized val () = merge<a> (xs, ys, res(*cbr*), cmp) in res end else xs // end of [msort]

The second argument of msort is passed so that the length of the list being sorted does not have to be computed directly by traversing the list when each recursive call to msort is made.

Finally, mergeSort can be implemented with a call to msort:

implement{a} mergeSort (xs, cmp) = msort<a> (xs, length (xs), cmp)

By inspecting the implementation of split and merge, we can readiy see that mergeSort performs stable sorting, that is, it preserves the order of equal elements during sorting.

Please find on-line the entirety of the code presented in this section plus some additional code for testing.