Chapter 8. Functional List-Processing (2)

Sometimes, a list-processing function is partial in the sense that it is not well-defined for all of the lists. For instance, the function list0_head for returning the head element of a given list is defined only if the given list is non-empty:

fun {a:t@ype} list0_head(xs: list0(a)): a = ( case+ xs of | list0_cons(x, _) => x | _(* list0_nil() *) => $raise ListSubscriptExn() )

Note that the code $raise ListSubscriptExn() is for raising an exception (in the case where xs is empty), which is to be explained in details later. Another approach to handling a partial function is to turn it into a total one that returns options. For instance, list0_head_opt is such a total function corresponding to list0_head:

// datatype option0(a:t@ype) = Some0 of (a) | None0 of () // fun {a:t@ype} list0_head_opt (xs: list0(a)): option0(a) = ( case+ xs of list0_cons(x, _) => Some0(x) | _ => None0() )

The function returning the last element of a given list is also partial as it is only defined for a non-empty list. The following function list0_last_opt returns an option to indicate whether the last element of a given list is found:

// fun {a:t@ype} list0_last_opt ( xs: list0(a) ) : option0(a) = let // fun loop (x0: a, xs: list0(a)): a = ( case+ xs of list0_nil() => x0 | list0_cons(x1, xs) => loop(x1, xs) ) // in // case+ xs of | list0_nil() => None0() | list0_cons(x, xs) => Some0(loop(x, xs)) // end // end of [list0_last_opt] //

Note that the inner function loop in the body of list0_last_opt is tail-recursive.

Before moving on, I would like to point out a very common mistake in (functional) list-processing. First and foremost, a list is not meant to be used like an array. The following (partial) function list0_get_at does the so-called list-subscripting:

// extern fun {a:t@ype} list0_get_at (xs: list0(a), n: int): a // implement {a}(*tmp*) list0_get_at (xs, n) = ( case+ xs of | list0_nil() => $raise ListSubscriptExn() | list0_cons(x, xs) => if n <= 0 then x else list0_get_at<a>(xs, n-1) ) //

For instance, given the list (1,10,100) and the index 1, list0_get_at return the element 10. Another common name for list0_get_at is list0_nth. Clearly, the time-complexity of list0_get_at is O(n) (while array-subscripting is O(1)-time). It is almost always a poor programming style to process the elements in a list by calling list0_get_at (as the resulting code is likely to be prohibitively inefficient time-wise). For the purpose of illustration, let us take a look at the following two functions for tallying the integers contained in a given list:

// fun list0_tally1 (xs: list0(int)): int = list0_foldleft<int><int>(xs, 0, lam(res, x) => res + x) // fun list0_tally2 (xs: list0(int)): int = int_foldleft<int> (list0_length(xs), 0, lam(res, i) => res + list0_get_at<int>(xs, i)) //

Given a list xs of length n, the function list0_tally1 is O(n)-time while the function list0_tally2 is O(n2)-time. List-processing should be done in the efficient style of list0_tally1 rather than in the terribly inefficient style of list0_tally2. In general, please try to avoid doing list-subscripting repeatedly if the involved index cannot be bounded by a small constant!

Let us see more functions for performing functional list-processing in the following presentation and an example of functional programming at the end that illustrates some typical use of list-processing functions.

A commonly used (higher-order) function is often referred to as list-map, which takes a list and a function and returns a newly constructed list consisting of all of the elements obtained from applying the function to each element in the given list:

// extern fun {a:t@ype} {b:t@ype} list0_map (xs: list0(a), fopr: cfun(a, b)): list0(b) // implement {a}{b} list0_map ( xs, fopr ) = auxlst(xs) where { // fun auxlst (xs: list0(a)): list0(b) = ( case+ xs of | list0_nil() => list0_nil() | list0_cons(x, xs) => list0_cons(fopr(x), auxlst(xs)) ) // } (* end of [list0_map] *) //

For instance, given the list (1, 2, 3, 4, 5) and the integer square function, list0_map returns the list consisting of 1, 4, 9, 16, and 25.

With list0_map, we can readily build list0_cross as follows for computing the cross product of two given lists:

// extern fun {a,b:t@ype} list0_cross (xs: list0(a), ys: list0(b)): list0($tup(a, b)) // implement {a,b}(*tmp*) list0_cross (xs, ys) = let // typedef ab = $tup(a, b) // in // list0_concat // for concatenating a list of lists ( list0_map<a><list0(ab)> (xs, lam(x) => list0_map<b><ab>(ys, lam(y) => $tup(x, y))) ) (* end of [list0_concat] *) // end // end of [list0_cross] //

For instance, given two lists (0, 1) and (2, 3, 4), list0_cross returns a newly constructed list consisting of the following six pairs: (0, 2), (0, 3), (0, 4), (1, 2), (1, 3), and (1, 4).

A function rather similar to list0_map is list0_foreach, which takes a list and a procedure (i.e., a function returning void) and applies the procedure to each element in the list (so as to create some effects):

// extern fun {a:t@ype} list0_foreach (xs: list0(a), fwork: cfun(a, void)): void // implement {a}(*tmp*) list0_foreach ( xs, fwork ) = loop(xs) where { // fun loop (xs: list0(a)): void = ( case+ xs of | list0_nil() => () | list0_cons(x, xs) => (fwork(x); loop(xs)) ) // } (* end of [list0_foreach] *) //

Another commonly used (higher-order) function is often referred to as list-filter, which takes a list and a predicate (i.e., a function returning a boolean value) and returns a newly constructed list consisting of all of the elements in the given list that satisfy the given predicate:

// extern fun {a:t@ype} list0_filter (xs: list0(a), test: cfun(a, bool)): list0(a) // implement {a}(*tmp*) list0_filter ( xs, test ) = auxlst(xs) where { // fun auxlst (xs: list0(a)): list0(a) = ( case+ xs of | list0_nil() => list0_nil() | list0_cons(x, xs) => if test(x) then list0_cons(x, auxlst(xs)) else auxlst(xs) // end of [if] ) // } (* end of [list0_filter] *) //

For instance, given the list (1, 2, 3, 4, 5) and the predicate for testing whether an integer is even, list0_filter returns the list consisting of 2 and 4.

Given a list xs, the following function list0_remdup removes all of the elements in xs that have already appeared previously:

// extern fun {a:t@ype} list0_remdup (xs: list0(a), eqfn: cfun(a, a, bool)): list0(a) // implement {a}(*tmp*) list0_remdup(xs, eqfn) = ( case+ xs of | list0_nil() => list0_nil() | list0_cons(x0, xs) => list0_cons(x0, list0_remdup<a>(list0_filter<a>(xs, lam(x) => ~eqfn(x0, x)), eqfn)) ) //

The implementation of list0_remdup should clearly remind one of the sieve of Eratosthenes, which is to be given a stream-based implementation later.

The list-processing function list0_map processes every element in its list argument and so does list0_filter. Sometimes, we need a list-processing function that stops immediately after certain condition is met. For instance, we may want to locate the index of the first element in a given list that satisfies some test, which can done by calling the following function list0_find_index:

// extern fun {a:t@ype} list0_find_index (xs: list0(a), test: cfun(a, bool)): int // implement {a}(*tmp*) list0_find_index (xs, test) = let // fun loop (xs: list0(a), i: int): int = ( case+ xs of | list0_nil() => ~1 | list0_cons(x, xs) => if test(x) then i else loop(xs, i+1) ) // in loop(xs, 0) end // end of [list0_find_index] //

For instance, given the list (1, 2, 3) and the predicate for testing whether an integer is even, list0_find_index returns 1 (which is the index for the element 2 in the given list). Note that ~1 (negative 1) is returned if no element satisfying the given predicate is found.

Given a list0-value xs and a predicate test, list0_exist returns true if and only if there exists one element in xs satisfing test, and list0_forall returns true if and only if every element in xs satisfies test. Both of these two functions can be readily implemented based on a direct call to list0_find_index:

// extern fun {a:t@ype} list0_exists (xs: list0(a), test: cfun(a, bool)): bool extern fun {a:t@ype} list0_forall (xs: list0(a), test: cfun(a, bool)): bool // implement {a}(*tmp*) list0_exists(xs, test) = list0_find_index<a>(xs, test) >= 0 implement {a}(*tmp*) list0_forall(xs, test) = list0_find_index<a>(xs, lam(x) => not(test(x))) < 0 //

Given a list (x0, ..., xn-1) of length n, its indexed version refers to the list of pairs in which the elements are of the form (i, xi) for i ranging from 0 to n-1. For a function that processes a given list, there is often a meaningful variant of the function that processes the indexed version of the list. For instance, the following function list0_imap is such a variant of list0_map:

// extern fun {a:t@ype} {b:t@ype} list0_imap (xs: list0(a), fopr: cfun(int, a, b)): list0(b) // implement {a}{b} list0_imap ( xs, fopr ) = auxlst(0, xs) where { // fun auxlst (i: int, xs: list0(a)): list0(b) = ( case+ xs of | list0_nil() => list0_nil() | list0_cons(x, xs) => list0_cons(fopr(i, x), auxlst(i+1, xs)) ) // } (* end of [list0_imap] *) //

Let us use list0_iexists and list0_iforall to refer to the variants of list0_exists and list0_forall, respectively, for processing indexed lists. The code implementing list0_iexists and list0_iforall is omitted for brevity.

I have so far presented a variety of functions for processing lists in a functional style. I would like to conclude the chapter with an example of functional programming that can concretely demonstrate some typical use of list-processing functions in practice.

The famous 8-queen puzzle asks the player to find ways to put eight queen pieces on a chess board such that no queen piece can attack any other ones. In other words, no two queen pieces can be put on the same row, the same column, or the same diagnal. This puzzle can be readily solved with a tree-based search. Let us introduce an abstract type as follows:

abstype node = ptr

where ptr indicates that node is boxed. Intuitively, a node represents a partial solution where there are n queen pieces (for some n less than or equal to 8) on the first n rows of the chess board such that no one piece can attack any other pieces. Given a node, another node extending it with one more queen piece is considered its child. The following declared function node_get_children is supposed to be called to obtain all of the child nodes of a given node:

extern fun node_get_children(node): list0(node) overload .children with node_get_children

With node_get_children, we can readily implement node_dfsenum as follows for enumerating in the depth-first manner all of the nodes contained in the tree rooted at a given node:

// extern fun node_dfsenum(node): list0(node) // implement node_dfsenum (nx0) = list0_cons ( nx0 , list0_concat<node> ( list0_map<node><list0(node)>(nx0.children(), lam(nx) => node_dfsenum(nx)) ) (* list0_concat *) ) (* node_dfsenum *) //

In order to find all of the solutions to the 8-queen puzzle, we just need to keep all of the nodes of length 8 that are contained in the tree rooted at the initial node (representing the empty partial soloution):

// extern fun QueenSolve(): list0(node) // #define N 8 // implement QueenSolve() = list0_filter<node>(node_dfsenum(node_init()), lam(nx) => node_length(nx) = N) //

where node_init returns the initial node and node_length returns the length of a node (that is, the number queen pieces in the partial solution represented by the node). By treating node as list0(int), we can implement node_init and node_length as follows:

// assume node = list0(int) // implement node_init() = list0_nil() // implement node_length(nx) = list0_length(nx) //

A node is represented as a list of integers; the length of the list refers to the number of queen pieces on the chess board; the integers (between 0 and N-1) in the list refer to the reversely listed column positions of the queen pieces. The function node_get_children is implemented as follows:

// fun test_safety ( xs: list0(int) ) : bool = let // val- list0_cons(x0, xs) = xs // in // list0_iforall<int> // abs: absolute value (xs, lam(i, x) => (x0 != x && abs(x0-x) != (i+1))) // end // end of [test_safety] // implement node_get_children (nx) = list0_filter<node> (int_list0_map<node> (N, lam(x) => list0_cons(x, nx)), lam(nx) => test_safety(nx) ) //

The function test_safety checks whether a column position is safe for the next queen piece. Note that applying int_list0_map to an integer n is like applying list0_map to the list consisting of all of the integers between 0 and n-1, inclusive.

The presented code for solving 8-queen puzzle is of the kind of high-level functional programming style I intend to advocate throughout the book. It should soon be clear that we can reap even more benefits from programming in this way when (linear) lazy streams are used in place of functional lists.

Please find on-line the entirety of the code used in this chapter.