Linear Lists

A linear list is essentially the same as a singly-linked list depicted by the dataview slist_v. However, memory allocation and deallocation of list-nodes not handled previously are handled this time. The following declaration introduces a linear datatype list_vt, which forms a boxed type (of the sort viewtype) when applied to a type and an integer:

datavtype list_vt (a:t@ype+, int) = | list_vt_nil (a, 0) of () | {n:nat} list_vt_cons (a, n+1) of (a, list_vt (a, n)) // end of [list_vt]

Given a type T and an integer I, list_vt(T, I) is for linear lists of length I in which each element is of the type T.

Assume that a data constructor named foo is associated with a dataviewtype. Then there is a corresponding viewtype construtor of the name foo_unfold that takes n+1 addresses to form a viewtype, where n is the arity of foo. For instance, there is a viewtype constructor list_vt_cons_unfold that takes 3 address L0, L1 and L2 to form a viewtype list_vt_cons_unfold(L0, L1, L2). This viewtype is for a list-node created by a call to list_vt_cons such that the node is located at L0 and the two arguments of list_vt_cons are located at L1 and L2 while the proofs for the at-views associated with L1 and L2 are put in the store for currently available proofs.

The following function template length computes the length of a given linear list:

fn{ a:t@ype } length{n:nat} (xs: !list_vt (a, n)): int n = let fun loop {i,j:nat | i+j==n} .<i>. (xs: !list_vt (a, i), j: int j): int (n) = case+ xs of | list_vt_cons (_, xs1) => loop (xs1, j+1) | list_vt_nil () => j // end of [loop] in loop (xs, 0) end // end of [length]

The interface of length indicates that length<T> returns an integer equal to I when applied to a list of the type list_vt(T, I), where T and I are a type and an integer, respectively. Note that the symbol ! in front of the type of a function argument indicates that the argument is call-by-value and it is preserved after a call to the function. The function loop inside the body of length is tail-recursive. Given a linear list and an integer, loop returns the sum of the length of the list and the integer. In the body of loop, if xs matches the pattern list_vt_cons(_, xs1), then the name xs1 is bound to the tail of xs. Note that xs1 is a value (instead of a variable), and it is not allowed that xs1 be modified into another value (of a different type).

Suppose that we do want to modify the content stored in a list-node. For instance, we may want to double the value of each integer stores in a linear integer list. The following code implements a function named list_vt_2x that does precisely this:

fun list_vt_2x{n:nat} (xs: !list_vt (int, n) >> _): void = ( case+ xs of | @list_vt_cons (x, xs1) => let val () = x := 2 * x val () = list_vt_2x (xs1) prval () = fold@ (xs) in // nothing end // end of [list_vt_cons] | list_vt_nil () => () ) (* end of [list_vt_2x] *)

Given a type T, the notation (!T >> _) is a shorthand for (!T >> T). Note that the special symbol @ in front of the pattern list_vt_cons(x, xs1) means unfolding. If xs matches this pattern, then x and xs1 are bound to the pointers pointing to some memory locations L1 and L2 where the head and tail of xs are stored, respectively, and the type of xs changes into list_vt_cons_unfold(L0, L1, L2) for L0 being the location of the list-node referred to by xs. In the body of the clause guarded by the pattern list_vt_cons(x, xs1), x and xs1 are treated as variables (which are a form of left-value). The special proof function fold@ is called on xs to fold it plus the proofs of at-views attached to L1 and L2 into a linear list.

Let us now see an example where linear list-nodes are explicitly freed:

fun{ a:t@ype } list_vt_free {n:nat} .<n>. (xs: list_vt (a, n)): void = ( case+ xs of | ~list_vt_cons (x, xs1) => list_vt_free (xs1) | ~list_vt_nil ((*void*)) => () ) (* end of [list_vt_free] *)

Given a linear list, the function list_vt_free frees all the nodes in the list. Let us go over the body of list_vt_free carefully. If xs matches the pattern list_vt_cons(x, xs1), then the names x and xs1 are bound to the head and tail of xs, respectively; the special symbol ~ in front of the pattern indicates that the list-node referred to by xs is freed immediately after xs matches the pattern. If xs matches the pattern list_vt_nil(), no bindings are generated; the special symbol ~ in front of the pattern indicates that the list-node referred to by xs is freed; nothing in this case is actually freed at run-time as list_vt_nil is mapped to the null pointer.

It is also possible to use the special function free@ to explicitly free a node (also called a skeleton) left in a linear variable after the variable matches a pattern formed with a constructor associated with some dataviewtype. For instance, the following code gives another implementation of list_vt_free:

fun{ a:t@ype } list_vt_free {n:nat} .<n>. (xs: list_vt (a, n)): void = case+ xs of | @list_vt_cons (x, xs1) => let val xs1_ = xs1 // [xs1_] is the value stored in [xs1] val ((*void*)) = free@{a}{0}(xs) in list_vt_free (xs1_) end // end of [list_vt_cons] | @list_vt_nil () => free@{a} (xs) // end of [list_vt_free]

As it can be a bit tricky to use free@ in practice, I present more details as follows. First, let us note that the constructors list_vt_nil and list_vt_cons associated with list_vt are assigned the following types:

list_vt_nil : // one quantifier
  {a:t@ype} () -> list_vt (a, 0)
list_vt_cons : // two quantifiers
  {a:t@ype}{n:nat} (a, list_vt (a, n)) -> list_vt (a, n+1)

If free@ is applied to a node of the type list_vt_nil(), it needs one static argument, which is a type, to instantiate the quantifier in the type of the constructor list_vt_nil. If free@ is applied to a node of the type list_vt_cons_unfold(L0, L1, L2), then it needs two static arguments, which are a type and an integer, to instantiate the two quantifiers in the type of the constructor list_vt_cons. In the case where the type of xs is list_vt_cons_unfold(L0, L1, L2), typechecking the call free@{a}{0}(xs) implicitly consumes a proof of the at-view a?@L1 and another proof of the at-view list_vt(a, 0)?@L2. As there is no difference between list_vt(T, 0)? and list_vt(T, I)? for any T and I, the static argument 0 is chosen in the code. As a matter of fact, any natural number can be used in place of 0 as the second static argument of free@.

Linear List-Reversing

The following code implements a function reverse that turns a given linear list into its reverse:

fn{ a:t@ype } reverse{n:nat} ( xs: list_vt (a, n) ) : list_vt (a, n) = let fun revapp {i,j:nat | i+j==n} .<i>. ( xs: list_vt (a, i), ys: list_vt (a, j) ) : list_vt (a, n) = case+ xs of | @list_vt_cons (_, xs1) => let val xs1_ = xs1 val () = xs1 := ys prval () = fold@ (xs) in revapp (xs1_, xs) end // end of [list_vt_cons] | ~list_vt_nil ((*void*)) => ys // end of [revapp] in revapp (xs, list_vt_nil) end // end of [reverse]

The type assigned to reverse indicates that the function returns a linear list of the same length as the one it consumes. Note that the inner function revapp is tail-recursive. This implementation of linear list-reversing directly corresponds to the one presented previously that is based the dataview slseg_v (for singly-linked list segments). Comparing the two implementations, we can see that the one based on dataviewtype is significantly simplified at the level of types. For instance, there is no explicit mentioning of pointers in the types assigned to functions reverse and revapp.

Linear List-Appending

The following code implements a function append that concatenates two given linear lists into one:

fn{ a:t@ype } append{m,n:nat} ( xs: list_vt (a, m) , ys: list_vt (a, n) ) : list_vt (a, m+n) = let fun loop {m,n:nat} .<m>. // [loop] is tail-recursive ( xs: &list_vt (a, m) >> list_vt (a, m+n), ys: list_vt (a, n) ) : void = case+ xs of | @list_vt_cons (_, xs1) => let val () = loop (xs1, ys) in fold@ (xs) end // end of [list_vt_cons] | ~list_vt_nil ((*void*)) => xs := ys // end of [loop] var xs: List_vt (a) = xs // creating a left-value for [xs] val () = loop (xs, ys) in xs end // end of [append]

As the call fold@(xs) in the body of the function loop is erased after typechecking, loop is a tail-recursive function. Therefore, append can be called on lists of any length without the concern of possible stack overflow. The type for the first argument of loop begins with the symbol &, which indicates that this argument is call-by-reference. The type of loop simply means that its first argument is changed from a list of length m into a list of length m+n while its second argument is consumed.

This implementation of list append essentially corresponds to the one presented previously that is based on the dataview slseg_v. Comparing these two, we can easily see that the above one is much simpler and cleaner, demonstrating concretely some advantage of dataviewtypes over dataviews.

This is also a good place for me to mention a closely related issue involving (functional) list construction and tail-recursion. Following is a typical implementation of functioal list concatenation:

fun{ a:t@ype } append1{m,n:nat} ( xs: list(a, m), ys: list(a, n) ) : list (a, m+n) = case+ xs of | list_nil() => ys | list_cons(x, xs) => list_cons(x, append1<a>(xs, ys)) // end of [append1]

Clearly, append1 is not tail-recursive, which means that it may cause stack overflow at run-time if its first argument is very long (e.g., containing 1 million elements). There is, however, a direct and type-safe way in ATS to implement functional list concatenation in a tail-recursive manner, thus eliminating the concern of potential stack overflow. For instance, the following implementation of append2 returns the concatenation of two given functional lists while being tail-recursive:

fun{ a:t@ype } append2{m,n:nat} ( xs: list (a, m), ys: list (a, n) ) : list (a, m+n) = let // fun loop {m,n:nat} .<m>. ( xs: list (a, m), ys: list (a, n) , res: &(List a)? >> list (a, m+n) ) : void = ( case+ xs of | list_nil() => (res := ys) // end of [list_nil] | list_cons(x, xs) => let val () = res := list_cons{a}{0}(x, _) val+ list_cons (_, res1) = res val () = loop (xs, ys, res1) prval ((*void*)) = fold@ (res) in // nothing end // end of [list_cons] ) (* end of [loop] *) // var res: List(a) val () = loop (xs, ys, res) // in res end // end of [append2]

During typechecking, the expression list_cons{a}{0}(x, _) is assigned the (linear) type list_cons_unfold(L0, L1, L2) for some addresses L0, L1 and L2 while a proof of the at-view a@L1 and another proof of the at-view list(a, 0)?@L2 are put into the store for currently available proofs. Note that the special symbol _ simply indicates that the tail of the newly constructed list is uninitialized. A partially initialized list of the type list_cons_unfold(L0, L1, L2) is guaranteed to match the pattern list_cons(_, res1), yielding a binding between res1 and the pointer pointing to L2 where the (possibly uninitialized) tail of the list is stored. When fold@ is called on a variable of the type list_cons_unfold(L0, L1, L2), it changes the type of the variable to list(T, N+1) by consuming a proof of the at-view T@L1 and another proof of the at-view list(T, N)@L2, where T and N are a type and an integer, respectively.

Summary

With dataviewtypes, we can largely retain the convenience of pattern matching associated with datatypes while supporting explicit memory management. Compared to dataviews, dataviewtypes are less general. However, if a dataviewtype can be employed to solve a problem, then the solution is often significantly simpler and cleaner than an alternative dataview-based one.