Bindlib
The Bindlib
library provides support for free and bound variables in the OCaml language. The main application is the representation of types with a binding structure (e.g., abstract syntax trees).
The Bindlib
library provides two type constructors for building abstract syntax trees: type 'a var
representing a free variable of type 'a
, and type ('a,'b) binder
representing a binding of a variable of type 'a
in a value of type 'b
. Intuitively, type ('a,'b) binder
can be thought of as 'a -> 'b
. Types 'a mvar
and ('a,'b) mbinder
are also provided for handling arrays of variables more efficiently.
type 'a mvar = 'a var array
Type of an array of variables of type 'a
.
Type of a binder for a variable of type 'a
into an element of type 'b
. In terms of higher order abstract syntax, it can be seen as 'a -> 'b
.
Type of a binder for an array of variables of type 'a
into an element of type 'b
. This representation is more efficient than a nesting of binders of type ('a,'b) binder
, but substitution can only be performed the whole array of variables at once.
As an example, a type representing the terms of the pure λ-calculus can be defined as follows using types 'a var
and ('a,'b) binder
.
type term =
| Var of term var
| Abs of (term, term) binder
| App of term * term
val subst : ( 'a, 'b ) binder -> 'a -> 'b
subst b v
substitutes the variable bound by b
with the value v
. This operation is very efficient.
val msubst : ( 'a, 'b ) mbinder -> 'a array -> 'b
msubst b vs
substitutes the variables bound by b
with the values vs
. This operation is very efficient. Note however that the length of the vs
array should match the arity of b
(as given by mbinder_arity b
). If it is not the case, the exception Invalid_argument "Bad arity in msubst"
is raised.
Coming back to our pure λ-calculus example, call-by-name evaluation can be defined as a simple recursive function using subst
.
let rec eval : term -> term = fun t ->
match t with
| App(f,a) ->
begin
match eval f with
| Abs(b) -> eval (subst b a)
| _ -> t
end
| _ -> t
new_var mkfree name
creates a new variable using a function mkfree
and a name
. The mkfree
function is used to inject variables in the type of the corresponding objects: it is a form of syntactic wrapper that is often defined to be a constructor (see the example below).
new_mvar mkfree names
creates an array of new variables using a function mkfree
(see new_var
) and an array of variable names
.
For our pure λ-calculus example, where variables have type term var
, the mkfree
function would simply be defined as follows.
let mkfree : term var -> term = fun x -> Var(x)
val name_of : 'a var -> string
name_of x
returns the name corresponding to variable x
. Note that this name is generally not safe for printing since names are not updated when a substitution is performed. For instance, variables may appear to have been captured in the text representation of a structure with variable bindings. To circumvent this issue, one must be particularly careful when converting binders into text, and rely on contexts (see type ctxt
below).
val names_of : 'a mvar -> string array
names_of xs
returns names corresponding to variables xs
. As when using name_of
, care should be taken when converting objects with bindings into a text representation.
unbind b
substitutes the binder b
using a fresh variable. The variable and the result of the substitution are returned. Note that the name of the fresh variable is based on that of the binder, but it is not guaranteed to be safe for printing (see new_var
and unbind_in
). However, there is no problem for other operations such as re-establishing the binding. When the fresh variable is created, the mkfree
function that is used is that that was specified when creating the variable that was originally bound by b
, at the time of its creation (see new_var
and bind_var
).
unbind2 f g
is similar to unbind f
, but it substitutes two binders f
and g
at once using the same fresh variable. The name of the variable is based on that of the binder f
. Similarly, the mkfree
syntactic wrapper that is used for the fresh variable is the one that was given for creating the variable that was bound to construct f
(see bind_var
and new_var
for details on this process). In particular, the use of unbind2
may lead to unexpected results if the binders f
and g
were not built using free variables created with the same mkfree
. Moreover, as with unbind
, the name of the fresh variable is not guaranteed to be safe for printing.
eq_binder eq f g
tests the equality between f
and g
. The binders are first substituted with the same fresh variable (using unbind2
), and eq
is called on the resulting values. Note that eq_binder
may not have the expected result if f
and g
were not built by binding variables with an identical mkfree
syntactic wrapper.
unmbind b
substitutes the multiple binder b
with fresh variables. This function is analogous to unbind
for binders. Note that the names used to create the fresh variables are based on those of the multiple binder. The syntactic wrapper (of mkfree
) that is used to build the variables is the one that was given when creating the multiple variables that were bound in b
(see new_mvar
and bind_mvar
). Moreover, note that as for unbind
, the names of the fresh variables may not be safe for printing.
unmbind2 f g
is similar to unmbind f
, but it substitutes two multiple binder f
and g
at once, using the same fresh variables. Note that the two binders must have the same arity. This function may have an unexpected results in some cases, and the fresh variables may have names that are not safe for printing (see unbind2
).
eq_mbinder eq f g
tests the equality of the two multiple binders f
and g
. They are substituted with the same fresh variables (using unmbind2
) and eq
is called on the resulting values. This function may not have the expected result in some cases, for reasons explained in the documentation of eq_binder
. It is safe to use this function on multiple binders with a different arity (they are considered different).
The unbind
function is often useful for term traversals. For instance, a function computing the size of a pure λ-term can be defined as follows.
let rec size : term -> int = fun t ->
match t with
| Var(_) -> 0
| Abs(b) -> let (_,t) = unbind b in
1 + size t
| App(t,u) -> 1 + size t + size u
Note however that it is not a good idea to define a term-printing function using a combination of unbind
and name_of
. Indeed, as explained in the documentation of name_of
, contexts must be used to ensure that displayed variable names are correct.
To obtain fast substitutions, a price must be paid at the construction of terms. Indeed, binders (i.e., element of type ('a,'b) binder
) cannot be defined directly. Instead, they are put together in the type 'a box
. It correspond to a term of type 'a
whose free variables may be bound.
Type of a term of type 'a
under construction. Using this representation, the free variables of the term can be bound easily.
box_var x
injects variable x
into the 'a box
type, so that it can be bound using bind_var
.
val box : 'a -> 'a box
box e
injects the value e
into the 'a box
type, assuming that it is closed. Thus, if e
contains variables, then they will not be considered free. This means that no variable of e
will be available for binding.
apply_box bf ba
applies the boxed function bf
to a boxed argument ba
inside the box
type. This function is used to build new expressions by applying a function with free variables to an argument with free variables (the 'a box
type is an applicative functor whose application operator is apply_box
, and whose unit is box
).
box_apply f ba
applies the function f
to a boxed argument ba
. It is equivalent to apply_box (box f) ba
, but is more efficient.
box_apply2 f ba bb
applies the function f
to two boxed arguments: ba
and bb
. It is equivalent to apply_box (apply_box (box f) ba) bb
but it is more efficient.
bind_var x b
binds the variable x
in b
, producing a boxed binder.
bind_mvar xs b
binds the variables of xs
in b
to get a boxed binder. It is the equivalent of bind_var
to build a multiple binder.
box_binder f b
boxes the binder b
using the boxing function f
. Note that when b
is closed, it is immediately boxed using the box
function. In that case, the function f
is not used at all.
box_mbinder f b
boxes the multiple binder b
using the boxing function f
. Note that if b
is closed then it is immediately boxed (with box
), without relying on f
at all.
As mentioned earlier, terms with bound variables can only be built in the 'a box
type. To ease the construction of terms, it is a good practice to implement “smart constructors” at the 'a box
level. Coming back to our λ-calculus example, we can give the following smart constructors.
let var : term var -> term box =
fun x -> box_var x
let abs_raw : (term, term) binder box -> term box =
fun b -> box_apply (fun b -> Abs(b)) b
let abs : term var -> term box -> term box =
fun x t -> abs_raw (bind_var x t)
let app : term box -> term box -> term box =
fun t u -> box_apply2 (fun t u -> App(t,u)) t u
Additionally, it is a good idea to define a “boxing function”: a function used to turn terms into a boxed terms. It is sometimes necessary to do so, as we will see when we define a printing function.
let rec box_term : term -> term box = fun t ->
match t with
| Var(x) -> var x
| Abs(b) -> abs_raw (box_binder box_term b)
| App(t,u) -> app (box_term t) (box_term u)
val unbox : 'a box -> 'a
unbox e
can be called when the construction of a term is finished (i.e., when the desired variable bindings have been created).
We can then easily define terms of the λ-calculus as follows.
(* λx.x *)
let id : term =
let x = new_var "x" mkfree in
unbox (abs x (var x))
(* λx.λy.x *)
let fst : term =
let x = new_var "x" mkfree in
let y = new_var "y" mkfree in
unbox (abs x (abs y (var x)))
(* λx.(x) x) (boxed) *)
let delta : term box =
let x = new_var "x" mkfree in
abs x (app (var x) (var x))
(* (λx.(x) x) λx.(x) x *)
let omega : term =
unbox (app delta delta)
(* λx.(x) x) *)
let delta : term =
unbox delta
For variable substitution to be as fast as possible, the Bindlib
library does not do any work to maintain variable names at substitution time. This work is instead delayed until it becomes necessary: at the time of turning objects with binders into a textual representation (e.g., for printing the result of a computation). Such operations must hence maintain a context of variable names using the functions of this section.
val empty_ctxt : ctxt
empty_ctxt
denotes the empty context.
reserve_name name ctxt
extends context ctxt
by reserving variable name name
.
new_var_in ctxt mkfree name
is similar to new_var mkfree name
, but the name actually used for the newly created variable is chosen to not collide with the variables of context ctxt
. Said otherwise, argument name
only gives a preferred name: if it is not available then a fresh name is picked by appending a decimal number at the end of name
. Moreover, the obtained variable has a name that is safe for printing (see name_of
), at least as long as binders are not substituted in objects containing it. Finally, the context that is returned is extended to contain the new variable name.
new_mvar_in ctxt mkfree names
is similar to new_mvar mkfree names
, but it handles renaming based on context ctxt
(see new_var_in
).
unbind_in ctxt b
is similar to unbind b
, but it handles the context as explained in the documentation of new_mvar_in
. This function can be used for maintaining correct names in printing functions: it is safe to use the name_of
function on the returned variable (as long as no substitution is performed in the involved objects, see new_var_in
).
unbind2_in ctxt f g
is similar to unbind2 f g
, but handles the context as explained in the documentation of new_mvar_in
and unbind_in
.
unmbind_in ctxt b
is similar to unmbind b
, but it handles the context as is explained in the documentation of new_mvar_in
. As unbind_in
, the unmbind_in
function can be used to implement printing functions.
unmbind2_in ctxt f g
is similar to unmbind2 f g
, but it uses a context similrly to new_mvar_in
or unmbind_in
.
Going back to our λ-calculus example, the unbind_in
function can be used to implement the following function transforming a λ-term into a string
. Here, thanks to the use of unbind_in
, it is safe to rely on name_of
to print variables.
let to_string : ctxt -> term -> string = fun ctxt t ->
match t with
| Var(x) -> name_of x
| Abs(b) -> let (x,t,ctxt) = unbind_in ctxt b in
"λ" ^ name_of x ^ "." ^ to_string ctxt t
| App(t,u) -> "(" ^ to_string ctxt t ^ ") " ^ to_string ctxt u
The above to_string
must not only receive the term to display (i.e., the second argument), but also a context containing all free variables in the term. To avoid maintain such a context we can rely on function free_vars
together with the term boxing function box_term
defined earlier.
let to_string : term -> term = fun t ->
to_string (free_vars (box_term t)) t
In general, it is not difficult to use the box
and apply_box
functions to manipulate any kind of data in the 'a box
type. However, working with these functions alone can be tedious. The following functions can be used to manipulate standard data types in an optimised way.
box_rev_list bs
is similar to box_list bs
, but the produced boxed list is reversed (it is hence more efficient).
box_apply3
is similar to box_apply2
.
box_apply4
is similar to box_apply2
and box_apply3
.
box_pair ba bb
is the same as box_apply2 (fun a b -> (a,b)) ba bb
.
box_triple
is similar to box_pair
, but for triples.
module type Map = sig ... end
Type of a module equipped with a map
function.
Functorial interface used to build lifting functions for any type equipped with a map
function. In other words, this function can be used to allow the permutation of the 'a box
type with another type constructor.
module type Map2 = sig ... end
Type of a module equipped with a "binary" map
function.
Similar to the Lift
functor, but handles "binary" map
functions.
val hash_var : 'a var -> int
hash_var x
computes a hash for variable x
. Note that this function can be used with the Hashtbl
module.
compare_vars x y
safely compares x
and y
. Note that it is unsafe to compare variables using Stdlib.compare
.
eq_vars x y
safely computes the equality of x
and y
. Note that it is unsafe to compare variables with Stdlib.(=)
.
val binder_name : ( 'a, 'b ) binder -> string
binder_name b
returns the name of the variable bound by binder b
. This name is generally not safe for printing, since it is not updated after the binder is created, even when a substitution is performed (see unbind_in
and name_of
to learn how to print terms with binders).
val binder_occur : ( 'a, 'b ) binder -> bool
binder_occur b
returns a boolean indicating if the variable bound by b
occurs (i.e., is used). This is a constant time operation.
val binder_constant : ( 'a, 'b ) binder -> bool
binder_constant b
is the same as not (binder_occur b)
.
val binder_closed : ( 'a, 'b ) binder -> bool
binder_closed b
indicates whether the b
is closed (i.e., does not have any free variables). This is a constant time operation.
val binder_rank : ( 'a, 'b ) binder -> int
binder_rank b
gives the number of free variables contained in b
. This is a constant time operation.
val mbinder_arity : ( 'a, 'b ) mbinder -> int
mbinder_arity b
gives the arity of b
(i.e., the number of variables it is binding). This is a constant time operation.
val mbinder_names : ( 'a, 'b ) mbinder -> string array
mbinder_names b
returns the names of the variables bound by the multiple binder b
as an array. Similarly the result of binder_name
, these names are not generally safe for printing.
val mbinder_occurs : ( 'a, 'b ) mbinder -> bool array
mbinder_occurs b
returns an array of booleans indicating whether each of the variables that are bound occur (i.e., are used). It is a constant time operation.
val mbinder_constant : ( 'a, 'b ) mbinder -> bool
mbinder_constant b
indicates whether the mbinder
b
is constant. This means that none of its variables are used.
val mbinder_closed : ( 'a, 'b ) mbinder -> bool
mbinder_closed b
indicates whether the multiple binder b
is closed. It is a constant time operation.
val mbinder_rank : ( 'a, 'b ) mbinder -> int
mbinder_rank b
gives the number of free variables contained in b
. This is a constant time operation.
val is_closed : 'a box -> bool
is_closed b
checks whether b
is closed (in constant time).
occur x b
indicates whether variable x
occurs in b
. This is done in linear time with respect to the number of free variables in b
.
bind_apply bb barg
is the same as box_apply2 subst bb barg
.
mbind_apply bb bargs
is the same as box_apply2 msubst bb bargs
.
The variable renaming performed by functions like new_var_in
is somewhat arbitrary, and may not fit every application. As a consequence, we provide a functor Ctxt
that can be used to define a new renaming policy based on a custom notion of context and several configuration options.
module type Renaming = sig ... end
Module type giving the specification of a renaming policy, to be used with the Ctxt
functor.
The renaming policy used by the default context-manipulating function like new_var_in
or unbind_in
uses the following configuration: the value of reset_context_for_closed_terms = false
, skip_constant_binders = false
, and constant_binder_name = None
.
The default new_name
function is not exposed in the interface. It splits the name
into a pair (prefix,n)
containing the largest n
and longest prefix
such that name = prefix ^ string_of_int n
. For instance, if we have name = "xy023"
then (prefix,n) = ("xy0",23)
. In the case where no n
exists, we take (prefix,n) = (name,0)
. The prefix is preserved, and only the value of n
is changed for renaming.
A functor that can be used to obtain context-manipulating functions, given the specification of a renaming policy. The defined ctxt
type as well as the obtained functions can then be used as a drop-in replacement for their default counterparts (found at the top level of the Bindlib
module).
val uid_of : 'a var -> int
uid_of x
returns the unique identifier of the given variable.
val uids_of : 'a mvar -> int array
uids_of xs
returns the unique identifiers of the variables of xs
.
copy_var x mkfree name
makes a copy of variable x
, with a potentially different name and mkfree
function. However, the copy is treated exactly as the original in terms of binding and substitution. The main application of this function is for translating abstract syntax trees while preserving binders. In particular, variables at two different types should never live together (this may produce segmentation faults).
reset_counter ()
resets the unique identifier counter on which Bindlib
relies. This function should only be called when previously generated data (e.g., variables) cannot be accessed anymore.
val dummy_box : 'a box
dummy_box
can be used for initialising structures like arrays. Note that if unbox
is called on a data structure containing dummy_box
, then the exception Failure "Invalid use of dummy_box"
is raised.
binder_compose b f
postcomposes the binder b
with the function f
. In the process, the binding structure is not changed. Note that this function is not always safe. Use it with care.
mbinder_compose b f
postcomposes the multiple binder b
with f
. This function is similar to binder_compose
, and it is not always safe.
raw_binder name bind rank mkfree value
builds a binder using the value
function as its definition. The parameter name
correspond to a preferred name of the bound variable, the boolean bind
indicates whether the bound variable occurs, and rank
gives the number of distinct free variables in the produced binder. The mkfree
function injecting variables in the type 'a
of the domain of the binder must also be given. This function must be considered unsafe because it is the responsibility of the user to give the accurate value for bind
and rank
.
val raw_mbinder :
string array ->
bool array ->
int ->
( 'a var -> 'a ) ->
( 'a array -> 'b ) ->
( 'a, 'b ) mbinder
raw_mbinder names binds rank mk_free value
is similar to raw_binder
, but it is applied to a multiple binder. As for raw_binder
, this function has to be considered unsafe because the user must enforce invariants.