Software Foundations Volume 3: Verified Functional Algorithms

2026/02/23

The following are my notes for Software Foundations Volume 3: Verified Functional Algorithms.

NOTE: These notes currently cover only the chapters Perm, Sort, SearchTree, and ADT. These are the recommended prerequisites for the Verifiable C volume.

The lia Tactic

When proving properties of sorting and searching algorithms, we constantly need to reason about inequalities on natural numbers. Doing this by hand is surprisingly tedious. Consider proving that k > i given i < j and ~ (k - 3 <= j). A manual proof requires hunting for the right transitivity lemmas, carefully converting between > and <, and dealing with the fact that subtraction on nat is truncated at zero (so k - 3 is not what you’d expect for small k). The proof runs to a dozen steps.

The lia tactic solves all of this automatically. It is a decision procedure for linear arithmetic over nat and Z, handling <, <=, =, >, >=, +, -, ~, and multiplication by constants, as well as some uses of \/, /\, and <->.

Theorem lia_example : forall i j k,
    i < j ->
    ~ (k - 3 <= j) ->
    k > i.
Proof.
  intros. lia.
Qed.

One thing to keep in mind: lia treats opaque function applications as uninterpreted variables. It can prove f x y > 7 * a -> f x y + 3 >= 7 * a the same way it proves u > v -> u + 3 >= v, because it never looks inside f x y. This means lia is only useful once the goal has been reduced to a purely arithmetic shape.

Reflection and bdestruct

Programs that sort or search over lists inevitably contain boolean comparisons like a >? b. When we want to prove properties about such programs, we quickly run into a mismatch: lia understands propositions like a > b, but the program branches on a bool. A plain destruct (a >? b) gives us cases a >? b = true and a >? b = false, not the propositional a > b and ~ a > b that lia can work with.

The standard library’s reflect type bridges this gap. A value of type reflect P b ties a proposition P to a boolean b: it is either ReflectT (carrying a proof of P, and only possible when b = true) or ReflectF (carrying a proof of ~ P, and only possible when b = false).

Inductive reflect (P : Prop) : bool -> Set :=
  | ReflectT :   P -> reflect P true
  | ReflectF : ~ P -> reflect P false.

For each comparison operator, the chapter proves a corresponding reflection lemma:

Lemma ltb_reflect : forall x y, reflect (x < y) (x <? y).
Lemma leb_reflect : forall x y, reflect (x <= y) (x <=? y).
Lemma gtb_reflect : forall x y, reflect (x > y) (x >? y).
Lemma geb_reflect : forall x y, reflect (x >= y) (x >=? y).
Lemma eqb_reflect : forall x y, reflect (x = y) (x =? y).

Destructing on ltb_reflect a 5 instead of (a <? 5) gives us a propositional hypothesis in each branch, which lia can immediately consume. This pattern is packaged into a single tactic called bdestruct:

Example reflect_example: forall a,
    (if a <? 5 then a else 2) < 6.
Proof.
  intros. bdestruct (a <? 5); lia.
Qed.

The first case gives H : a < 5 in the context, so lia proves a < 6. The second case gives H : ~ a < 5, and the goal reduces to 2 < 6, which lia also dispatches immediately. The rule of thumb is: whenever a proof gets stuck on a boolean comparison, use bdestruct followed by lia.

Specifying Correctness

Proving a program correct requires first writing down what “correct” means. This is harder than it sounds. The informal concept lives in your head, and translating it into a formal definition is not mechanical. As Alan Perlis quipped: “One can’t proceed from the informal to the formal by formal means.” Worse, the choice of definition has real consequences for how easy the proofs will be.

Take the property of a list being sorted. There are at least two reasonable formalizations:

(* Inductive: mirrors the recursive structure of lists *)
Inductive sorted : list nat -> Prop :=
| sorted_nil : sorted []
| sorted_1 : forall x, sorted [x]
| sorted_cons : forall x y l,
    x <= y -> sorted (y :: l) -> sorted (x :: y :: l).

(* Index-based: for any two positions i < j, element at i <= element at j *)
Definition sorted' (al : list nat) := forall i j iv jv,
    i < j ->
    nth_error al i = Some iv ->
    nth_error al j = Some jv ->
    iv <= jv.

Both definitions are logically equivalent and the equivalence can be proved (sorted al <-> sorted' al). But equivalence does not mean equal effort to work with. sorted is an inductive proposition, so proofs can do induction on the evidence that a list is sorted, with each constructor corresponding directly to one proof case. sorted' is a universally quantified statement with no inductive structure, so proofs require much more bookkeeping.

This becomes concrete when proving that insertion preserves sortedness. The proof for sorted takes a few lines, while the equivalent proof for sorted' requires auxiliary lemmas about nth_error and significantly more work. Same algorithm, same theorem, different specification.

When you are stuck in a proof, it is worth asking whether the definition, not the proof, is the problem.

Invariant

A binary search tree is not just any tree. It must satisfy an invariant: a property that must hold for every node in the tree, at every point during its lifetime. For BSTs the invariant is: for any node with key k, all keys in the left subtree are strictly less than k, and all keys in the right subtree are strictly greater. The lookup and insert functions silently produce wrong answers when this invariant is violated, with no error.

The invariant is not encoded in the tree type itself. Nothing stops manually constructing a tree that violates it. To make it a first-class object we can reason about, it is defined as a separate inductive proposition:

Inductive BST {V : Type} : tree V -> Prop :=
| BST_E : BST E
| BST_T : forall l x v r,
    ForallT (fun y _ => y < x) l ->
    ForallT (fun y _ => y > x) r ->
    BST l ->
    BST r ->
    BST (T l x v r).

ForallT P t asserts that predicate P holds at every node of t. This pattern, a plain inductive type for the shape paired with a separate proposition for the constraint, is a recurring idiom in verified data structures. We can then prove that empty_tree satisfies BST and that insert preserves it, so any tree built through the public interface is guaranteed to be valid.

Algebraic Specification

One approach to specifying correctness of lookup and insert is algebraic specification: instead of describing what the data structure is internally, we write equations describing how its operations behave relative to each other. For BSTs, three equations are sufficient:

(* Looking up any key in the empty tree returns the default value *)
lookup d k empty_tree = d

(* Looking up the key just inserted returns the inserted value *)
lookup d k (insert k v t) = v

(* Looking up a different key is unaffected by the insertion *)
lookup d k' (insert k v t) = lookup d k' t    (* when k <> k' *)

What is interesting is that the proofs of these three theorems do not depend on whether t satisfies the BST invariant at all. lookup and insert follow the same path through the tree regardless, so even a malformed tree behaves consistently under both operations.

Model-Based Specification

A second approach is model-based specification: relate the data structure to a simpler, already-understood abstract type and show that every operation corresponds to its counterpart on the abstract model. For BSTs the abstract model is the functional partial map from Volume 1. The connection is an abstraction function Abs that converts a tree into its corresponding map:

Definition Abs {V : Type} (t : tree V) : partial_map V :=
  map_of_list (elements t).

Correctness is then expressed as commutativity: it does not matter whether you operate on the concrete tree and then apply Abs, or apply Abs first and then operate on the abstract map. For example:

Abs (insert k v t) = update (Abs t) k v

find d k (Abs t) = lookup d k t

Any property already proved for functional maps transfers immediately to BSTs without reproving it. This is the main advantage over algebraic specification: if the abstract model has a rich theory, you inherit all of it for free.

Model-based specification requires more upfront work: you need to define the abstraction function and prove that every operation commutes with it. Algebraic specification is more direct. But I find model-based specification the cleaner approach. Connecting a concrete implementation to a well-understood abstract model feels like the right way to think about correctness, and the payoff in reusing an existing theory is real.

Modules and Functors

Rocq’s module system separates interfaces from implementations. A Module Type is the interface: it declares operations as Parameters and correctness properties as Axioms. A Module is the implementation: it provides concrete definitions and proves each axiom as a theorem. A parameterized Module that takes another module as input is called a functor:

Module MyTable (VT : ValType) <: Table.
  Definition V := VT.V.
  Definition default := VT.default.
  Definition table := key -> V.
  Definition get (k : key) (t : table) : V := t k.
  Definition set (k : key) (v : V) (t : table) : table :=
    fun k' => if k =? k' then v else t k'.
  (* ... proofs of the axioms ... *)
End MyTable.

The <: Table annotation tells Rocq to check that the module satisfies the Table interface. The same BST implementation can be wrapped in a functor where get and set delegate to lookup and insert, and the proofs of the three axioms reduce to lemmas already proved in SearchTree.

Encapsulation and Sharing Constraints

Using <: keeps the module transparent: everything inside is still visible to clients. A one-character change to : makes the module type opaque, hiding all implementation details. But full opacity hides too much. When a functor is instantiated on a concrete value type, the output type says nothing about V except that it exists, so Rocq rejects a statement like get 0 empty = "" because "" does not have the abstract type V.

The fix is a sharing constraint: you tell Rocq which definitions are to be identified across the module boundary:

Module MyEncapsulatedTable (VT : ValType)
  : (Table with Definition V := VT.V
           with Definition default := VT.default).

Now clients see what V and default concretely are, while the internal representation stays hidden. Proofs about the module must use only what the interface exposes, not the internal constructors.

Subset Types

A representation invariant says that not all values of a type are meaningful. The usual solution threads the invariant as an explicit precondition through every operation signature. Rocq offers a cleaner alternative. The type {x : A | P} contains exactly the values of type A that satisfy P. Constructing a value requires supplying a proof; no value of the type can ever violate the invariant.

Definition even_nat := {x : nat | Nat.Even x}.
Definition table := { t : tree V | BST t }.