Software Foundations Volume 2: Programming Language Foundations

2026/02/16

The following are my notes for Software Foundations Volume 2: Programming Language Foundations.

NOTE: These notes currently only contain the first three chapters (Equiv, Hoare, Hoare2).

Equivalence of Programs

What does it mean for two programs to be equivalent? Two programs are behaviorally equivalent if, from any initial state, they either both diverge or both terminate in the same final state. What matters is behavior, not syntax. In Rocq, we write this as follows: for every st and st', c1 terminates from st in st' if and only if c2 also terminates from st in st'. The Equiv chapter proves that this definition is a genuine equivalence relation: it is reflexive, symmetric, and transitive.

Definition cequiv (c1 c2 : com) : Prop :=
  forall (st st' : state),
    (st =[ c1 ]=> st') <-> (st =[ c2 ]=> st').

To show that two programs are not equivalent, you typically give a counterexample: exhibit an initial state from which one program can terminate in a final state that the other cannot reach (or from which one terminates and the other diverges). In practice this is often quite easy.
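
As a concrete sketch (my own example and proof script, not the chapter's), here is one such counterexample; it uses ceval_deterministic from Imp and t_update_eq from Maps:

Theorem assign_one_neq_two :
  ~ cequiv <{ X := 1 }> <{ X := 2 }>.
Proof.
  unfold cequiv. intros H.
  destruct (H empty_st (X !-> 1)) as [Hfwd _].
  (* Running X := 1 in the empty state ends with X mapped to 1, so by the
     assumed equivalence X := 2 must be able to reach that state too. *)
  assert (H1 : empty_st =[ X := 2 ]=> (X !-> 1)).
  { apply Hfwd. apply E_Asgn. reflexivity. }
  (* But X := 2 also reaches the state mapping X to 2, and evaluation is
     deterministic, so the two final states would have to coincide. *)
  assert (H2 : empty_st =[ X := 2 ]=> (X !-> 2)).
  { apply E_Asgn. reflexivity. }
  assert (Heq : (X !-> 1) = (X !-> 2)).
  { exact (ceval_deterministic _ _ _ _ H1 H2). }
  (* Looking both states up at X yields 1 = 2, a contradiction. *)
  assert (Hx : (X !-> 1) X = (X !-> 2) X).
  { rewrite Heq. reflexivity. }
  rewrite t_update_eq in Hx. rewrite t_update_eq in Hx.
  discriminate Hx.
Qed.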

Behavioral equivalence also gives us a clean spec for compiler optimizations: a transformation is sound if it preserves behavior. For arithmetic expressions, equivalence means they evaluate to the same number in every program state. (In Imp, aexp is the syntax of arithmetic expressions: numbers, variables, and +/-/*.)

Definition aequiv (a1 a2 : aexp) : Prop :=
  forall st : state, aeval st a1 = aeval st a2.

(* Eliminate additions with zero: 0 + a2  ~~>  a2 *)
Fixpoint optimize_0plus (a : aexp) : aexp :=
  match a with
  | ANum n => ANum n
  | AId x => AId x
  | APlus (ANum 0) a2 => optimize_0plus a2
  | APlus a1 a2 => APlus (optimize_0plus a1) (optimize_0plus a2)
  | AMinus a1 a2 => AMinus (optimize_0plus a1) (optimize_0plus a2)
  | AMult a1 a2 => AMult (optimize_0plus a1) (optimize_0plus a2)
  end.

The soundness statement is: for all a, aequiv a (optimize_0plus a).
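
Stated and proved in Rocq (the proof script below is my own sketch; it assumes the definitions above):

Theorem optimize_0plus_sound : forall a,
  aequiv a (optimize_0plus a).
Proof.
  unfold aequiv. intros a st.
  induction a; simpl; try reflexivity.
  - (* APlus: also case on whether the left operand is the literal 0 *)
    destruct a1 as [[| n] | x | b1 b2 | b1 b2 | b1 b2]; simpl in *;
      rewrite IHa2; try rewrite IHa1; reflexivity.
  - (* AMinus *) rewrite IHa1, IHa2. reflexivity.
  - (* AMult  *) rewrite IHa1, IHa2. reflexivity.
Qed.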

Congruence

Less obviously, behavioral equivalence is also a congruence relation: the equivalence of two subprograms implies the equivalence of the larger programs in which they are embedded.

Main congruence rules (as proved in the chapter), where p ~ q means p and q are equivalent (aequiv, bequiv, or cequiv, as appropriate):

  If a ~ a', then (x := a) ~ (x := a').
  If c1 ~ c1' and c2 ~ c2', then (c1 ; c2) ~ (c1' ; c2').
  If b ~ b', c1 ~ c1', and c2 ~ c2', then (if b then c1 else c2 end) ~ (if b' then c1' else c2' end).
  If b ~ b' and c ~ c', then (while b do c end) ~ (while b' do c' end).
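
As a small illustration (my own example, assuming the chapter's lemma names CSeq_congruence, CAsgn_congruence, and refl_cequiv; older releases call the second one CAss_congruence), congruence lets us rewrite just the assigned expression inside a larger program:

Example seq_asgn_congruence_example :
  cequiv
    <{ X := 0 + Y; skip }>
    <{ X := Y;     skip }>.
Proof.
  apply CSeq_congruence.
  - (* the assignments are equivalent because the assigned aexps are *)
    apply CAsgn_congruence.
    unfold aequiv. intros st. reflexivity.
  - (* skip is trivially equivalent to itself *)
    apply refl_cequiv.
Qed.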

Floyd–Hoare Logic

Floyd–Hoare logic (usually just Hoare logic) is a way to write specifications for imperative programs and then prove, in a structured way, that a program satisfies its spec.

The specs are assertions: logical predicates over program states.

Definition Assertion := state -> Prop.

Together with a command c, a pair of assertions P and Q forms a Hoare triple {{P}} c {{Q}}, meaning: if P holds of the state before running c, and c terminates, then Q holds of the state afterward.

Formally (as in the chapter):

Definition valid_hoare_triple (P : Assertion) (c : com) (Q : Assertion) : Prop :=
  forall st st', st =[ c ]=> st' -> P st -> Q st'.

Notation "{{ P }} c {{ Q }}" :=
  (valid_hoare_triple P c Q).

Example (swap X and Y using a temporary variable):

Definition T : string := "tmp".

Definition swap_XY : com :=
  <{ T := X;
     X := Y;
     Y := T }>.

Theorem swap_preserves_le :
  {{ X <= Y }}
    swap_XY
  {{ Y <= X }}.
Proof.
  unfold swap_XY.
  eapply hoare_seq.
  - eapply hoare_seq; apply hoare_asgn.
  - eapply hoare_consequence_pre.
    + apply hoare_asgn.
    + assertion_auto.
Qed.

Proof idea: apply the sequencing rule twice to break the program into three assignments. Each assignment is handled by hoare_asgn (which computes the weakest precondition by substitution). The last step uses hoare_consequence_pre plus assertion_auto to discharge the resulting arithmetic facts.
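
For reference, here are the two rules the proof leans on. The statements are the chapter's (Hoare.v); the short proof scripts are my own reconstruction along the same lines and assume the chapter's notations are in scope:

Theorem hoare_seq : forall P Q R c1 c2,
  {{Q}} c2 {{R}} ->
  {{P}} c1 {{Q}} ->
  {{P}} c1; c2 {{R}}.
Proof.
  unfold valid_hoare_triple.
  intros P Q R c1 c2 H2 H1 st st' Heval Hpre.
  (* A run of c1; c2 factors through an intermediate state;
     eauto then chains the two premises together. *)
  inversion Heval; subst.
  eauto.
Qed.

Theorem hoare_asgn : forall Q X a,
  {{Q [X |-> a]}} X := a {{Q}}.
Proof.
  unfold valid_hoare_triple.
  intros Q x a st st' HE HQ.
  (* The only way X := a can run is by updating X with the value of a, and
     Q [X |-> a] says precisely that Q holds of that updated state. *)
  inversion HE; subst.
  exact HQ.
Qed.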

Rules of Consequence

Assertions are ordered by implication: one assertion is stronger than another when it implies it (it holds in fewer states). Hoare logic lets us use this order to adjust specs without breaking validity: we may always strengthen a precondition or weaken a postcondition.

This is important when combining proofs for parts of a command: the proof rules for sequences / conditionals often produce “almost right” pre/postconditions, and the consequence rules let you massage them into the shapes needed to compose everything.
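
For reference, the two basic consequence rules are stated as follows (statements as in Hoare.v, where P ->> Q is assertion implication: forall st, P st -> Q st; the proof scripts are my own reconstruction):

Theorem hoare_consequence_pre : forall (P P' Q : Assertion) c,
  {{P'}} c {{Q}} ->
  P ->> P' ->
  {{P}} c {{Q}}.
Proof.
  unfold valid_hoare_triple, "->>".
  intros P P' Q c Hhoare Himp st st' Heval Hpre.
  (* Strengthen the precondition: push the initial state through Himp. *)
  apply (Hhoare st st' Heval). apply Himp. apply Hpre.
Qed.

Theorem hoare_consequence_post : forall (P Q Q' : Assertion) c,
  {{P}} c {{Q'}} ->
  Q' ->> Q ->
  {{P}} c {{Q}}.
Proof.
  unfold valid_hoare_triple, "->>".
  intros P Q Q' c Hhoare Himp st st' Heval Hpre.
  (* Weaken the postcondition: push the final state through Himp. *)
  apply Himp. apply (Hhoare st st' Heval). apply Hpre.
Qed.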

Structured Proofs

The beauty of Hoare logic is that proofs are syntax-directed: the proof follows the shape of the program.

You can write this proof as an annotated program by placing assertions before and after each command. This is called a decorated program.

Most assertions come from the assignment rule. You typically work backwards from the postcondition, occasionally adding small implication steps. In Hoare2.v, a verification condition generator checks these steps, and the verify tactic can automatically solve many of the resulting goals.

Definition swap_XY' (m n : nat) : decorated :=
  <{
    {{ X = m /\ Y = n }}
    T := X
    {{ X = m /\ Y = n /\ T = m }};
    X := Y
    {{ X = n /\ Y = n /\ T = m }};
    Y := T
    {{ X = n /\ Y = m /\ T = m }} ->>
    {{ X = n /\ Y = m }}
  }>.

Theorem swap_correct' : forall m n,
  outer_triple_valid (swap_XY' m n).
Proof. verify. Qed.

Using erase_d, which strips the decorations, we can also check that the annotated program really is the original one:

Theorem swap_XY_equal : forall m n,
  erase_d (swap_XY' m n) = swap_XY.
Proof. reflexivity. Qed.

Loops are the main challenge. You must choose a loop invariant, which acts like an induction hypothesis:

Example: the following program computes the integer square root of m. The precondition says that X holds the input m; the program then sets Z to 0 and repeatedly increments it. The loop invariant is X = m /\ Z * Z <= m: the first conjunct records that X still holds the input value, and the second says the current Z is not too large. The guard (Z+1)*(Z+1) <= X checks whether the next square still fits; together with X = m, it guarantees that Z * Z <= m is reestablished after Z is incremented. When the loop terminates, the invariant combined with the negated guard yields the postcondition Z * Z <= m /\ m < (Z+1)*(Z+1), meaning Z is the greatest integer whose square does not exceed m.

Definition sqrt (m: nat) : decorated :=
  <{
    {{ X = m }} ->>
    {{ X = m /\ 0 * 0 <= m }}
      Z := 0
                   {{ X = m /\ Z * Z <= m  }};
      while ((Z+1)*(Z+1) <= X) do
                   {{ X = m /\ Z * Z <= m /\ (Z + 1) * (Z + 1) <= X }} ->>
                   {{ X = m /\ (Z + 1) * (Z + 1) <= X }}
        Z := Z + 1
                   {{ X = m /\ Z * Z <= m }}
      end
    {{ X = m /\ Z * Z <= m /\ ~((Z + 1) * (Z + 1) <= X) }} ->>
    {{ Z * Z <= m /\ m < (Z + 1) * (Z + 1)}}
  }>.
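
Assuming the definition above, the decorated program can then be checked with the same machinery (this mirrors sqrt_dec and its correctness theorem in Hoare2.v; the theorem name here is mine):

Theorem sqrt_correct : forall m,
  outer_triple_valid (sqrt m).
Proof. verify. Qed.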

Weakest Precondition

For any command c and postcondition Q, there exists a weakest precondition wp c Q: the most general assertion that guarantees Q after c terminates. Any other valid precondition P must be stronger (i.e., P implies wp c Q).
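
Hoare2.v makes this precise with is_wp (the definition below is the chapter's):

Definition is_wp P c Q :=
  {{P}} c {{Q}} /\
  forall P', {{P'}} c {{Q}} -> (P' ->> P).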

The weakest precondition can be computed algorithmically by working backward through the program:

  wp skip Q                        = Q
  wp (x := a) Q                    = Q [x |-> a]
  wp (c1 ; c2) Q                   = wp c1 (wp c2 Q)
  wp (if b then c1 else c2 end) Q  = (b -> wp c1 Q) /\ (~b -> wp c2 Q)

For loops, the weakest precondition is more complex and typically requires finding a loop invariant. In practice, we often work with sufficient (but not necessarily weakest) preconditions.

The assignment rule is fundamental: it shows that Hoare logic naturally computes weakest preconditions by substitution when working backward from postconditions.
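
Formally, this is the content of the hoare_asgn_weakest exercise in Hoare2.v. The statement below matches the exercise; the proof script is my own sketch and assumes the chapter's definitions (valid_hoare_triple, assertion substitution, is_wp) are in scope:

Theorem hoare_asgn_weakest : forall Q X a,
  is_wp (Q [X |-> a]) <{ X := a }> Q.
Proof.
  unfold is_wp. intros Q x a. split.
  - (* the substituted assertion is a valid precondition ... *)
    apply hoare_asgn.
  - (* ... and any other valid precondition implies it *)
    intros P' HP' st Hst.
    (* The assignment run from st ends in the updated state, so the
       assumed triple for P' gives Q of exactly that state. *)
    apply (HP' st (x !-> aeval st a ; st)).
    + apply E_Asgn. reflexivity.
    + assumption.
Qed.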