I produced a machine-checked proof that secp256k1_scalar_mul computes exactly r = a * b mod N for all representable inputs, where N is the secp256k1 group order. Together with all supporting helpers, the verified code totals roughly 300 lines of C. The proof covers arithmetic correctness, absence of integer overflow in all intermediate computations, and memory safety. Scalar multiplication sits on the critical path of every signature operation.
This is my first result under the OpenSats grant Advancing Formal Verification for Bitcoin Cryptography. The code and proofs are available at https://github.com/remix7531/secp256k1-scalar-fv-test.
Background
Memory-safe languages, testing, and fuzzing catch many bugs but cannot guarantee their absence. Formal verification uses mathematical proofs, checked by a computer, to show that code is correct for all possible inputs, not just the ones you happen to test. This can prove the absence of memory errors, correctness against a specification, and termination. This is especially valuable for cryptographic code, where the input space is so large that testing can only cover a negligible fraction of it. An empirical study of C/C++ cryptographic libraries found that 37.2% of vulnerabilities were memory-safety issues, more than the 27.2% that were cryptographic design flaws (Blessing et al. 2021).
The proof uses Rocq (formerly known as Coq), a proof assistant that checks proofs mechanically, removing the possibility of human error in the reasoning. The Verified Software Toolchain (VST) is the framework that connects the C code to its mathematical specification.
Verifying Modular Scalar Multiplication
The proof code across all Rocq files totals ~6,400 lines, giving a proof-to-code ratio of roughly 21:1, which is typical for this kind of verification. Scaling to larger parts of libsecp256k1 at this ratio would require significant proof engineering effort, though better automation should help bring it down. CompCert is a formally verified C compiler, so when the verified code is compiled with it, the correctness guarantee extends all the way down to assembly.
In libsecp256k1, a secp256k1_scalar is a 256-bit integer represented as four uint64_t limbs. The verified function is modular scalar multiplication over the secp256k1 group order:
void secp256k1_scalar_mul(secp256k1_scalar *r,
const secp256k1_scalar *a,
const secp256k1_scalar *b);
Internally, the function decomposes into two phases:
secp256k1_scalar_mul_512: a schoolbook 4x4-limb multiply producing a full 512-bit product from two 256-bit scalars.secp256k1_scalar_reduce_512: three-stage modular folding (512 to 385 to 258 to 256 bits) using the complementC = 2^256 - N, followed by a final conditional subtraction.
I verified the non-assembly, 64-bit path using the Verified Software Toolchain version 2.16. CompCert does not support __int128, so 128-bit arithmetic is emulated with a struct of two uint64_t limbs, matching the upstream fallback path. The verified code differs from upstream in two ways: assert statements are removed (not present in production builds) and macros are converted to inline functions to allow modular proofs. The algorithmic logic is unchanged.
Writing Specifications
A divide and conquer approach is essential for managing large proofs. The original code uses macros, which expand into a single monolithic function at compile time. By converting them to inline functions, each helper gets its own specification and proof, making the overall verification tractable.
Each helper function needs a precise specification. For example, here is the VST function contract for secp256k1_scalar_reduce (arithmetic operators are simplified for readability; the actual Rocq code uses prefix functions like Z.add):
Definition secp256k1_scalar_reduce_spec :=
DECLARE _secp256k1_scalar_reduce
WITH r_ptr : val, r : UInt256, overflow : Z, sh : share (* 1 *)
PRE [ tptr t_secp256k1_uint256, tuint ] (* 2 *)
PROP (writable_share sh; 0 <= overflow <= 2) (* 3 *)
PARAMS (r_ptr; Vint (Int.repr overflow)) (* 4 *)
SEP (data_at sh t_secp256k1_uint256 (uint256_to_val r) r_ptr) (* 5 *)
POST [ tint ] (* 6 *)
EX r' : UInt256, (* 7 *)
PROP (r' = (r + overflow * (2^256 - N)) mod 2^256) (* 8 *)
RETURN (Vint (Int.repr overflow)) (* 9 *)
SEP (data_at sh t_secp256k1_uint256 (uint256_to_val r') r_ptr). (*10 *)
- (1) binds the logical variables: a pointer
r_ptr, the mathematical valuerit points to, theoverflowcount, and a memory sharesh. - (2-5) are the precondition:
PROP(3) states logical constraints on the inputs.PARAMS(4) binds C function parameters to logical variables.SEP(5) describes the memory layout using separation logic (a formalism for reasoning about which parts of memory a function owns and can access).
- (6-10) are the postcondition:
EX r'(7) introduces the output value.PROP(8) states the mathematical relationship between input and output.RETURN(9) specifies the return value.SEP(10) describes the updated memory.
Getting the postcondition (8) right was the hardest part. I initially wrote it without mod 2^256:
- PROP (r' = r + overflow * (2^256 - N))
+ PROP (r' = (r + overflow * (2^256 - N)) mod 2^256)
The C code discards the final carry, so the result wraps modulo 2^256. Without mod, the spec was unprovable. Note that the code itself was correct, the error was in my specification. The proof assistant caught it because the proof cannot go through when the specification does not match the code’s actual behavior. My workflow was to specify all helper functions and admit their proofs (accept them without proof as temporary axioms), then work on increasingly harder proofs while keeping helpers of similar complexity admitted. After completing the main proof, I came back to the helpers and discovered the mistake. This is problematic in two ways: first, you spend hours trying to prove an unprovable lemma, and second, you have to revise all the lemmas that build on top of it.
I have not yet tested whether turning helper macros into inline functions affects performance. Modern compilers should inline them anyway, so I expect no difference.
Proof Automation
The current proof is 6,400 lines, but much of that is repetitive boilerplate. Better automation could reduce it. Paraphrasing one of my old academic advisors: “Formal verification of cryptography is (almost) a solved problem but still a huge engineering challenge. Research of recent years focuses on proof automation.”
Coming from F* (another proof assistant with built-in SMT solver support), I am used to more proof automation. Using VST in Rocq, I found myself proving trivial bounds over and over again. A recurring pattern in the secp256k1_reduce_512 proof was that lia could not resolve goals involving opaque constants. Adding these constants to the rep_lia hint database solved this:
Lemma N_C_0_eq : N_C_0 = 4624529908474429119. Proof. reflexivity. Qed.
Lemma N_C_1_eq : N_C_1 = 4994812053365940164. Proof. reflexivity. Qed.
Lemma N_C_2_eq : N_C_2 = 1. Proof. reflexivity. Qed.
#[export] Hint Rewrite N_C_0_eq N_C_1_eq N_C_2_eq : rep_lia.
This alone removed dozens of side goals that previously required manual rewrite and pose proof steps like:
(* muladd(&acc, n2, N_C_0) *)
forward_call (v_acc, acc_s1_2a, mkUInt64 n2 Hn2, mkUInt64 N_C_0 N_C_0_range, Tsh).
{ (* carry_s1_1 + l2 + n2 * NC0 < 2 ^ 192 *)
rewrite Hacc_s1_2a.
simpl u64_val.
pose proof (Hprod_NC0 n2 Hn2).
lia. }
I assume there are many more such opportunities. For the specifications I am using my own UInt type definitions, and adding more lemmas and encapsulating them in a dedicated module could help further.
CoqHammer and SMTCoq should be applicable to this kind of arithmetic reasoning, but I have not yet managed to configure them for VST proofs. Handling array access in VST is another area where I do not yet fully understand the underlying memory model, and I expect many of these proof steps can be simplified once I do.
Evaluation of VST
VST is usable for proving libsecp256k1 C code correct. The trusted computing base is small: VST’s soundness proof means that the semantics of Clight and Verified C are not trusted assumptions but part of the proof itself. CompCert is proven correct, so when compiling with CompCert the guarantee extends from C source to assembly. What remains in the trusted base is Rocq itself, OCaml extraction and compilation, and the machine language representation.
That said, there are practical limitations. VST operates on Clight, a subset of C defined by CompCert. Clight does not allow struct assignment. Fields have to be assigned separately. This does not affect the compiled output since the compiler optimizes it, but it does require modifying the source code. More problematic is the lack of struct return values, which secp256k1 uses in several places.
VST requires clightgen, a CompCert tool that parses C source code and extracts the Clight abstract syntax tree used by the proofs. CompCert is not free software. It is open source under the INRIA Non-Commercial License, which prohibits commercial use without a separate license. Most of the clightgen source files appear to be dual licensed under LGPL 2.1+, so it might be possible to create a “clightgen-libre” package, though some dependencies may not be covered.
Vision
If time and funding were unlimited, my vision would be similar to what the authors of Verified correctness and security of OpenSSL HMAC did. A full end-to-end formal verification of secp256k1’s BIP-340 Schnorr implementation, proving that no attacker can forge a signature (existential unforgeability under chosen message attack, EUF-CMA), with the guarantee extending from the mathematical definition all the way down to the compiled binary. This would start with a high level definition in Rocq proven secure with SSProve, followed by a refinement proof showing that the C code satisfies this specification using the Verified Software Toolchain or a similar tool, and finally, verification of constant-timeness of the resulting binary with tools such as BINSEC. This is a multi-year effort for a small team, and very ambitious for a single person. Then again, at the rate AI is improving at formal proofs, maybe an agent will do it before me.
Next Steps
On the framework side, I plan to port the existing scalar multiplication proof to RefinedC. RefinedC builds on Iris and introduces refined ownership types that automate much of the pointer arithmetic reasoning that is currently tedious in VST 2.x. VST 3.0 also builds on Iris and is integrating RefinedC, so the ecosystem is converging. Porting the scalar multiplication proof would give a direct comparison of both frameworks on the same verified code. I also hope to reuse the pure Z arithmetic lemmas, as those are framework-independent.
On the verification side, the next target is Pippenger’s algorithm for multi-scalar multiplication, which is used for batch verification of signatures. Pippenger depends on elliptic curve group operations (point addition, doubling, negation). Fully verifying those is out of scope for now, so I will axiomatize them and verify the algorithm itself assuming correct group operations. Finding the right specification in Rocq is an interesting challenge in itself.
I also plan to evaluate Cryptol/SAW in the coming months. Unlike VST and RefinedC, which are general-purpose C verification frameworks, Cryptol/SAW is specifically designed for cryptographic code verification.