Proof Verification
Prover9 proofs can be independently verified using three methods:
- Ivy — Converts the proof to a detailed proof object where every inference step is made explicit. Checked by the ACL2-based Ivy proof checker. Works offline.
- GDV — Verifies TSTP-format proofs by checking each derived clause against its parents using an external ATP (typically E prover via SystemOnTPTP). Requires internet.
- tstp2ladr hints — Converts a proof from an external prover (E, Vampire) into Prover9 hints. Prover9 re-derives the proof, then the re-derived proof is verified by Ivy.
All three methods are independent of Prover9 and provide strong assurance that a proof is correct.
Method 1: Ivy Verification
Ivy is a proof checker developed by William McCune and Olga Shumsky, based on the ACL2 theorem prover. It verifies proofs by checking that every inference step is a valid application of one of a small set of primitive rules. Unlike GDV, Ivy checks syntactic correctness directly, making verification both fast and completely offline.
Workflow
Step 1: Run Prover9 with -ladr_out
prover9 -ladr_out -t 60 -f problem.p > proof.out
The -ladr_out flag produces native Prover9 output (LADR format) with full proof justifications. Without this flag, TPTP input automatically triggers TSTP output, which prooftrans cannot read.
Step 2: Convert to Ivy proof object
prooftrans ivy < proof.out > proof.ivy
This expands every Prover9 inference step into a sequence of primitive Ivy operations. A single hyper-resolution step, for example, becomes a chain of instantiate and resolve steps with explicit substitutions and literal positions.
Step 3: Verify with the Ivy checker
ivy-sources/util/checker proof.ivy
Expected output:
There are 1 proof objects. The lengths of the proof objects are (44). The result of the check is (T). All proofs have been checked and are correct!
A result of (T) means the proof is correct. (NIL) means it failed.
Ivy Proof Object Format
The proof object is an s-expression. Each step has the form:
(id (rule ...) clause NIL)
The primitive rules are:
| Rule | Description |
|---|---|
| input | An input clause (axiom, assumption, or negated conjecture). |
| instantiate | Apply a substitution to a parent clause. |
| resolve | Binary resolution at specified literal positions. |
| paramod | Paramodulation: replace a subterm using an equality. |
| flip | Flip an equality (swap left and right sides of =). |
| propositional | Propositional simplification. |
The final step in a refutation derives false (the empty clause).
Getting the Ivy Checker
The Ivy checker is available from:
- Standalone: https://www.cs.unm.edu/~mccune/papers/ivy/
- ACL2 books: Included in ACL2 under books/workshops/1999/ivy
- Online checker (no install required): https://www.cs.unm.edu/~mccune/ivy_check_prover9/ — upload a Prover9 output file and it runs prooftrans + checker automatically.
Complete Example
# Run Prover9 with LADR output prover9 -ladr_out -t 60 -f PUZ001+1.p > puz001.out # Convert to Ivy proof object prooftrans ivy < puz001.out > puz001.ivy # Verify ivy-sources/util/checker puz001.ivy
We are checking the proof objects in puz001.ivy.objs. There are 1 proof objects. The lengths of the proof objects are (44). The result of the check is (T). All proofs have been checked and are correct!
Method 2: GDV Verification
GDV (Global Derivation Verifier) is an independent proof verification tool from the TPTP project. It verifies TSTP-format proofs by checking each derived formula against its parents using calls to external ATP systems (typically E prover) on the SystemOnTPTP service at tptp.org.
Producing the TSTP Proof
There are two ways to get the TSTP proof for GDV:
Option A — Convert existing LADR output (recommended if you already have the proof):
prooftrans tstp < proof.out > proof.p
Option B — Run Prover9 directly (TSTP output is the default for TPTP input):
prover9 -t 60 -f problem.p > proof.p
Running GDV
ulimit -s 32768 GDV -q2 -t 60 proof.p
The ulimit is required — GDV has deep recursion that overflows the default 8 MB stack.
GDV verbosity levels:
| Flag | Output |
|---|---|
| -q0 | Full output (all details) |
| -q1 | Results + ATP verification details |
| -q2 | Results only (recommended) |
| -q3 | SZS status line only |
A successful verification prints:
SUCCESS: Derived formulae are verified % SZS status Verified
Complete Example
prover9 -t 60 -f PUZ001+1.p > puz001_proof.p ulimit -s 32768 GDV -q2 -t 60 puz001_proof.p
SUCCESS: Derivation has unique formula names SUCCESS: All derived formulae have parents and inference information SUCCESS: Derivation is acyclic SUCCESS: Derivation looks like a refutation SUCCESS: Negated conjecture c_26 is a leaf or CTH from a conjecture SUCCESS: Assumptions are propagated SUCCESS: Assumptions are discharged SUCCESS: Leaf axiom(_like) formulae are satisfiable SUCCESS: c_25 is a thm of c_24 ... SUCCESS: Derived formulae are verified % SZS status Verified
Method 3: Verifying External Prover Proofs
E prover and Vampire can find proofs that Prover9 cannot, and vice versa. The tstp2ladr hints pipeline lets you verify proofs from any TSTP-capable prover using Ivy, by having Prover9 re-derive the proof with the external proof as guidance.
The idea: extract the clauses from the external proof, feed them to Prover9 as hints (preferred clauses), and let Prover9 find its own proof. Prover9's proof is then verified by Ivy. If Ivy accepts, the result is independently verified regardless of the original prover.
The tstp2ladr Tool
tstp2ladr [options] [-f <file>] -ladr Output LADR-format proof (default) -hints Output Prover9 input with proof clauses as hints -p <file> Use original TPTP problem for sos/goals (recommended) -v Verbose: print reconstruction details to stderr -f <file> Read from file instead of stdin
The -p option is recommended — it uses the original TPTP problem for axioms and goals instead of extracting them from the proof. This avoids failures when the external prover's clausification diverges from Prover9's.
E Prover Workflow
# Step 1: Run E prover, convert proof to hints
eprover --cpu-limit=60 --tstp-format -s --proof-object problem.p \
| tstp2ladr -hints -p problem.p > hints.in
# Step 2: Re-derive with Prover9
prover9 -ladr_out -t 60 -f problem.p hints.in > proof.out
# Step 3: Convert to Ivy proof object
prooftrans ivy < proof.out > proof.ivy
# Step 4: Verify with Ivy checker
ivy-sources/util/checker proof.ivy
Vampire Workflow
# Step 1: Run Vampire, convert proof to hints
vampire --proof tptp -t 60 problem.p \
| tstp2ladr -hints -p problem.p > hints.in
# Steps 2-4: same as E prover workflow
prover9 -ladr_out -t 60 -f problem.p hints.in > proof.out
prooftrans ivy < proof.out > proof.ivy
ivy-sources/util/checker proof.ivy
Vampire's AVATAR splitting is handled automatically. AVATAR creates splitting predicates that flow through hints as regular formulas. There is no need to disable AVATAR.
When Re-derivation Fails
If Prover9 times out (cannot re-derive the proof using hints), try:
- Increase the time limit (-t 120 or higher).
- Use -p if not already using it.
- Try the other external prover (Vampire instead of E, or vice versa).
- Skip Ivy and use GDV on the external prover's TSTP output directly (GDV does not require re-derivation).
Verification Trust Chains
The trust chain for each method:
| Method | Trust Chain |
|---|---|
| 1 (Ivy) | Prover9 → prooftrans → ACL2 checker |
| 2 (GDV) | Prover9 → GDV → E prover (SystemOnTPTP) |
| 3 (Hints) | External ATP → tstp2ladr → Prover9 → prooftrans → ACL2 checker |
In Method 3, the external ATP is not trusted — its proof is only used as guidance. The verified proof is Prover9's independent re-derivation, checked by the same Ivy/ACL2 chain as Method 1.
Notes
- FOF vs CNF problems
- GDV works best with FOF problems. For CNF problems, GDV may fail to verify individual inference steps because the verifying ATP has difficulty reconstructing multi-nucleus hyper-resolution steps. Ivy does not have this limitation — it checks syntactic correctness directly.
- The -ladr_out flag enables all prooftrans modes
- Use -ladr_out to get LADR output that prooftrans can read. From one LADR output file, you can produce Ivy proof objects (prooftrans ivy), TSTP for GDV (prooftrans tstp), hints (prooftrans hints), and more.
- GDV requires internet access
- GDV calls remote ATP systems on SystemOnTPTP. It will not work offline.
- Ivy works offline
- Both prooftrans and the Ivy/ACL2 checker run entirely locally.
- Combining methods
- For maximum assurance, verify with both Ivy and GDV. Run Prover9
once with -ladr_out, then use prooftrans for both:
prover9 -ladr_out -t 60 -f problem.p > proof.out prooftrans ivy < proof.out > proof.ivy # for Ivy prooftrans tstp < proof.out > proof.p # for GDV
There is no need to run Prover9 twice.
References
McCune, W. and Shumsky, O. "IVY: A Preprocessor and Proof Checker for First-Order Logic." In Computer-Aided Reasoning: ACL2 Case Studies, Kluwer Academic Publishers, 2000.