Proof Verification

Prover9 proofs can be independently verified using three methods:

  1. Ivy — Converts the proof to a detailed proof object where every inference step is made explicit. Checked by the ACL2-based Ivy proof checker. Works offline.
  2. GDV — Verifies TSTP-format proofs by checking each derived clause against its parents using an external ATP (typically E prover via SystemOnTPTP). Requires internet.
  3. tstp2ladr hints — Converts a proof from an external prover (E, Vampire) into Prover9 hints. Prover9 re-derives the proof, then the re-derived proof is verified by Ivy.

All three methods are independent of Prover9 and provide strong assurance that a proof is correct.

Method 1: Ivy Verification

Ivy is a proof checker developed by William McCune and Olga Shumsky, based on the ACL2 theorem prover. It verifies proofs by checking that every inference step is a valid application of one of a small set of primitive rules. Unlike GDV, Ivy checks syntactic correctness directly, making verification both fast and completely offline.

Workflow

Step 1: Run Prover9 with -ladr_out

prover9 -ladr_out -t 60 -f problem.p > proof.out

The -ladr_out flag produces native Prover9 output (LADR format) with full proof justifications. Without this flag, TPTP input automatically triggers TSTP output, which prooftrans cannot read.

Step 2: Convert to Ivy proof object

prooftrans ivy < proof.out > proof.ivy

This expands every Prover9 inference step into a sequence of primitive Ivy operations. A single hyper-resolution step, for example, becomes a chain of instantiate and resolve steps with explicit substitutions and literal positions.

Step 3: Verify with the Ivy checker

ivy-sources/util/checker proof.ivy

Expected output:

There are 1 proof objects.
The lengths of the proof objects are (44).
The result of the check is (T).
All proofs have been checked and are correct!

A result of (T) means the proof is correct. (NIL) means it failed.

Ivy Proof Object Format

The proof object is an s-expression. Each step has the form:

(id (rule ...) clause NIL)

The primitive rules are:

RuleDescription
inputAn input clause (axiom, assumption, or negated conjecture).
instantiateApply a substitution to a parent clause.
resolveBinary resolution at specified literal positions.
paramodParamodulation: replace a subterm using an equality.
flipFlip an equality (swap left and right sides of =).
propositionalPropositional simplification.

The final step in a refutation derives false (the empty clause).

Getting the Ivy Checker

The Ivy checker is available from:

Complete Example

# Run Prover9 with LADR output
prover9 -ladr_out -t 60 -f PUZ001+1.p > puz001.out

# Convert to Ivy proof object
prooftrans ivy < puz001.out > puz001.ivy

# Verify
ivy-sources/util/checker puz001.ivy
We are checking the proof objects in puz001.ivy.objs.
There are 1 proof objects.
The lengths of the proof objects are (44).
The result of the check is (T).
All proofs have been checked and are correct!

Method 2: GDV Verification

GDV (Global Derivation Verifier) is an independent proof verification tool from the TPTP project. It verifies TSTP-format proofs by checking each derived formula against its parents using calls to external ATP systems (typically E prover) on the SystemOnTPTP service at tptp.org.

Producing the TSTP Proof

There are two ways to get the TSTP proof for GDV:

Option A — Convert existing LADR output (recommended if you already have the proof):

prooftrans tstp < proof.out > proof.p

Option B — Run Prover9 directly (TSTP output is the default for TPTP input):

prover9 -t 60 -f problem.p > proof.p

Running GDV

ulimit -s 32768
GDV -q2 -t 60 proof.p

The ulimit is required — GDV has deep recursion that overflows the default 8 MB stack.

GDV verbosity levels:

FlagOutput
-q0Full output (all details)
-q1Results + ATP verification details
-q2Results only (recommended)
-q3SZS status line only

A successful verification prints:

SUCCESS: Derived formulae are verified
% SZS status Verified

Complete Example

prover9 -t 60 -f PUZ001+1.p > puz001_proof.p

ulimit -s 32768
GDV -q2 -t 60 puz001_proof.p
SUCCESS: Derivation has unique formula names
SUCCESS: All derived formulae have parents and inference information
SUCCESS: Derivation is acyclic
SUCCESS: Derivation looks like a refutation
SUCCESS: Negated conjecture c_26 is a leaf or CTH from a conjecture
SUCCESS: Assumptions are propagated
SUCCESS: Assumptions are discharged
SUCCESS: Leaf axiom(_like) formulae are satisfiable
SUCCESS: c_25 is a thm of c_24
...
SUCCESS: Derived formulae are verified
% SZS status Verified

Method 3: Verifying External Prover Proofs

E prover and Vampire can find proofs that Prover9 cannot, and vice versa. The tstp2ladr hints pipeline lets you verify proofs from any TSTP-capable prover using Ivy, by having Prover9 re-derive the proof with the external proof as guidance.

The idea: extract the clauses from the external proof, feed them to Prover9 as hints (preferred clauses), and let Prover9 find its own proof. Prover9's proof is then verified by Ivy. If Ivy accepts, the result is independently verified regardless of the original prover.

The tstp2ladr Tool

tstp2ladr [options] [-f <file>]

  -ladr      Output LADR-format proof (default)
  -hints     Output Prover9 input with proof clauses as hints
  -p <file>  Use original TPTP problem for sos/goals (recommended)
  -v         Verbose: print reconstruction details to stderr
  -f <file>  Read from file instead of stdin

The -p option is recommended — it uses the original TPTP problem for axioms and goals instead of extracting them from the proof. This avoids failures when the external prover's clausification diverges from Prover9's.

E Prover Workflow

# Step 1: Run E prover, convert proof to hints
eprover --cpu-limit=60 --tstp-format -s --proof-object problem.p \
    | tstp2ladr -hints -p problem.p > hints.in

# Step 2: Re-derive with Prover9
prover9 -ladr_out -t 60 -f problem.p hints.in > proof.out

# Step 3: Convert to Ivy proof object
prooftrans ivy < proof.out > proof.ivy

# Step 4: Verify with Ivy checker
ivy-sources/util/checker proof.ivy

Vampire Workflow

# Step 1: Run Vampire, convert proof to hints
vampire --proof tptp -t 60 problem.p \
    | tstp2ladr -hints -p problem.p > hints.in

# Steps 2-4: same as E prover workflow
prover9 -ladr_out -t 60 -f problem.p hints.in > proof.out
prooftrans ivy < proof.out > proof.ivy
ivy-sources/util/checker proof.ivy

Vampire's AVATAR splitting is handled automatically. AVATAR creates splitting predicates that flow through hints as regular formulas. There is no need to disable AVATAR.

When Re-derivation Fails

If Prover9 times out (cannot re-derive the proof using hints), try:

  1. Increase the time limit (-t 120 or higher).
  2. Use -p if not already using it.
  3. Try the other external prover (Vampire instead of E, or vice versa).
  4. Skip Ivy and use GDV on the external prover's TSTP output directly (GDV does not require re-derivation).

Verification Trust Chains

The trust chain for each method:

MethodTrust Chain
1 (Ivy)Prover9 → prooftrans → ACL2 checker
2 (GDV)Prover9 → GDV → E prover (SystemOnTPTP)
3 (Hints)External ATP → tstp2ladr → Prover9 → prooftrans → ACL2 checker

In Method 3, the external ATP is not trusted — its proof is only used as guidance. The verified proof is Prover9's independent re-derivation, checked by the same Ivy/ACL2 chain as Method 1.

Notes

FOF vs CNF problems
GDV works best with FOF problems. For CNF problems, GDV may fail to verify individual inference steps because the verifying ATP has difficulty reconstructing multi-nucleus hyper-resolution steps. Ivy does not have this limitation — it checks syntactic correctness directly.
The -ladr_out flag enables all prooftrans modes
Use -ladr_out to get LADR output that prooftrans can read. From one LADR output file, you can produce Ivy proof objects (prooftrans ivy), TSTP for GDV (prooftrans tstp), hints (prooftrans hints), and more.
GDV requires internet access
GDV calls remote ATP systems on SystemOnTPTP. It will not work offline.
Ivy works offline
Both prooftrans and the Ivy/ACL2 checker run entirely locally.
Combining methods
For maximum assurance, verify with both Ivy and GDV. Run Prover9 once with -ladr_out, then use prooftrans for both:
prover9 -ladr_out -t 60 -f problem.p > proof.out
prooftrans ivy  < proof.out > proof.ivy    # for Ivy
prooftrans tstp < proof.out > proof.p      # for GDV
There is no need to run Prover9 twice.

References

McCune, W. and Shumsky, O. "IVY: A Preprocessor and Proof Checker for First-Order Logic." In Computer-Aided Reasoning: ACL2 Case Studies, Kluwer Academic Publishers, 2000.

https://www.cs.unm.edu/~mccune/papers/ivy/

Next Section: Auto Modes