What's New in LADR-2026

Summary of changes from LADR-2009-11A (November 2009) to LADR-2026-5A (March 2026).

Development by Jeffrey Machado and Larry Lesyna, 2026.

New Features

TPTP Input and TSTP Output (Prover9)

Prover9 now reads TPTP FOF/CNF (.p) files directly and produces SZS-compliant status output and TSTP-format proofs. No external conversion tools needed.

Auto-detected by .p file extension or -tptp flag.
SZS status: Theorem, Unsatisfiable, GaveUp, Timeout, MemoryOut, Error, User.
TSTP proof format with inference() records.
FOF flattening for GDV proof verifier compatibility.
Include directive support with TPTP path auto-detection.
Magic comments for embedding Prover9 options in .p files.
Proofs verified with both Ivy (ACL2) and GDV.

See TPTP Mode.

TPTP Input and Output (Mace4)

Mace4 now reads TPTP FOF/CNF (.p and .ax) files directly and produces SZS status output and TPTP-format finite interpretations.

Auto-detected by .p or .ax file extension or -tptp flag.
SZS status: Satisfiable, CounterSatisfiable, GaveUp, Timeout, MemoryOut, Error, User.
TPTP finite interpretation format (fi_domain, fi_functors, fi_predicates).
All existing Mace4 command-line options work in TPTP mode.
LADR-mode behavior completely unchanged.

See Mace4 TPTP Mode.

SInE Premise Selection

SInE (Sumo Inference Engine) premise selection filters large axiom sets to relevant subsets before the search begins.

Auto-enabled for TPTP problems with >128 axioms.
Disable with -nosine flag or assign(sine, 0).
Soft filter mode (≤5000 axioms): selected to SOS, unselected to usable. No axioms lost.
Depth-based mode (>5000 axioms): aggressive filtering with two-threshold tolerance for problems with tens of thousands of axioms.
Depth-weighted clause selection (sine_weight): prefer clauses from axioms closely related to the conjecture.
Hypothesis role seeding: TPTP hypothesis formulas seed the BFS.

See SInE Premise Selection.

ML Strategy Portfolio

Machine-learned strategy portfolio with 96 trained strategies (95 specialists + auto_default). Strategies are selected by a decision tree classifier based on syntactic features of the input problem.

27 tunable search parameters per strategy (ordering, weights, SInE, demodulation, resolution modes, penalty functions, inference rules, preprocessing, subsumption).
Decision tree classifies problems into leaf nodes with per-leaf strategy rankings.
Specialist strategies trained on problems auto_default fails on.
Portfolio generated by greedy set cover over a 1000-strategy grid.

See Strategy Scheduling.

Sliding-Window Scheduler

The -cores N flag enables a sliding-window scheduler that runs N concurrent children, each with a different ML strategy.

Sleep/wake scheduling via SIGSTOP/SIGCONT signals.
Two phases: Phase 1 (breadth, try many strategies), Phase 2 (depth, resume most promising children by priority score).
Shared-memory progress hints: children report stage, given clauses, kept clauses, and memory usage.
auto_default always runs in slot 0 and is never suspended.
Two fork modes: fork-before-SInE (>5000 axioms, per-child SInE diversity) and fork-after-clausify (≤5000 axioms, shared clause set).
Memory ejection: per-child budget in competition mode.

See Strategy Scheduling.

Multi-Ordering Trial

Tries 4 different symbol precedence orderings (frequency, reverse-frequency, arity-based, goal-directed), counts orientable equalities for each, and installs the best one. Auto-enabled in TPTP mode.

Checkpoint and Resume

Prover9 and Mace4 can save their search state to disk and resume later.

Manual checkpoints via SIGUSR2 signal.
Periodic auto-checkpoints with configurable interval and rotation.
Full state preservation: clause lists, indexing metadata, selector state, justifications, goal formulas.
Deterministic: post-checkpoint search matches exactly.

See Prover9 Checkpoint and Resume and Mace4 Checkpoint and Resume.

Proof Verification

Three independent verification methods for Prover9 proofs:

Ivy (ACL2-based): expands proofs to primitive steps, verified offline.
GDV: verifies TSTP proofs via external ATP on SystemOnTPTP.
tstp2ladr hints: re-derive external prover proofs (E, Vampire) in Prover9, then verify with Ivy.

See Proof Verification.

CNF Preprocessing Timeout

Wall-clock timeout during CNF conversion prevents Prover9 from spending the entire time limit on preprocessing of pathological formulas. Uses adaptive throttle with negligible overhead.

Predicate Elimination Timeout

5-second wall-clock timeout for predicate elimination. Enabled by the -fastPE flag or -comp mode. Prevents indefinite hangs on problems where the O(N²) subsumption loop in predicate elimination never finishes. Can also be enabled with: set(fast_pred_elim).

Mace4 `-ladr_out` Flag

The -ladr_out (or -ladr-out) flag forces Mace4 to produce LADR-format model output when reading TPTP input. This makes models compatible with interpformat, isofilter, and other LADR model tools.

Mace4 SIGALRM Timeout

Mace4 now enforces the -t time limit during the scan and clausify preprocessing phases using SIGALRM (wall clock). Previously, the timeout only took effect during search, allowing Mace4 to hang indefinitely on large inputs with expensive preprocessing.

Preprocessing Progress Reports

Optional progress reports to stderr during long preprocessing phases. Enabled with assign(report_preprocessing, N) where N is the interval in seconds.

Performance Improvements

Hash-Indexed FPA Tries

FPA (Fingerprint/Path-indexing for Atoms) trie nodes with 4+ children automatically use hash table lookup instead of linear linked list scan. Enabled by default. Runtime tunable: assign(fpa_hash_threshold, N). Benchmarks show 7–13% speedup on search-intensive problems.

Hint Match Statistics

When hints are used, basic match count statistics (matched/unmatched counts, min/mean/median/max of match counts) are printed automatically at the end of search. For detailed re-match delta histograms, enable with: set(hint_match_stats).

Iterative Algorithms

28 recursive functions converted to iterative with heap-allocated stacks, eliminating stack overflow on large inputs:

di_tree.c: feature vector indexing (depth up to 160k).
symbols.c, fpalist.c: symbol table operations (40k–174k symbols).
formula.c: 15 formula-walking functions (large TPTP formulas).
cnf.c: CNF conversion (deep formula nesting).
just.c: proof DAG traversal.
fpa.c: FPA query tree balancing.

Adaptive getrusage Throttle

Eliminated per-generated-clause getrusage() syscall overhead. Previously consumed 17.6% of CPU when max_seconds was disabled. Now uses adaptive throttle: check every N clauses, doubling N each time the check shows no time pressure.

Context Free List Optimization

Private Context free list bypasses palloc's get_cmem() zeroing for recycled Context structs. memset drops from 23.5% to 5.3% of CPU; generation rate +66%.

O(N²) Symbol Operations

Symbol gathering switched from O(N×K) linked lists to O(1) direct-indexed arrays. Symbol sorting switched from O(N²) insertion sort to O(N log N) qsort. Fixes 40+ minute preprocessing stalls with 174k symbols.

O(N²) TPTP Formula List Building

Formula list accumulation in TPTP parsing switched from plist_append (O(N) per call, O(N²) total) to plist_prepend + reverse_plist. HWV088-1 (345,281 formulas) parses in 0.21 seconds instead of 5+ minutes.

Release/Debug Build Separation

safe_malloc/safe_free tracking (8-byte size headers, global counters) gated behind #ifdef DEBUG. Release builds use plain malloc/free wrappers with retry logic. Zero tracking overhead in release.

Bug Fixes

Correctness

short overflow in discrim tree with 40k+ symbols. Changed short symbol to int.
short overflow in flatterm with symnum > 32767. Fixed flat demodulation corruption.
int overflow in interp.c multiplication (domain_size ^ arity).
Use-after-free in clash_recurse: undo_subst freed trail but stack entry still pointed to freed memory.
Back subsumption of limbo clauses: newly-added limbo clauses found as subsumees during back-subsumption of other limbo clauses.
Limbo clauses not literal-indexed on resume: invisible to hyper-resolution after checkpoint/resume.
Goal formulas missing from checkpoint: denial clauses referencing goal IDs caused fatal error on resume.
N-ary OR in distribute(): fixed 42 SWX domain errors.
Latent bug in renumber_variables(): initialized wrong array when max_vars > MAX_VARS.
Option value vs internal state: Prover9 max_megs option did not update palloc's internal limit.
Initialization ordering: SIGUSR1 handler accessed uninitialized globals.

Robustness

MAX_VARS raised from 200 to 500 (was too small for large TPTP).
MAX_QUANT_VARS made heap-allocated (was fixed 2000, too small for HWV domain).
MAX_MEM_LISTS raised from 500 to 2500 (for larger Context structs).
Default max_megs raised from 500 MB to 48 GB.
Mace4 cell table overflow fix for high-arity Skolem functions (e.g., f1/200 from KRS282+1).
macOS unsigned binary SIGKILL: ad-hoc code signing in Makefile.
fatal_error() prints SZS status Error in TPTP mode.

Build System

macOS universal binary support (arm64 + x86_64).
libtool instead of ar for static libraries on macOS.
Ad-hoc code signing (codesign -s -) to prevent macOS SIGKILL.
DEBUG=1 build mode with -DDEBUG flag.
Build-mode sentinel (.build_mode) detects debug/release mode switching and auto-cleans stale objects.
NATIVE=1 builds with -O3 -march=native -flto.
PGO (Profile-Guided Optimization) support: 11–20% CPU speedup.
test4 (TPTP tests) and test5 (SInE tests) added.

See Installation.

TPTP Benchmark Results

Full TPTP v9.2.0 FOF/CNF (13,374 problems, 60-second timeout):

6,176 PASS (46.2%).
3 domains swept 100%: PHI (9), HWC (4), HEN (65).

Prover9 rating 1.00 problems (13 solved):

Problem	Rating	Status	Time (-cores 8)
KLE122+1	1.00	Theorem	26.20s
ROB006-3	1.00	Unsatisfiable	6.32s
RNG027-10	1.00	Unsatisfiable	3.25s
LCL148-1	1.00	Unsatisfiable	7.25s
REL040+3	1.00	Theorem	1.43s
GEO525+1	1.00	Theorem	2.27s
LCL767-1	1.00	Unsatisfiable	0.08s
RNG010-1	1.00	Unsatisfiable	6.74s
RNG027-1	1.00	Unsatisfiable	2.23s
RNG028-1	1.00	Unsatisfiable	1.71s
SWV670-1	1.00	Unsatisfiable	4.82s
SWV671-1	1.00	Unsatisfiable	4.74s
SWV672-1	1.00	Unsatisfiable	4.76s

Times are Prover9 User_CPU seconds with -cores 8 on Apple M4 Max. The first 5 were solved by auto_default; the additional 8 were found by the ML strategy portfolio.

Mace4 rating 1.00 problems (2 solved):

Problem	Rating	Status
GRP735-1	1.00	Satisfiable (domain 15)
LDA040-1	1.00	Satisfiable

Next Section: Installation