Research

Modus Verify: verifier-guided RL for reasoning systems.

Modus Verify studies how formal verifiers, tests, reward signals, and curriculum design can improve model reasoning, proof generation, code generation, and agent behavior.

2% to 72%: proof success in a controlled Lean-reward training run

Binary reward: formal verifier feedback instead of preference theatre

Reasoning depth: tracking tactics, trajectories, regressions, and breakouts

Current focus

Research surfaces

Lean-based reward oracles
Code and proof generation
GRPO and reinforcement learning loops
Tactic diversity and reasoning depth
Curriculum design for reasoning tasks
Compute-to-intelligence dynamics
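
To make the pairing of a binary verifier reward with a GRPO-style update concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not a description of the Modus Verify training loop: the names `binary_reward`, `group_relative_advantages`, and `toy_verifier` are hypothetical, the toy verifier is a stand-in that does not call Lean, and normalising by the population standard deviation is one common choice rather than a confirmed detail.

```python
from statistics import mean, pstdev
from typing import Callable, List


def binary_reward(candidate: str, verifier: Callable[[str], bool]) -> float:
    """Return 1.0 if the verifier accepts the candidate, else 0.0."""
    return 1.0 if verifier(candidate) else 0.0


def group_relative_advantages(rewards: List[float], eps: float = 1e-6) -> List[float]:
    """Normalise each reward against its own sampling group.

    Each advantage is (reward - group mean) / (group std + eps), so a
    sample is scored relative to its siblings, not on an absolute scale.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # assumption: population std over the group
    return [(r - mu) / (sigma + eps) for r in rewards]


def toy_verifier(candidate: str) -> bool:
    """Hypothetical stand-in for a formal check; accepts only the string 'rfl'."""
    return candidate.strip() == "rfl"


if __name__ == "__main__":
    # One prompt, four sampled proof attempts (made-up strings).
    group = ["rfl", "sorry", "simp", "omega"]
    rewards = [binary_reward(c, toy_verifier) for c in group]
    for cand, adv in zip(group, group_relative_advantages(rewards)):
        print(f"{cand:>6}  advantage={adv:+.3f}")
```

In this sketch, a group in which every attempt fails (or every attempt succeeds) yields all-zero advantages, so it contributes no learning signal. That is one reason task difficulty and sequencing matter when the reward is binary.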

Questions

What we are trying to understand

How do models acquire reusable reasoning infrastructure?
When does RL produce brittle memorisation versus real capability?
How should theorem and code tasks be sequenced?
Can verifier feedback become a general training substrate for agents?