Research

Modus Verify: verifier-guided RL for reasoning systems.

Modus Verify studies how formal verifiers, tests, reward signals, and curriculum design can improve model reasoning, proof generation, code generation, and agent behavior.

2% to 72%: proof success in a controlled Lean-reward training run
Binary reward: formal verifier feedback instead of preference theatre
Reasoning depth: tracking tactics, trajectories, regressions, and breakouts

Current focus

Research surfaces

  • Lean-based reward oracles
  • Code and proof generation
  • GRPO and reinforcement learning loops
  • Tactic diversity and reasoning depth
  • Curriculum design for reasoning tasks
  • Compute-to-intelligence dynamics
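
To make the first and third surfaces concrete, here is a minimal sketch of how a binary verifier reward can feed a GRPO-style update. It is an illustration under stated assumptions, not the project's training code: `toy_verifier` is a hypothetical stand-in for a real Lean check, the candidate strings are made up, and normalising by the population standard deviation is one common choice among several.

```python
from statistics import mean, pstdev
from typing import Callable, List


def binary_rewards(candidates: List[str], verifier: Callable[[str], bool]) -> List[float]:
    """Score each candidate 1.0 if the verifier accepts it, else 0.0."""
    return [1.0 if verifier(c) else 0.0 for c in candidates]


def group_relative_advantages(rewards: List[float], eps: float = 1e-8) -> List[float]:
    """GRPO-style advantage: each reward centred and scaled within its own group.

    Assumption: population standard deviation plus a small eps for stability.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


def toy_verifier(candidate: str) -> bool:
    # Hypothetical placeholder. A real oracle would run the Lean checker
    # on the candidate proof and return whether it compiles without errors.
    return candidate.strip().endswith("rfl")


# One group of sampled attempts at the same (imaginary) theorem.
group = ["by simp; rfl", "by sorry", "by rfl", "by omega"]
rewards = binary_rewards(group, toy_verifier)
advantages = group_relative_advantages(rewards)

# A group where every attempt fails carries no relative signal.
all_failed = group_relative_advantages([0.0, 0.0, 0.0, 0.0])
```

In this sketch, accepted attempts receive positive advantages and rejected ones negative advantages. A group in which every attempt gets the same reward yields all-zero advantages, which is one reason task difficulty and sequencing matter when rewards are binary.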

Questions

What we are trying to understand

  • How do models acquire reusable reasoning infrastructure?
  • When does RL produce brittle memorisation versus real capability?
  • How should theorem and code tasks be sequenced?
  • Can verifier feedback become a general training substrate for agents?