Case studies

Proof-of-Work Writeups

Short technical writeups for the Modus product systems: what was hard, what was built, what ran, and where each system points next.

Public product: Modus Verify

Modus Verify: verifier-guided RL for proof generation

Problem
Reasoning models need reward signals that track correctness, not surface fluency.
System
A Lean-backed training loop using verifier feedback as the reward signal for proof generation.
Technical Core
GRPO/RL over theorem tasks with Lean reward oracles, trajectory analysis, tactic monitoring, and curriculum design.
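As an illustration of how a binary verifier reward can feed a GRPO-style update, the sketch below normalizes rewards within one group of sampled proofs for the same theorem. It shows the standard group-relative normalization only, not the Modus Verify implementation; the function name, the `eps` constant, and the example rewards are assumptions made for this sketch.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize rewards within one group of samples for the same task.

    Each reward is assumed to be 1.0 if the verifier accepted the proof
    and 0.0 otherwise. The advantage is the reward's distance from the
    group mean, scaled by the group's standard deviation.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    # eps avoids division by zero when every sample got the same reward.
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical verifier outcomes for four sampled proofs of one theorem.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
```

Accepted proofs get a positive advantage and rejected ones a negative advantage. A group where every sample fails (or every sample succeeds) yields all-zero advantages, so it contributes no gradient signal, which is one reason curriculum design matters when the starting success rate is low.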
Evidence
In a controlled run, the proof success rate rose from 2% to 72% with a binary Lean verification reward.
Next Direction
Generalize verifier-guided learning from proofs into code-generation and agent-task environments.

Public product: Modus Memory

Modus Memory: local workstate timelines for agents

Problem
Agents lose context when work spans files, terminal output, screenshots, documents, and build state.
System
A local workstate layer that indexes artifacts and emits structured events from screens, OCR, terminals, builds, files, and processes.
Technical Core
Append-only event storage, artifact indexes, rollback-safe ledgers, OCR caches, local memory search, and bounded capture.
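A minimal sketch of append-only event storage, assuming a JSONL file as the backing store. The `EventLog` class, its field names (`seq`, `ts`, `source`, `payload`), and the example events are hypothetical and chosen for illustration; the actual Modus Memory storage format is not described here.

```python
import json
import tempfile
import time
from pathlib import Path


class EventLog:
    """Append-only JSONL log: events are added, never rewritten in place."""

    def __init__(self, path):
        self.path = Path(path)
        self.path.touch(exist_ok=True)
        # Resume the sequence counter from whatever is already on disk.
        self._next_seq = sum(1 for _ in self.read())

    def append(self, source, payload):
        event = {
            "seq": self._next_seq,
            "ts": time.time(),
            "source": source,
            "payload": payload,
        }
        # Mode "a" only ever adds bytes to the end of the file.
        with self.path.open("a", encoding="utf-8") as f:
            f.write(json.dumps(event) + "\n")
        self._next_seq += 1
        return event

    def read(self, source=None):
        """Yield events in order, optionally filtered by source."""
        with self.path.open("r", encoding="utf-8") as f:
            for line in f:
                event = json.loads(line)
                if source is None or event["source"] == source:
                    yield event


with tempfile.TemporaryDirectory() as tmp:
    log = EventLog(Path(tmp) / "events.jsonl")
    log.append("terminal", {"cmd": "make test", "exit_code": 1})
    log.append("ocr", {"text": "Build failed"})
    terminal_events = list(log.read(source="terminal"))
    # Reopening the same file continues the sequence instead of restarting it.
    reopened = EventLog(log.path)
    resumed_seq = reopened.append("files", {"path": "notes.md"})["seq"]
```

Because existing lines are never modified, a timeline can be replayed from the start or truncated to an earlier sequence number without corrupting later reads.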
Evidence
The substrate already supports organizer scans, safe moves, OCR, memory search, screen events, and workstate ingestion paths.
Next Direction
Connect workstate timelines directly to agent evaluation, debugging, and task-generation loops.

Public product: Modus Sentinel

Modus Sentinel: watching and governing agent sessions

Problem
Long-running agents need observation, routing, failure capture, and governance around tool use and session state.
System
A harness that records events, routes them through modes, and supports review, coaching, and loop-worker missions.
Technical Core
Project-scoped watch hooks, event routing, mode queues, harness separation, failure logs, and conservative injection rules.
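The combination of project scoping, event routing, mode queues, and failure logs can be sketched as below. This is an assumed shape, not the Sentinel harness itself: the `Event` and `Router` names, the event kinds, and the mode names used in the example are all hypothetical.

```python
from collections import defaultdict, deque
from dataclasses import dataclass


@dataclass
class Event:
    project: str
    kind: str
    detail: str


class Router:
    """Route events from watched projects into per-mode queues."""

    def __init__(self, watched_projects, routes, default_mode="review"):
        self.watched = set(watched_projects)
        self.routes = dict(routes)  # event kind -> mode name
        self.default_mode = default_mode
        self.queues = defaultdict(deque)
        self.failures = []

    def route(self, event):
        # Project-scoped: events from unwatched projects are dropped.
        if event.project not in self.watched:
            return None
        # Failures are logged separately in addition to being queued.
        if event.kind == "failure":
            self.failures.append(event)
        mode = self.routes.get(event.kind, self.default_mode)
        self.queues[mode].append(event)
        return mode


router = Router(
    watched_projects={"proj-a"},
    routes={"tool_call": "review", "failure": "coaching"},
)
router.route(Event("proj-a", "failure", "test suite exited with code 1"))
router.route(Event("proj-b", "tool_call", "ignored: project not watched"))
```

Dropping unwatched projects before any queueing keeps delivery scoped to the project that installed the hook, and an unknown event kind falls back to the default mode rather than being lost.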
Evidence
The harness has been wired into both Claude and Codex, with verified project-scoped delivery and explicit watch semantics.
Next Direction
Productize the capability as agent evaluation infrastructure rather than exposing harness internals.

Product line: Modus Workbench

Pairling: the first Modus Workbench product

Problem
Local agents are powerful but hard to monitor and steer when you are away from the machine running them.
System
A human-facing iPhone and Mac companion runtime for pairing, monitoring, and controlling local agent sessions.
Technical Core
Bonjour pairing, Mac runtime, iOS app, Watch and widget surfaces, live status, and local control flows.
Evidence
The product has an iOS app, a Mac helper, release infrastructure, TestFlight work, and on-device validation history.
Next Direction
Position Pairling as the approachable app inside Modus Workbench, powered by the deeper evaluation and workstate platform.

Public product: Modus Capture

Modus Capture: meeting-to-agent capture

Problem
Important product and research intent is often trapped in meetings, voice notes, and transcripts.
System
Local-first capture that turns audio into transcripts, structured events, reports, and agent handoffs.
Technical Core
On-device recording, Whisper/MLX transcription paths, event storage, bridge receivers, and dispatcher workflows.
Evidence
The existing systems cover iOS capture, Mac bridge flows, local transcription, and high-speed remote transcription paths.
Next Direction
Fold meeting capture into the broader agent workstate and task-generation layer.