Writing

Notes on building agents.

Field notes from designing and operating production LLM agent systems — architecture, trade-offs, and the unglamorous parts (evals, failure modes, reliability).

2026-05-27 · 8 min read

Designing a multi-agent code reviewer — and measuring it honestly

Why a panel of specialist agents beats one model spread thin, how routing and structured output hold it together, and the eval mistake that made a dumb heuristic look smarter than a real LLM.
