Clover Labs ships an agent-driven coding product where a single bad agent decision can cascade into hours of cleanup. Their engineering team needed an observability layer that watched how the agent made decisions, not just whether the final output passed an eval, and that could turn every detected failure into a guardrail so the same failure could not ship again.
The switch
“Switching to Polarity has been an incredible experience. It is fast, accurate and does more than the competitors. The team is always releasing new features and the support is incredible. I always hear back within the hour.”
— Anton, CTO at Clover Labs
How Clover uses Polarity
Every production trajectory flows through Polarity. The team instruments their coding agent with the Polarity SDK in a handful of lines, and decision-level telemetry streams to a workspace where engineers can replay any run locally.
When a regression slips through, the engineer pulls the offending trajectory with plr replay, fixes the prompt or the tool, then promotes the trajectory into a behavior with --promote-to-behavior. From that point on, CI gates every change against the behavior — the same regression cannot ship twice.
What changed
Across the first month of rollout, the team measured a reduction of 4.4 hours per engineer per week in time spent triaging agent failures. Repeat regressions (failures the team had already fixed once, then watched re-emerge) went to zero.
“Support is incredible” isn’t something we usually quote, but it’s the part Anton brings up first. A response from the Polarity team in under an hour turns a two-day firefight into a same-day fix.