Research

Jan 4, 2026
Paragon Expands: Full Testing Suite & Production Monitoring
Paragon transforms into a complete quality assurance platform with E2E, integration, unit, and performance testing plus production monitoring for uptime, dependencies, and infrastructure.

Dec 9, 2025
Omnigrep: State-of-the-Art Agentic Code Search
Omnigrep achieves a state-of-the-art F0.5 score of 0.475 on CodeSearchEval, outperforming Cognition's SWE-grep by 15% and Claude Code by 33% through multi-turn chain-of-thought reasoning.

Dec 5, 2025
Paragon E2E: Natural Language End-to-End Testing
Paragon now supports end-to-end testing through natural language. Describe what you want to test in plain English, and Paragon writes, runs, and iterates on Playwright tests.

Nov 3, 2025
ReviewBenchLite: A Benchmark for Evaluating Code Review Agents' Capabilities on Production Issues
A benchmark for systematically evaluating proactive code review capabilities. It covers 117 real-world issues from 25 Python repositories across five categories. Results show that specialized agents achieve up to 81.2% accuracy.

Oct 23, 2025
Introducing Paragon: The Next Generation of Autonomous Code Review
A deep dive into Paragon's architecture: how Polarity Heavy, Planner Agent, Worker Fleet, and Sandbox Verifier work together to achieve 94% accuracy and 6x faster execution than competitors.