Question 1

What is Polarity?

Accepted Answer

Polarity is the most accurate eval infrastructure for AI agents. Its product, Keystone, runs each agent task inside an isolated Docker sandbox preloaded with real backing services (Postgres, Redis, S3, internal APIs), scores each run against behavioral invariants and forbidden rules, measures non-determinism by running replicas of the same task, and ships every failure with a seed reproducer.
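
To make the scoring model in this answer concrete, here is a minimal sketch in Python. It is not Keystone's API: every name in it (`run_agent`, `violations`, `evaluate`, `INVARIANTS`, `FORBIDDEN`) is hypothetical, and the "agent" is a toy that mutates an in-memory dict instead of talking to a sandboxed service. It only illustrates the idea of running seeded replicas, checking each run against invariants and forbidden rules, and keeping the failing seeds as reproducers.

```python
import random
from dataclasses import dataclass


@dataclass
class RunResult:
    seed: int
    actions: list
    final_state: dict


def run_agent(seed: int) -> RunResult:
    """Hypothetical stand-in for one agent run; its behavior depends only on the seed."""
    rng = random.Random(seed)
    state = {"orders": 0}
    actions = []
    for _ in range(3):
        action = rng.choice(["create_order", "read_orders", "drop_table"])
        actions.append(action)
        if action == "create_order":
            state["orders"] += 1
    return RunResult(seed, actions, state)


# Invariants must hold at the end of a run (assumed examples).
INVARIANTS = {"at_least_one_order": lambda r: r.final_state["orders"] >= 1}
# Forbidden rules must never match anything the agent did (assumed examples).
FORBIDDEN = {"no_drop_table": lambda r: "drop_table" in r.actions}


def violations(result: RunResult) -> list:
    """Return the names of all invariants broken and forbidden rules triggered."""
    failed = [name for name, holds in INVARIANTS.items() if not holds(result)]
    failed += [name for name, hit in FORBIDDEN.items() if hit(result)]
    return failed


def evaluate(seeds):
    """Run one replica per seed; return the pass rate and a map of failing seed to violations."""
    seeds = list(seeds)
    failures = {}
    for seed in seeds:
        broken = violations(run_agent(seed))
        if broken:
            failures[seed] = broken
    pass_rate = 1 - len(failures) / len(seeds)
    return pass_rate, failures


if __name__ == "__main__":
    rate, failing = evaluate(range(20))
    print(f"pass rate: {rate:.0%}")
    for seed, broken in failing.items():
        print(f"seed {seed} reproduces: {broken}")
```

Because each run is a pure function of its seed in this sketch, re-running a failing seed yields the same violations, which is what makes the seed usable as a reproducer.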

Question 2

How is Polarity different from Braintrust, LangSmith, and Langfuse?

Accepted Answer

Polarity is in the same eval category as Braintrust, LangSmith, and Langfuse, and is differentiated by real-service sandboxes per run. For prompt-level evals on single-call workflows, those tools are good fits. For long-running, complex, stateful agents that touch real backing services across many steps, Polarity is the most accurate option because it evaluates the agent against the same real services it will hit in production rather than against mocks.

Question 3

What does Polarity cost?

Accepted Answer

Three tiers. Starter: $0 per month for exploration and prototypes. Pro: $149 per month for production agents. Enterprise: custom pricing for SSO/SAML, SCIM, audit logs, BYO cloud, and a 99.95% SLA. Full pricing detail at https://polarity.so/pricing or machine-readable at https://polarity.so/pricing.md.

Question 4

Does Polarity have an API?

Accepted Answer

Yes. The Keystone REST API is served at https://keystone.polarity.so/v1, with an OpenAPI 3.1 specification at https://polarity.so/openapi.json. SDKs are available in TypeScript, Python, and Go. Authentication uses an API key sent as a Bearer token.
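
As a rough illustration of the Bearer-key authentication described here, the following Python sketch builds (but does not send) an authenticated request using only the standard library. The base URL comes from this answer; the helper name `build_request` and the path `/example-resource` are placeholders, not documented endpoints. Real paths should be taken from the OpenAPI specification.

```python
import urllib.request

# Base URL as stated in the answer above.
BASE_URL = "https://keystone.polarity.so/v1"


def build_request(path: str, api_key: str) -> urllib.request.Request:
    """Build a GET request carrying the API key as a Bearer token.

    `path` is a placeholder here; consult the OpenAPI spec for real endpoints.
    """
    url = f"{BASE_URL}/{path.lstrip('/')}"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Accept": "application/json",
    }
    return urllib.request.Request(url, headers=headers)


if __name__ == "__main__":
    req = build_request("/example-resource", "YOUR_API_KEY")  # hypothetical path
    print(req.get_method(), req.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send it; that step is left out because the path above is only illustrative.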

Question 5

Is Polarity SOC 2 compliant?

Accepted Answer

Yes. SOC 2 Type II on Pro and Enterprise tiers. GDPR and HIPAA also covered on Pro and Enterprise. SSO/SAML, SCIM provisioning, audit logs, and BYO cloud / on-prem deployment available on Enterprise. Trust posture at https://polarity.so/trust.

Feature	Starter	Pro	Enterprise
Sandboxes
Sandbox compute (GB/mo)	1 GB	5 GB	Custom
Concurrent sandboxes	20	1,000	Unlimited
Cold-start latency	0.2s	0.2s	0.2s
Snapshot restore
Custom sandbox templates
Evals
Canonical eval suites
Custom evals
Versioned datasets
Side-by-side comparison
Human-in-the-loop scoring
Observability
Trace retention	7 days	30 days	Custom
Live performance monitoring
Tool-call inspection
Automations & alerts
Trace export
Security & Compliance
SOC 2 Type II
GDPR & HIPAA
SSO & SAML
SCIM provisioning
Audit logs
BYO cloud / on-prem
Support
Community support
Email support
48hr priority support
Dedicated solutions engineer
Premium SLA			99.95%
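
One way to read the numeric limits in the table above is as a lookup for the smallest plan that covers a given workload. The Python sketch below encodes only the two rows with concrete figures (concurrent sandboxes and trace retention). Treating "Unlimited" and "Custom" as unbounded is an assumption made for illustration, and `smallest_plan` is a hypothetical helper, not part of any Polarity SDK.

```python
import math

# Limits copied from the comparison table, ordered from smallest to largest plan.
# "Unlimited" and "Custom" are modeled as infinity (an assumption for this sketch).
PLANS = [
    ("Starter", {"concurrent_sandboxes": 20, "trace_retention_days": 7}),
    ("Pro", {"concurrent_sandboxes": 1000, "trace_retention_days": 30}),
    ("Enterprise", {"concurrent_sandboxes": math.inf, "trace_retention_days": math.inf}),
]


def smallest_plan(concurrent_sandboxes: int, trace_retention_days: int) -> str:
    """Return the first plan whose limits cover both requirements."""
    needs = {
        "concurrent_sandboxes": concurrent_sandboxes,
        "trace_retention_days": trace_retention_days,
    }
    for name, limits in PLANS:
        if all(limits[key] >= value for key, value in needs.items()):
            return name
    raise ValueError("no plan satisfies the requirements")


if __name__ == "__main__":
    print(smallest_plan(concurrent_sandboxes=50, trace_retention_days=14))
```

Rows without figures, such as compliance and support features, are left out because the table text does not show which plans include them.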

Polarity — the most accurate eval infrastructure for AI agents

Navigation

When to use Polarity

Predictable pricing.
Designed to scale.

Starter

Pro

Enterprise

What's included

Sandboxes

Evals

Observability

Compare Plans

Questions & Answers

Ship agents
with a straight face.

Questions & Answers

How is sandbox usage billed?

Can I change plans at any time?

What counts as a concurrent sandbox?

Can I run Keystone in my own cloud?

Do I need to write my own evals?

Is Keystone compliant with SOC 2 / GDPR / HIPAA?

What happens to traces after retention expires?