Agent Tool-Call Validation: Verifying What Agents Actually Do
Why tool calls are the agent's weakest point
Composio's 2026 tool-calling guide puts it plainly: knowing which tool to call is trivial compared to the infrastructure required to call it successfully. That infrastructure is where most production agent failures live.
Three structural reasons.
The model picks the tool. If the model picks wrong, the tool runs anyway. The tool does not know whether it was the correct choice.
Arguments come from model output. LLMs produce strings that look like JSON. A schema check catches malformed structure, but schema-valid arguments can still be semantically wrong (right schema, wrong order ID).
Auth scope is usually broader than intent. The agent identity is authorized to call many tools. Whether this particular call is within the user's intent is a separate question that most stacks do not check.
Each of those is a point where the agent can do something unintended. Validation closes the gap.
Check 1: schema validation
The easy check and the one most teams already do.
On every tool call the model proposes, verify the call shape matches the declared schema: required arguments present, types correct, enum values in range, JSON well-formed. Reject anything malformed before it reaches the tool.
The tooling for this is mature: OpenAPI, JSON Schema, Zod, Pydantic. Microsoft's 2026 Foundry guidance covers structured tool-invocation schemas as the baseline.
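A minimal sketch of that guard with Pydantic, using a hypothetical lookup_order tool; the schema and field names are assumptions, not anyone's real API:

```python
from pydantic import BaseModel, ValidationError, field_validator

# Hypothetical declared schema for a lookup_order tool.
class LookupOrderArgs(BaseModel):
    order_id: str
    include_history: bool = False

    @field_validator("order_id")
    @classmethod
    def order_id_format(cls, v: str) -> str:
        if not v.startswith("ord_"):
            raise ValueError("order_id must look like 'ord_...'")
        return v

def validate_call(raw_arguments: str) -> LookupOrderArgs | None:
    """Parse the model's JSON string; reject anything malformed before the tool runs."""
    try:
        return LookupOrderArgs.model_validate_json(raw_arguments)
    except ValidationError as err:
        # Hand the structured error back to the agent instead of executing the tool.
        print(err.errors())
        return None
```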
What schema validation catches: malformed calls, type mismatches, missing arguments. What it misses: every semantic problem. A schema-valid call to the wrong tool passes.
Check 2: authorization scope
The agent has an identity. The identity has permissions. Every tool call should pass through a runtime authorization check: does this identity have scope for this tool, with these arguments, in this session, for this user?
The 2026 pattern is just-in-time authorization. Arcade's 2026 writing and Microsoft Foundry both cover it. Briefly:
- Evaluate the user's identity, the agent's scope, and the requested action.
- Check session context (is this user allowed to act on this data right now).
- Credentials resolve only when needed and only for the narrow scope of the call.
Policy is declared per tool: can_call(tool, context) → allow | deny | review. The check runs at invocation time. If it fails, the call does not execute.
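A minimal sketch of that per-tool declaration, with hypothetical tool names, scopes, and a bare-bones context object; a real deployment would wire this into the identity provider rather than an in-process dict:

```python
from dataclasses import dataclass
from enum import Enum

class Decision(Enum):
    ALLOW = "allow"
    DENY = "deny"
    REVIEW = "review"

@dataclass
class CallContext:
    user_id: str
    tenant_id: str
    agent_scopes: set[str]   # scopes granted to the agent identity
    arguments: dict          # the proposed tool arguments

# Hypothetical per-tool policies: tool name -> function(context) -> Decision.
POLICIES = {
    "lookup_order": lambda ctx: (
        Decision.ALLOW
        if "orders:read" in ctx.agent_scopes
        and ctx.arguments.get("tenant_id") == ctx.tenant_id
        else Decision.DENY
    ),
    "refund_order": lambda ctx: (
        Decision.REVIEW if "orders:refund" in ctx.agent_scopes else Decision.DENY
    ),
}

def can_call(tool: str, ctx: CallContext) -> Decision:
    """Runtime authorization check, evaluated at invocation time."""
    policy = POLICIES.get(tool)
    return policy(ctx) if policy else Decision.DENY  # default-deny unknown tools
```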
What scope validation catches: permission escapes, cross-tenant access, agents calling tools they should not be able to call. What it misses: calls that are within scope but wrong.
Check 3: semantic intent match
The hardest check and the one a sandbox is built for.
Even when a call is schema-valid and scope-authorized, it can still be wrong for the situation. The agent is supposed to look up Alice's order; it looks up Bob's order. Both are valid, scope is the same, schema matches. The call still fails the intent test.
Semantic intent checks ask: does this call advance the user's stated goal? The answer is not in the call itself. It is in the relationship between the goal, the current trace, and the call.
How to implement it:
- Extract the goal at session start (from the user prompt or the planner's initial output).
- For each tool call, compute a match score between the call and the goal given the trace state.
- Run low-score calls through a reviewer, either LLM-as-judge or human.
This check is expensive to run on every call at invocation time. That is why it usually runs inside a sandbox on replayed traffic, where compute is cheaper and the blast radius is zero.
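A minimal sketch of the scorer; the judge is any text-in, text-out callable, and the prompt wording and threshold are assumptions to be tuned against labeled examples:

```python
import json

INTENT_PROMPT = """You are reviewing an agent's tool call.
User goal: {goal}
Trace so far: {trace}
Proposed call: {call}
On a scale of 0.0 to 1.0, how confident are you that this call advances
the user's goal? Reply with a single number."""

REVIEW_THRESHOLD = 0.6  # assumption: tune against labeled traces

def score_intent(judge, goal: str, trace: list[dict], call: dict) -> float:
    """Ask a judge model how well the proposed call matches the session goal."""
    prompt = INTENT_PROMPT.format(
        goal=goal,
        trace=json.dumps(trace[-10:]),  # recent steps only, to bound the prompt
        call=json.dumps(call),
    )
    return float(judge(prompt).strip())  # judge: any text-in, text-out callable

def needs_review(score: float) -> bool:
    return score < REVIEW_THRESHOLD
```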
Where each check runs
The three checks run at different points in the lifecycle.
- Schema: at CI time against a test set, and at invocation time as a lightweight guard.
- Scope: at invocation time always. Runtime is the only time this check matters.
- Intent: inside a sandbox on replayed traffic, pre-deploy. Optionally at runtime for high-risk tool classes with budget.
A sandbox like Paragon runs all three during a pre-deploy replay: the schema check confirms the call structure, the scope check confirms the agent had permission, and the intent check confirms the call advanced the user's goal. Violations get surfaced as specific sessions and specific calls, not aggregate scores.
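Composed, a replay pass over one recorded session might look like this sketch, reusing the check functions from the sketches above; the session and call-record fields are assumptions about how traces are stored:

```python
def replay_session(session, judge) -> list[dict]:
    """Run all three checks over every recorded tool call in a session."""
    violations = []
    for step, call in enumerate(session.tool_calls):
        args = validate_call(call.raw_arguments)                  # Check 1: schema
        if args is None:
            violations.append({"step": step, "check": "schema", "call": call.name})
            continue
        if can_call(call.name, call.context) is Decision.DENY:    # Check 2: scope
            violations.append({"step": step, "check": "scope", "call": call.name})
            continue
        score = score_intent(judge, session.goal, session.trace[:step], call.as_dict())
        if needs_review(score):                                   # Check 3: intent
            violations.append({"step": step, "check": "intent", "score": score})
    return violations  # specific steps and calls, not aggregate scores
```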
Common mistakes
- Weak schema. Optional-everything schemas let the model send malformed calls that the schema layer accepts. Tighten the schema before blaming the model.
- Broad scope. Declaring that the agent can call anything makes the scope check a no-op. Scope the agent to the tightest per-user intent, not the broadest agent capability.
- No intent check at all. Most 2026 stacks stop at schema plus scope. That misses the class of failures where the call was technically valid but wrong.
- Only checking the first call. If you only validate the initial tool call in a trace, you miss chained errors introduced several steps in. Validate every call.
FAQ
Can I use MCP for this?
MCP covers tool declaration. Scope enforcement and intent checking are layered on top.
How do I write an intent-match scorer?
LLM-as-judge with user goal + trace + proposed call → confidence. Tune with examples. Paragon, Galileo, and Maxim all ship variants.
Does validation slow the agent down?
Schema + scope checks add a few ms. Intent checks are expensive and usually run async in sandbox replay, not at invocation time.
What happens when a check fails in production?
Schema fail: return a structured error so the agent can retry. Scope fail: deny and log. Intent fail: route to human review or a safer fallback.
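A sketch of that dispatch at invocation time; the audit log and review queue here are stand-ins for whatever your stack uses:

```python
import logging
import queue

audit_log = logging.getLogger("tool_audit")
review_queue = queue.Queue()  # stand-in for a real human-review system

def handle_failure(check: str, call: dict, error: str | None = None) -> dict:
    """Route each failure class to the response described above."""
    if check == "schema":
        # Feed the validation error back so the agent can correct the call and retry.
        return {"action": "retry", "error": f"invalid arguments: {error}"}
    if check == "scope":
        audit_log.warning("denied %s for user %s", call["tool"], call["user_id"])
        return {"action": "deny"}
    # Intent failure: park the call for human review or fall back to a safer tool.
    review_queue.put(call)
    return {"action": "review"}
```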
If you want to start using Polarity, check out the docs.