tools/: What Agents Can Do
Source: kernel/tools/tools.py · kernel/tools/chain.py · kernel/tools/approval.py · kernel/tools/skills.py
Defines the complete tool taxonomy: three execution modes across two axes, wire declarations for LLM providers, sandboxed code chaining, human-in-the-loop approval, and prompt skills.
The Tool Taxonomy
Three Protocols govern all tools. Which one applies depends on who executes the tool and who declares it:
| Tool Protocol | Execution Mode | Key Fields | Core Execution Method | Examples |
|---|---|---|---|---|
| Tool | Local | name, description, input_schema | execute(ctx, **kwargs) | WebSearchTool, CalculatorTool |
| HostedTool | Provider | name, description, provider_specs: dict | Executed by provider | OpenAI web_search_preview, Anthropic web_search |
| ProviderDefinedTool | Provider-declared, local execution | name, description, provider_specs: dict, call_types | handle_call(call, ctx) | OpenAI computer_use, apply_patch |
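
The three rows above can be read as three structural interfaces. The following is a minimal sketch, assuming the protocols are plain `typing.Protocol` classes; the field types (in particular `call_types` as a tuple), the return types, and the `CalculatorTool` body are illustrative assumptions, not the definitions in kernel/tools/tools.py.

```python
import asyncio
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class Tool(Protocol):
    """Local tool: the framework runs it via execute()."""

    name: str
    description: str
    input_schema: dict

    async def execute(self, ctx: Any, **kwargs: Any) -> Any: ...


@runtime_checkable
class HostedTool(Protocol):
    """Provider-executed tool: it has no local execution method."""

    name: str
    description: str
    provider_specs: dict


@runtime_checkable
class ProviderDefinedTool(Protocol):
    """Declared by the provider, but executed locally via handle_call()."""

    name: str
    description: str
    provider_specs: dict
    call_types: tuple  # assumed type; the source only names the field

    async def handle_call(self, call: Any, ctx: Any) -> Any: ...


class CalculatorTool:
    """Hypothetical local tool that satisfies the Tool protocol structurally."""

    name = "calculator"
    description = "Adds two numbers."
    input_schema = {
        "type": "object",
        "properties": {"a": {"type": "number"}, "b": {"type": "number"}},
        "required": ["a", "b"],
    }

    async def execute(self, ctx: Any, **kwargs: Any) -> float:
        return kwargs["a"] + kwargs["b"]


calc = CalculatorTool()
is_local = isinstance(calc, Tool)          # has name, description, input_schema, execute
is_hosted = isinstance(calc, HostedTool)   # lacks provider_specs
total = asyncio.run(calc.execute(None, a=2, b=3))
```

Because the check is structural, a class never has to inherit from the protocol; it only needs the listed attributes and method.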
Core Data & Specs
ToolExecutionResult: represents execution outputs. Fields:

- content: list[ContentBlock]: the multimodal output blocks.
- is_error: bool: flag indicating execution failure.
- structured_content: dict: parsed dictionary response.

FunctionSpec: local schema specification. Fields: name, description, parameters: dict, lazy_schema: bool.

ProviderSpec: provider-specific spec. Fields: name, provider: str, spec: dict.
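
To make the field lists concrete, here is a rough sketch of the three types as dataclasses. The field names come from the text above; the defaults, the frozen flags, and the `TextBlock` stand-in for `ContentBlock` are assumptions made for illustration only.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any


@dataclass(frozen=True)
class TextBlock:
    """Hypothetical stand-in for one ContentBlock variant."""

    text: str


@dataclass
class ToolExecutionResult:
    content: list[TextBlock]
    is_error: bool = False  # assumed default
    structured_content: dict[str, Any] = field(default_factory=dict)


@dataclass(frozen=True)
class FunctionSpec:
    name: str
    description: str
    parameters: dict[str, Any]
    lazy_schema: bool = False  # assumed default


@dataclass(frozen=True)
class ProviderSpec:
    name: str
    provider: str
    spec: dict[str, Any]


# A successful result carries both display blocks and a parsed payload.
ok = ToolExecutionResult(content=[TextBlock("4")], structured_content={"value": 4})

# A failed result reports the problem as content and sets is_error.
failed = ToolExecutionResult(content=[TextBlock("division by zero")], is_error=True)
```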
Dispatch pattern at runtime
flowchart TD
LLM["LLM returns tool call"] --> CHECK1{"is_provider_defined_tool(tool)?"}
CHECK1 -->|"yes"| PROV_RUN["await tool.handle_call(call, ctx)<br/>→ ToolExecutionResult"]
CHECK1 -->|"no"| CHECK2{"is_hosted_tool(tool)?"}
CHECK2 -->|"yes"| SKIP["Skip local execution<br/>(Result arrives in next LLM turn)"]
CHECK2 -->|"no (plain Tool)"| LOCAL_RUN["await tool.execute(ctx, **args)<br/>→ ToolExecutionResult"]

Always check is_provider_defined_tool before is_hosted_tool: both have provider_specs, but only ProviderDefinedTool has handle_call. Checking is_hosted_tool first could classify a ProviderDefinedTool as hosted and skip its local execution.
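
The dispatch order can be sketched as follows. The two guard names come from the diagram, but their bodies here are assumed attribute checks, and the three demo tool classes and the `call` dictionary shape are hypothetical.

```python
from __future__ import annotations

import asyncio
from typing import Any


def is_provider_defined_tool(tool: Any) -> bool:
    # Assumed check: provider_specs plus a local handle_call method.
    return hasattr(tool, "provider_specs") and hasattr(tool, "handle_call")


def is_hosted_tool(tool: Any) -> bool:
    # Assumed check: provider_specs alone. This also matches provider-defined
    # tools, which is why it has to run second.
    return hasattr(tool, "provider_specs")


async def dispatch(tool: Any, call: dict, ctx: Any) -> Any:
    if is_provider_defined_tool(tool):
        return await tool.handle_call(call, ctx)
    if is_hosted_tool(tool):
        return None  # provider executes it; the result arrives in the next LLM turn
    return await tool.execute(ctx, **call["args"])


class EchoTool:
    name = "echo"

    async def execute(self, ctx: Any, **kwargs: Any) -> str:
        return f"echo: {kwargs['text']}"


class HostedSearch:
    name = "web_search"
    provider_specs = {"example_provider": {}}  # placeholder spec


class PatchTool:
    name = "apply_patch"
    provider_specs = {"example_provider": {}}  # placeholder spec

    async def handle_call(self, call: dict, ctx: Any) -> str:
        return f"patched: {call['args']['path']}"


async def main() -> list:
    return [
        await dispatch(EchoTool(), {"args": {"text": "hi"}}, ctx=None),
        await dispatch(HostedSearch(), {"args": {}}, ctx=None),
        await dispatch(PatchTool(), {"args": {"path": "a.py"}}, ctx=None),
    ]


results = asyncio.run(main())
```

In this sketch `is_hosted_tool(PatchTool())` is also true, so swapping the two checks would return `None` for the patch tool instead of running `handle_call`.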
ToolRisk and Approval
Every tool declares a ToolRisk level. Tools marked HIGH or CRITICAL pause execution until a human approves or denies the call; SAFE tools run immediately.
sequenceDiagram
autonumber
participant Agent
participant Invoker as ToolInvoker (L1)
participant Handler as ApprovalHandler
participant Human
Agent->>Invoker: execute tool "send_email" (ToolRisk.HIGH)
Invoker->>Invoker: check tool.risk
alt risk == SAFE
Invoker->>Invoker: execute immediately
else risk == HIGH or CRITICAL
Invoker->>Handler: request(ApprovalRequest)
Handler->>Human: "send_email wants to run. Approve?"
Human-->>Handler: decision
Handler-->>Invoker: ApprovalDecision
alt APPROVED
Invoker->>Invoker: execute tool
else DENIED
Invoker-->>Agent: ToolExecutionResult(is_error=True, "denied by user")
end
end
Invoker-->>Agent: ToolExecutionResult

ApprovalRequest is immutable and fully serializable, so it can be stored in Postgres and resumed after a restart. ApprovalHandler implementations: WebApprovalHandler (the ravi-ui HITL card), AutoApprovalHandler (tests), and CliApprovalHandler (terminal).
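
The approval gate in the sequence diagram might look roughly like this. It is a simplified sketch, not the code in kernel/tools/approval.py: the enum values, the `ApprovalRequest` fields, the `invoke` helper, and the dictionary used in place of ToolExecutionResult are all assumptions. Only the names ToolRisk, ApprovalRequest, ApprovalDecision, AutoApprovalHandler, and the `request` call come from the text.

```python
from __future__ import annotations

import asyncio
import json
from dataclasses import asdict, dataclass
from enum import Enum
from typing import Any


class ToolRisk(str, Enum):
    SAFE = "safe"
    HIGH = "high"
    CRITICAL = "critical"


class ApprovalDecision(str, Enum):
    APPROVED = "approved"
    DENIED = "denied"


@dataclass(frozen=True)
class ApprovalRequest:
    tool: str
    risk: ToolRisk
    args_json: str  # arguments stored as JSON text so the record stays immutable


class AutoApprovalHandler:
    """Test handler that always answers with a fixed decision."""

    def __init__(self, decision: ApprovalDecision) -> None:
        self.decision = decision

    async def request(self, request: ApprovalRequest) -> ApprovalDecision:
        return self.decision


async def invoke(tool: str, risk: ToolRisk, run: Any, handler: Any, **args: Any) -> dict:
    # HIGH and CRITICAL tools wait for a decision before running.
    if risk in (ToolRisk.HIGH, ToolRisk.CRITICAL):
        req = ApprovalRequest(tool, risk, json.dumps(args, sort_keys=True))
        if await handler.request(req) is not ApprovalDecision.APPROVED:
            return {"is_error": True, "text": "denied by user"}
    return {"is_error": False, "text": await run(**args)}


async def send_email(to: str) -> str:
    return f"sent to {to}"


denied = asyncio.run(invoke("send_email", ToolRisk.HIGH, send_email,
                            AutoApprovalHandler(ApprovalDecision.DENIED), to="a@example.com"))
approved = asyncio.run(invoke("send_email", ToolRisk.HIGH, send_email,
                              AutoApprovalHandler(ApprovalDecision.APPROVED), to="a@example.com"))

# Round trip through JSON, as a stored request would be after a restart.
original = ApprovalRequest("send_email", ToolRisk.HIGH, '{"to": "a@example.com"}')
row = json.loads(json.dumps(asdict(original)))
restored = ApprovalRequest(row["tool"], ToolRisk(row["risk"]), row["args_json"])
```

Keeping the request frozen and JSON-friendly is what lets a pending approval outlive the process that created it.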
Sandboxed Code-Mode Chaining
Tool chaining lets the LLM write a Python script that calls multiple tools and pipes results between them. The script runs in a Firecracker/K8s sandbox; each tool call crosses the bridge back to the framework-side ToolInvoker.
sequenceDiagram
autonumber
participant LLM
participant Chain as ToolChainTool (L2)
participant Sandbox as Firecracker Sandbox
participant Invoker as ToolInvoker (L1)
participant Tool
LLM->>Chain: execute(code="...")
Chain->>Sandbox: run(code, prelude_with_bridge)
loop Each tool call in script (max 50)
Sandbox->>Invoker: bridge call: {tool, args}
Invoker->>Tool: execute(ctx, **args)
Tool-->>Invoker: ToolExecutionResult
alt result <= 4096 bytes
Invoker-->>Sandbox: InvocationResult(text, structured)
else large result
Invoker->>Invoker: store in ArtifactStore
Invoker-->>Sandbox: InvocationResult(artifact_ref, preview)
end
end
Sandbox-->>Chain: ChainRunResult(output_text, call_trace)
Chain-->>LLM: ToolExecutionResult(content)
Note over Chain: call_trace lists every tool that ran<br/>even on crash, so the LLM avoids re-sending emails

Key types from chain.py:
| Type | Purpose |
|---|---|
| ChainPolicy | Limits: max_tool_calls=50, call_timeout_s=60, total_timeout_s=300, max_inline_result_bytes=4096 |
| InvocationResult | What the sandbox receives back: status, text, structured, artifact_ref, files |
| ChainRunResult | Final outcome: status, output_text, call_trace, duration_ms |
| ChainCallRecord | One entry per bridged call in the trace: tool, args_digest, status, duration_ms |
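
The inline-versus-artifact decision and the call budget can be illustrated with a small sketch. The ChainPolicy defaults and InvocationResult field names are taken from the table; the `BridgeSketch` class, the status strings, the 200-character preview, and the in-memory dictionary standing in for ArtifactStore are assumptions.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class ChainPolicy:
    max_tool_calls: int = 50
    call_timeout_s: int = 60
    total_timeout_s: int = 300
    max_inline_result_bytes: int = 4096


@dataclass
class InvocationResult:
    status: str
    text: str | None = None
    structured: dict | None = None
    artifact_ref: str | None = None
    files: list[str] = field(default_factory=list)


class BridgeSketch:
    """Hypothetical framework-side wrapper for results sent back to the sandbox."""

    def __init__(self, policy: ChainPolicy) -> None:
        self.policy = policy
        self.calls = 0
        self.artifacts: dict[str, str] = {}  # stand-in for ArtifactStore

    def wrap(self, tool_output: str) -> InvocationResult:
        # Refuse further calls once the per-chain budget is spent.
        if self.calls >= self.policy.max_tool_calls:
            return InvocationResult(status="error", text="tool call limit reached")
        self.calls += 1
        # Small results travel inline; size is measured in encoded bytes.
        if len(tool_output.encode("utf-8")) <= self.policy.max_inline_result_bytes:
            return InvocationResult(status="ok", text=tool_output)
        # Large results are stored and replaced by a reference plus a short preview.
        ref = f"artifact-{len(self.artifacts) + 1}"
        self.artifacts[ref] = tool_output
        return InvocationResult(status="ok", text=tool_output[:200], artifact_ref=ref)


bridge = BridgeSketch(ChainPolicy())
small = bridge.wrap("x" * 4096)   # exactly at the limit, stays inline
large = bridge.wrap("x" * 4097)   # one byte over, goes to the artifact store

tiny = BridgeSketch(ChainPolicy(max_tool_calls=1))
first = tiny.wrap("a")
second = tiny.wrap("b")
```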
Skills: Prompt Packages
A Skill is a prompt package attached to an agent. It injects custom instructions and constrains the available tools:
| Field | Type | Description |
|---|---|---|
| name | str | Unique skill name. |
| instructions | str | Prompt text appended to the agent's system prompt during runs. |
| description | str | Short explanation of the skill's purpose. |
| allowed_tools | tuple[str, ...] | Names of the tools the agent is restricted to while this skill is active. |
| path | str \| None | File path, if the skill was loaded dynamically from disk. |
| version | str | Version string (e.g. '1.0.0'). |
A Skill extends an agent's behaviour without modifying its code. When a Skill is attached, its instructions are appended to the agent's effective system prompt, and allowed_tools limits which tools the agent can use while the skill is active.
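
As an illustration of how a Skill might be applied, the sketch below appends the instructions and filters the tool list. The field names follow the table; the defaults, the `apply_skill` helper, and the rule that an empty `allowed_tools` means no restriction are assumptions, not documented behaviour.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Skill:
    name: str
    instructions: str
    description: str = ""
    allowed_tools: tuple[str, ...] = ()
    path: str | None = None
    version: str = "1.0.0"


def apply_skill(system_prompt: str, tool_names: list[str], skill: Skill) -> tuple[str, list[str]]:
    """Hypothetical helper: returns the effective prompt and the permitted tools."""
    prompt = f"{system_prompt}\n\n{skill.instructions}"
    # Assumption: an empty allowed_tools tuple leaves the tool list untouched.
    if not skill.allowed_tools:
        return prompt, list(tool_names)
    return prompt, [t for t in tool_names if t in skill.allowed_tools]


research = Skill(
    name="research",
    instructions="Cite a source for every claim.",
    description="Web research with citations.",
    allowed_tools=("web_search",),
)

prompt, tools = apply_skill(
    "You are a helpful agent.",
    ["web_search", "calculator", "send_email"],
    research,
)
```

With this skill attached, `tools` contains only `web_search`, and the agent's own code is unchanged.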