Supervision & Budgets
The problem
The moment agents can spawn other agents, you have a new failure mode: a runaway tree. One agent delegates to three, each of those delegates to three more, and suddenly you have a fork bomb burning tokens and money with no ceiling. Even a single agent can loop forever or rack up an unbounded model bill.
Supervision is the org-chart that keeps a multi-agent run accountable: who reports to whom, how many agents may exist at once, and how much each is allowed to spend.
Two orthogonal budgets
Ravi separates "how many agents" from "how much each agent spends." They are enforced independently.
%%{init: {'theme': 'base', 'themeVariables': {'primaryColor': '#E8EAF6','primaryTextColor': '#1A237E','primaryBorderColor': '#3949AB','lineColor': '#546E7A','fontSize': '13px'}}}%%
graph TB
classDef budget fill:#FFF3E0,stroke:#E65100,color:#BF360C,font-weight:bold
classDef scope fill:#FAFAFA,stroke:#BDBDBD,color:#424242
SB["SpawnBudget<br/>headcount cap"]:::budget --- SB1["shared across the WHOLE run tree<br/>max_agents (default 50)"]:::scope
EB["ExecutionBudget<br/>resource cap"]:::budget --- EB1["per-agent: tokens, cost,<br/>turns, wall-clock deadline"]:::scope
| Budget | Scope | Limits | Enforced by |
|---|---|---|---|
| SpawnBudget | Run-wide: one shared instance for the whole tree | max_agents, allow_preempt | SpawnTracker (L1) |
| ExecutionBudget | Per-agent: each agent has its own | max_tokens, max_cost_usd, max_turns, deadline_s | ExecutionTracker (L1) |
None on an ExecutionBudget field means unlimited for that dimension. Both budgets are frozen policy objects living in the kernel; the mutable counters that enforce them live in the agents layer.
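As a rough illustration of "frozen policy, `None` means unlimited", here is a minimal sketch that assumes the budgets are plain frozen dataclasses. The field names come from the table above; the `allow_preempt` default and the `within_limit` helper are hypothetical and are not taken from Ravi's source.

```python
from dataclasses import dataclass, FrozenInstanceError
from typing import Optional


@dataclass(frozen=True)
class SpawnBudget:
    """Run-wide headcount policy (illustrative stand-in)."""
    max_agents: int = 50          # default stated in the diagram above
    allow_preempt: bool = False   # assumed default


@dataclass(frozen=True)
class ExecutionBudget:
    """Per-agent resource policy; None means 'no limit' for that field."""
    max_tokens: Optional[int] = None
    max_cost_usd: Optional[float] = None
    max_turns: Optional[int] = None
    deadline_s: Optional[float] = None


def within_limit(used: float, limit: Optional[float]) -> bool:
    """Hypothetical helper: a None limit never trips."""
    return limit is None or used <= limit


budget = ExecutionBudget(max_tokens=1_000)
print(within_limit(5_000, budget.max_turns))   # True: turns are unlimited
print(within_limit(5_000, budget.max_tokens))  # False: token cap exceeded

try:
    budget.max_tokens = 2_000  # policy objects cannot be mutated
except FrozenInstanceError:
    print("budget is frozen")
```

Keeping the policy immutable means one budget object can be shared safely across the tree, while all changing state sits in the trackers.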
Supervision: the org-chart
Every agent in a run carries a Supervision node — its position in the tree. It threads the shared identifiers and policy down from the root to every child:
%%{init: {'theme': 'base', 'themeVariables': {'primaryColor': '#E8EAF6','primaryTextColor': '#1A237E','primaryBorderColor': '#3949AB','lineColor': '#546E7A','fontSize': '13px'}}}%%
graph TD
classDef root fill:#E3F2FD,stroke:#1565C0,color:#0D47A1,font-weight:bold
classDef child fill:#E8EAF6,stroke:#3949AB,color:#1A237E
ROOT["root agent<br/>depth 0 · priority NORMAL"]:::root
ROOT --> A["researcher<br/>depth 1"]:::child
ROOT --> B["analyst<br/>depth 1"]:::child
A --> A1["sub-fetcher<br/>depth 2"]:::child
ROOT -.->|"shared run_id, session_id,<br/>SpawnBudget"| A
ROOT -.-> B
A -.-> A1
When the root spawns a child via spawn_child(), the child inherits:
- the same run_id (one execution tree) and session_id (one conversation),
- the same SpawnBudget instance, so the headcount cap is global, not per-branch,
- the parent's ExecutionBudget by default (override to give a child tighter limits),
- depth + 1 (informational, for UI nesting; there is no depth limit, and SpawnBudget is the single structural constraint).
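The inheritance rules above can be sketched with a small immutable node. This is an assumption-heavy illustration: the field layout of `Supervision` and the signature of `spawn_child` are guesses based on the prose, and the budget classes are trimmed stand-ins.

```python
from __future__ import annotations

import uuid
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class SpawnBudget:
    max_agents: int = 50


@dataclass(frozen=True)
class ExecutionBudget:
    max_turns: Optional[int] = None


@dataclass(frozen=True)
class Supervision:
    """Illustrative tree node; names follow the prose, layout is assumed."""
    run_id: str
    session_id: str
    spawn_budget: SpawnBudget
    execution_budget: ExecutionBudget
    depth: int = 0

    def spawn_child(
        self, execution_budget: Optional[ExecutionBudget] = None
    ) -> Supervision:
        # Identifiers and the SAME SpawnBudget object flow down unchanged;
        # only depth changes, plus the execution budget when overridden.
        if execution_budget is None:
            execution_budget = self.execution_budget
        return replace(
            self, execution_budget=execution_budget, depth=self.depth + 1
        )


root = Supervision(
    run_id=uuid.uuid4().hex,
    session_id="s-1",
    spawn_budget=SpawnBudget(max_agents=10),
    execution_budget=ExecutionBudget(max_turns=20),
)
researcher = root.spawn_child()
fetcher = researcher.spawn_child(ExecutionBudget(max_turns=5))
print(fetcher.depth, fetcher.spawn_budget is root.spawn_budget)  # 2 True
```

Because every node points at the same `SpawnBudget` object, a cap set at the root applies to the whole tree rather than to each branch.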
A handy detail: all agents in a run publish progress to one topic, TopicId("agent.progress", run_id), so a UI subscribes once and sees the whole tree.
Priority and preemption
Not all branches are equal. Each agent has a Priority — an integer weight used for proportional pool allocation:
| Priority | Weight |
|---|---|
| BACKGROUND | 0 (best-effort) |
| LOW | 1 |
| NORMAL | 2 (default) |
| HIGH | 4 |
| CRITICAL | 8 |
When the headcount cap is reached and allow_preempt is on, a HIGH/CRITICAL agent that needs a slot can cooperatively pause a lower-priority agent to claim it, instead of being denied outright. The SpawnTracker manages this — tracking the current count and the paused set.
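To make the slot logic concrete, here is a toy tracker. It is a sketch only: the real SpawnTracker's internals are not shown in this page, so the `active` and `paused` attributes, the victim-selection rule (pause the lowest-priority agent if the requester outranks it), and the local `SpawnDenied` class are assumptions. Resuming paused agents is left out.

```python
from enum import IntEnum
from typing import Dict, Set


class Priority(IntEnum):
    BACKGROUND = 0
    LOW = 1
    NORMAL = 2
    HIGH = 4
    CRITICAL = 8


class SpawnDenied(Exception):
    """Raised when no slot is free and preemption is not possible."""


class SpawnTracker:
    """Toy headcount tracker; the real one's API may differ."""

    def __init__(self, max_agents: int, allow_preempt: bool = False) -> None:
        self.max_agents = max_agents
        self.allow_preempt = allow_preempt
        self.active: Dict[str, Priority] = {}
        self.paused: Set[str] = set()

    def acquire(self, agent_id: str, priority: Priority) -> None:
        if len(self.active) < self.max_agents:
            self.active[agent_id] = priority
            return
        if self.allow_preempt and self.active:
            victim = min(self.active, key=lambda a: self.active[a])
            if self.active[victim] < priority:
                # Pause the lowest-priority agent and hand over its slot.
                del self.active[victim]
                self.paused.add(victim)
                self.active[agent_id] = priority
                return
        raise SpawnDenied(f"no slot for {agent_id}")

    def release(self, agent_id: str) -> None:
        self.active.pop(agent_id, None)
        self.paused.discard(agent_id)


tracker = SpawnTracker(max_agents=2, allow_preempt=True)
tracker.acquire("analyst", Priority.NORMAL)
tracker.acquire("logger", Priority.BACKGROUND)
tracker.acquire("researcher", Priority.HIGH)  # cap reached: pauses "logger"
print(sorted(tracker.active), sorted(tracker.paused))
# ['analyst', 'researcher'] ['logger']
```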
%%{init: {'theme': 'base', 'themeVariables': {'primaryColor': '#E3F2FD','primaryTextColor': '#0D47A1','primaryBorderColor': '#1565C0','lineColor': '#546E7A','fontSize': '13px'}}}%%
flowchart TD
classDef decision fill:#FFF3E0,stroke:#E65100,color:#BF360C,font-weight:bold
classDef ok fill:#E8F5E9,stroke:#2E7D32,color:#1B5E20,font-weight:bold
classDef wait fill:#E8EAF6,stroke:#3949AB,color:#1A237E
classDef deny fill:#FFEBEE,stroke:#C62828,color:#B71C1C,font-weight:bold
SPAWN["agent wants to spawn a child"] --> CAP{"headcount < max_agents?"}:::decision
CAP -->|"yes"| GO["acquire slot, spawn"]:::ok
CAP -->|"no"| PRE{"allow_preempt and<br/>child priority is higher?"}:::decision
PRE -->|"yes"| PAUSE["pause a lower-priority agent,<br/>take its slot"]:::wait
PRE -->|"no"| DENY["SpawnDenied"]:::deny
How the orchestrator uses it
OrchestratorAgent is the concrete consumer. It holds a roster of sub-agents, exposes each as a delegation tool, and when the model delegates it:
1. spawn_tracker.acquire(child_id, priority): claim a headcount slot (or preempt).
2. ctx.spawn(child_id, boot=msg): start the child run.
3. ctx.ask(handle, msg, timeout=…): suspend until the child replies, fails, or times out.
4. spawn_tracker.release(child_id) in a finally block, so the slot is always given back.
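The delegation sequence can be sketched as one coroutine. Everything except the four call names is assumed: `StubTracker` and `StubContext` are hypothetical stand-ins, and whether the real `spawn` and `ask` are coroutines is a guess based on "suspend until the child replies".

```python
import asyncio
from typing import Set


class StubTracker:
    """Stand-in for SpawnTracker that only records held slots."""

    def __init__(self) -> None:
        self.held: Set[str] = set()

    def acquire(self, child_id: str, priority: int) -> None:
        self.held.add(child_id)

    def release(self, child_id: str) -> None:
        self.held.discard(child_id)


class StubContext:
    """Stand-in for the run context; spawn/ask signatures are assumed."""

    async def spawn(self, child_id: str, boot: str) -> str:
        return f"handle:{child_id}"

    async def ask(self, handle: str, msg: str, timeout: float) -> str:
        async def child_reply() -> str:
            await asyncio.sleep(0.05)  # pretend the child is working
            return f"{handle} answered {msg!r}"

        return await asyncio.wait_for(child_reply(), timeout)


async def delegate(ctx, tracker, child_id, priority, msg, timeout):
    tracker.acquire(child_id, priority)      # 1. claim a slot
    try:
        handle = await ctx.spawn(child_id, boot=msg)          # 2. start child
        return await ctx.ask(handle, msg, timeout=timeout)    # 3. wait
    finally:
        tracker.release(child_id)            # 4. always give the slot back


tracker = StubTracker()
reply = asyncio.run(
    delegate(StubContext(), tracker, "researcher", 4, "find sources", 1.0)
)
print(reply)         # handle:researcher answered 'find sources'
print(tracker.held)  # set()
```

The `finally` block is what keeps the headcount honest: a timeout or a child failure still frees the slot for the next delegation.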
Meanwhile each agent's ExecutionTracker is wired into its ReAct loop, calling consume(tokens=…, turns=1) after every model call and raising BudgetExhaustedError the moment a per-agent limit is breached. The deadline_s is enforced by RunContext.check(), which every loop iteration calls.
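A minimal sketch of the per-agent counter, assuming a breach means "usage strictly exceeds the limit". The class bodies are illustrative stand-ins rather than Ravi's code, `run_loop` is a fake ReAct loop, and the deadline check is omitted because it belongs to RunContext.check().

```python
from dataclasses import dataclass
from typing import Optional


class BudgetExhaustedError(RuntimeError):
    """Local stand-in for the error named above."""


@dataclass(frozen=True)
class ExecutionBudget:
    max_tokens: Optional[int] = None
    max_turns: Optional[int] = None


class ExecutionTracker:
    """Mutable counters for one agent; a sketch, not the real class."""

    def __init__(self, budget: ExecutionBudget) -> None:
        self.budget = budget
        self.tokens = 0
        self.turns = 0

    def consume(self, tokens: int = 0, turns: int = 0) -> None:
        self.tokens += tokens
        self.turns += turns
        for name, used, limit in (
            ("max_tokens", self.tokens, self.budget.max_tokens),
            ("max_turns", self.turns, self.budget.max_turns),
        ):
            # None means unlimited, so only concrete limits can trip.
            if limit is not None and used > limit:
                raise BudgetExhaustedError(f"{name} exceeded: {used} > {limit}")


def run_loop(tracker: ExecutionTracker, tokens_per_call: int) -> int:
    """Fake ReAct loop: returns how many model calls ran before the stop."""
    calls = 0
    try:
        while True:
            calls += 1  # pretend the model was called here
            tracker.consume(tokens=tokens_per_call, turns=1)
    except BudgetExhaustedError:
        return calls


print(run_loop(ExecutionTracker(ExecutionBudget(max_tokens=250)), 100))  # 3
```

Token usage is only known after a call returns, which is why the third call (300 tokens in total) is the one that trips a 250-token cap.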
Putting it together
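from ravi.kernel.agent.supervision import SpawnBudget, ExecutionBudget, Priority
from ravi.agents.core.orchestrator import OrchestratorAgent, SubAgentConfig  # path assumed from the location table

orchestrator = OrchestratorAgent(
    "lead",
    model=model,  # model, researcher, and analyst are assumed to be defined earlier
    sub_agents=[
        SubAgentConfig(agent=researcher, priority=Priority.HIGH, ask_timeout=120),
        SubAgentConfig(agent=analyst, priority=Priority.NORMAL, ask_timeout=120),
    ],
    spawn_budget=SpawnBudget(max_agents=10, allow_preempt=True),
)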
At most 10 agents can be live in the whole tree at once; researcher outranks analyst for the last slot; each delegation waits at most 120 s for a reply.
Where this lives
| Piece | Location |
|---|---|
| Supervision, SpawnBudget, ExecutionBudget, Priority | kernel/agent/supervision.py |
| SpawnTracker (headcount + preemption) | agents/supervision/budget.py |
| ExecutionTracker (per-agent spend) | agents/resources/budget.py |
| OrchestratorAgent, SubAgentConfig | agents/core/orchestrator.py |
| BudgetExhaustedError | kernel/core/errors.py |
Next: Hooks — observe the run loop without changing it.